DETECTION AND REMOVAL OF BINOCULAR LUSTER IN COMPRESSED 3D IMAGES

Can Bal, Ankit K. Jain, Truong Q. Nguyen
University of California, San Diego, ECE Dept., La Jolla, CA 92092 USA

ABSTRACT

Binocular luster is an extremely salient effect seen in 3D when an object in each stereo image exhibits a different contrast polarity relative to the background. The object appears to shimmer, a phenomenon seen in nature and on 3D displays, which the Human Visual System rapidly detects. This binocular luster is also induced by compression of stereo imagery where corresponding blocks are quantized to different values. In this paper, we discuss the psychovisual background of binocular luster, introduce the “shine” artifact induced by compression, and present an algorithm for detection and removal of shine from JPEG compressed stereo images while preserving the original bitrate.

Index Terms: 3D, stereo, artifact, metric, compression.

1. INTRODUCTION

Binocular luster is an effect seen when an object appears darker in one eye’s view and lighter in the other eye’s view, relative to the background [1]. The object, when fused in 3D, appears to have a shiny, lustrous appearance. This is a naturally occurring phenomenon, where the incident light from the sun illuminates an object which strongly reflects light into one eye but not the other. Binocular luster is primarily responsible for the shiny appearance of metallic objects such as cars [2].

The luster effect is also seen on 3D displays when contrast differences exist between the two images of a stereo pair. For instance, luster may occur as an annoying artifact in compressed stereo images, improperly synthesized virtual images, or as a result of any processing algorithm that does not specifically account for it. The detection of binocular luster by the human eye is very rapid, making the luster artifact extremely salient.
In fact, the detection of binocular luster by the visual system is the most powerful process for any exclusively binocular task [2]. Thus, binocular luster must be accounted for in many areas of stereo image processing. However, to our knowledge, this paper is the first to report detection or processing methods for binocular luster. Our work has important implications for 3D saliency detection, compression, enhancement, quality assessment, and view synthesis.

The paper is organized as follows: in Section 2, we discuss 3D artifact studies and some works in the field of psychology; in Section 3, we discuss the cause of binocular luster with respect to compression; in Section 4, we introduce our algorithm for detecting luster and our algorithm for removing it; and we conclude in Section 5.

2. RELATED WORK

Binocular luster has long been known in psychology, but has remained relatively unstudied [2]. Recently, the authors of [3] measured luster as a function of duration, size, and luminance differences, and Ludwig et al. [4] reported the temporal integration of binocular luster stimuli. However, to our knowledge, there has been only one previous study in the image processing community that recognizes binocular luster as an image artifact [5]. In that work, the authors investigate the effect of luster on the overall viewing comfort of 3D displays. They define luster in terms of a color difference between corresponding portions of a stereo image, and refer to it as “metallic luster”. They induce metallic luster by decreasing the color depth of the image, which coarsely quantizes the colors and causes rivalry between the two views. The results in [5] indicate that luster is one of the most important factors that impact viewing comfort.

Other work has been done on artifact analysis for 3D images and video, such as the effect of JPEG compression of stereo images on 3D image quality [6], and subjective tests for the quality of 2D+depth compressed videos [7].
The effect of Multiview Video Coding (MVC) compression of views and depth maps is investigated in [8] specifically for view synthesis. Similarly, [9] analyzes the effect of compression on view synthesis. These works do not address binocular luster and would be enhanced by its consideration.

3. BINOCULAR LUSTER

Human 3D vision is enabled by seeing slightly different views of the same scene. The so-called cyclopean image is then formed in the brain by fusing corresponding objects from the two views. However, it is possible to force the eyes to see two distinct images, which results in a sensation called “binocular rivalry” [1]. Under this condition, the viewer perceives the two images alternating with one another. One such stimulus occurs when the fused regions are bounded by contours of opposite contrast, or when the two views differ substantially in luminance [1, 3]. When this happens, the rivalry produces a region of variable brightness, and this effect is known as “binocular luster” in the psychology literature. An illustration demonstrating this effect is shown in Fig. 1. In this example, the smaller square is lighter than the background in both views and no rivalry occurs. However, the larger square has a different contrast polarity in each view, which causes it to appear as if it is shining.

Fig. 1. Illustration of binocular luster.

In 3D image processing, binocular luster may appear as a compression or processing artifact. In block-based compression, corresponding blocks in a stereo pair can sometimes be quantized to different levels. Depending on the regions surrounding these blocks, they may have opposing contrast polarities in each view, which causes binocular luster. In view synthesis, when occluded regions are erroneously filled, the luster effect can be observed when the synthesized views are fused with one another. These mismatches are immediately noticeable to the viewer; the affected region literally lights up, and thus we call it the “shine artifact”. The shine artifact due to compression of 3D images occurs under coarse compression, when blocking artifacts start to appear.

In this paper, we focus on binocular luster caused by JPEG compression of grayscale stereo images and introduce novel algorithms for detecting and treating the shine artifact. The implementation of such an algorithm is made difficult by the fact that luster has been investigated on uniform surrounds, whereas in natural images this is not usually the case. It is still an open question whether luster is due to local contrast or to the space-averaged contrast with the surround. Further, how people perceive contrast in color is unknown, which is why we treat grayscale images in this study. Nevertheless, we employ the contrast polarity model in our algorithms and achieve good performance.

4. DETECTION AND REMOVAL OF SHINE ARTIFACT

We first present an algorithm to detect the shine artifact in JPEG images that is based on the low-level psychological understanding of binocular luster. Then, a processing algorithm is presented to reduce the shine artifact by adjusting the luminances of certain blocks in one image. The motivation is to detect these blocks at the encoder side, and process the offending blocks before transmission. This allows us to use the disparity map for detection, and does not require any processing at the decoder side. By adjusting the block luminances in increments of the JPEG quantization step size, we ensure that the processing changes are not erased by quantization and that the bitrate is preserved. For our experiments, we use the Middlebury Stereo Database [10].

4.1. Detection

Our shine artifact detection algorithm can be used for visual quality assessment of stereo images as well as for processing the artifact. The block diagram of our detection algorithm is shown in Fig. 2.

Fig. 2. Block diagram of shine artifact detection algorithm. (Blocks: Disparity Map, Compressed Left View, Compressed Right View, Block Level Disparity Estimation, Block Level View Synthesis, Contrast Polarity Calculation, Find Opposing Polarities, Median Filter, Shine Detection.)

The algorithm is based on the contrast polarity model that is used to explain binocular luster, and therefore requires knowledge of the block correspondences between the views of a stereo pair. The disparity map corresponding to one of the stereo images is converted to a block-level disparity map by dividing it into blocks and taking the median of all the disparity values within a block. Using the block-level disparity map, the blocks are mapped onto the other view. The contrast polarities are then calculated for the synthesized view and the corresponding original view along the block boundaries. For horizontally neighboring blocks, each row of pixels (JPEG uses 8×8 blocks, which gives 16-pixel-long rows) is masked with f[k], the filter given in (1), and the resulting coefficients are added together to yield the contrast polarity for that row. For vertically neighboring blocks, the same process is repeated with f[k]^T.

$$
f[k] =
\begin{cases}
\alpha^{|k-\beta|} & \text{if } 0 \le k \le \beta \\
-\alpha^{|k-\beta|} & \text{if } -\beta \le k < 0 \\
0 & \text{otherwise}
\end{cases}
\tag{1}
$$

The design of f[k] is based on the observation that pixels along block boundaries contribute more to the shine artifact. We set α = 0.8 empirically, and chose β = 4 so that no pixel value contributes to the contrast polarities of two different boundaries, while no information available to the detector is discarded. The contrast polarity calculations around the block boundaries form the contrast polarity maps. The sign of each value in the map indicates the polarity and the magnitude indicates the strength of contrast at that pixel location.
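The conversion of a dense disparity map into a block-level disparity map, described earlier in this subsection, can be sketched as follows. This is only a minimal illustration: the function name `block_level_disparity`, the list-of-lists image representation, and the handling of partial blocks at the image border are assumptions made for the example and are not specified in the text.

```python
from statistics import median


def block_level_disparity(disparity, block=8):
    """Reduce a dense disparity map to one median value per block.

    `disparity` is a list of rows (hypothetical representation).
    Partial blocks at the right/bottom border are kept as smaller
    blocks, which is an assumption of this sketch.
    """
    rows, cols = len(disparity), len(disparity[0])
    block_map = []
    for r0 in range(0, rows, block):
        block_row = []
        for c0 in range(0, cols, block):
            values = [
                disparity[r][c]
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            ]
            # The median is robust to a few outlying disparity values.
            block_row.append(median(values))
        block_map.append(block_row)
    return block_map


# Two horizontally neighboring 8x8 blocks; the left one has two outliers.
dense = [[3] * 8 + [7] * 8 for _ in range(8)]
dense[0][0] = 40
dense[1][1] = 40
print(block_level_disparity(dense))
```

Taking the median rather than the mean means that the two outlying values in the left block do not shift its block-level disparity.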
We compare the contrast polarity maps of the synthesized and corresponding original view to find the boundaries that have opposing polarities. We form a detection map, which is the difference between the contrast polarity maps wherever the contrast is opposite and zero elsewhere. Finally, we apply a spatial 1D median filter of length 9 (block size + 1) along the vertical and horizontal directions of the detection map to remove false detections. This ensures that the detected shine is consistent along the boundaries of a block.

Results show that the proposed algorithm can detect blocks that possess a significant amount of shine and yields very few false detections when used with lightly compressed images. A detection example is shown in Fig. 3. We have circled some visually apparent shining blocks on the compressed images for comparison with the shine detection.

Fig. 3. Shine detection using the algorithm in Fig. 2: (a) right compressed view; (b) left compressed view; (c) right compressed view with shine detection. Figs. 3(a)–3(b) are set up for cross-eyed viewing.

4.2. Removal

Using the result of Section 4.1, we can process the stereo images in order to remove the shine artifacts. Based on the contrast polarity model, we need to process only one of the two stereo images to ensure that the contrast polarities between corresponding blocks are not in opposition. As a convention, we have chosen to generate the detections for, and to process, the right image. The algorithm works in an iterative manner; the block diagram of one iteration is depicted in Fig. 4.

Fig. 4. Block diagram of an iteration of the shine artifact removal algorithm. (Blocks: Input Image, Shine Detector, Block Selection, Luminance Adjustment, Shine Detector, Comparison, Output Image.)

To initialize the algorithm, the detected shine on the right stereo image is used to decide which blocks to process. The luminance of these blocks is adjusted to remove the shine.
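Because the removal loop calls the detector of Section 4.1 repeatedly, it may help to make its last two steps concrete. The sketch below works on a 1D profile of contrast polarities along one boundary direction. The names `opposing_polarity_map` and `median_filter_1d`, the use of an absolute difference for the detection value, and the zero padding at the ends are all assumptions of this example rather than details given in the text.

```python
from statistics import median


def opposing_polarity_map(polarity_orig, polarity_syn):
    """Detection values: nonzero only where the polarities have opposite signs.

    Assumption: the "difference" between the two maps is taken as an
    absolute difference.
    """
    return [
        abs(a - b) if a * b < 0 else 0.0
        for a, b in zip(polarity_orig, polarity_syn)
    ]


def median_filter_1d(values, length=9):
    """Sliding median of odd `length` (9 = block size + 1).

    Zero padding at both ends is an assumption of this sketch.
    """
    half = length // 2
    padded = [0.0] * half + list(values) + [0.0] * half
    return [median(padded[i:i + length]) for i in range(len(values))]


# An isolated detection versus a run covering a full 8-pixel block edge.
isolated = [0.0] * 10 + [5.0] + [0.0] * 10
block_edge = [0.0] * 6 + [3.0] * 8 + [0.0] * 6

print(median_filter_1d(isolated))
print(median_filter_1d(block_edge))
```

With these inputs, the isolated detection is suppressed while the run of eight detections passes through unchanged, which is the behavior the length-9 filter is meant to provide.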
However, for weaker detections where it is unclear how to adjust the luminance, the shine is not always removed. Therefore, to check whether the artifact has been removed, the processed image is passed back into the detector. In the final stage, the detections on the processed image are compared to the initial detections, and for the blocks that do not exhibit less shine, the luminance values are restored to their original values.

To choose which blocks to process, a boolean map of detections is generated around the block boundaries by finding the nonzero values of the detection map. Next, the number of nonzero elements of the boolean map is calculated for each block, providing a measure of the strength of detection. If this number exceeds a threshold, we process that block. The threshold is chosen by considering a strong detection, which would have 1’s along three border edges of the 8 × 8 block. The sum over the border would then be 22 (three edges of 8 pixels each, sharing two corner pixels: 3 × 8 − 2 = 22), but to account for imperfect detections, we lower the threshold to 18.

Matching the contrast polarity of corresponding blocks involves adjusting only the luminance value, or the DC coefficient, of the detected blocks. Since JPEG quantizes DCT coefficients using a fixed quantization table, we calculate the quantization step size for the DC coefficient. The DC coefficient of a block to be processed is increased or decreased by one step size per iteration, which allows us to gradually converge to the best luminance value. In addition, the altered DC values do not change the overall bitrate significantly since they are increased for some blocks and decreased for others. The decision on the adjustment direction is sometimes ambiguous due to weak detections or cases that the model does not explain. Therefore, we always increase or always decrease within an iteration, and alternate the direction between iterations.
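The block selection rule and the one-step DC adjustment described above can be sketched as follows. All function names here are hypothetical, the DC coefficient is represented by its quantized index, and the choice that even iterations increase while odd iterations decrease is an assumption: the text states only that the direction alternates.

```python
BLOCK_SIZE = 8
THRESHOLD = 18  # lowered from the ideal three-edge count of 22


def detection_count(block_mask):
    """Number of nonzero entries of the boolean detection map in one block."""
    return sum(1 for row in block_mask for flag in row if flag)


def should_process(block_mask, threshold=THRESHOLD):
    """A block is processed when its detection count exceeds the threshold."""
    return detection_count(block_mask) > threshold


def adjust_dc_index(dc_index, iteration):
    """Move the quantized DC index by exactly one step.

    Assumption: even iterations increase, odd iterations decrease.
    """
    return dc_index + (1 if iteration % 2 == 0 else -1)


def dequantize(dc_index, q_dc):
    """Reconstructed DC coefficient for DC quantization step size q_dc."""
    return dc_index * q_dc


last = BLOCK_SIZE - 1
# Detections along the top, left and right edges of a block.
three_edges = [[r == 0 or c in (0, last) for c in range(BLOCK_SIZE)]
               for r in range(BLOCK_SIZE)]
# Detections along the top and left edges only.
two_edges = [[r == 0 or c == 0 for c in range(BLOCK_SIZE)]
             for r in range(BLOCK_SIZE)]

print(detection_count(three_edges), should_process(three_edges))
print(detection_count(two_edges), should_process(two_edges))
```

Working on the quantized index means that every adjustment is a whole multiple of the DC step size, so the change survives requantization.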
Because we check after each iteration whether an adjustment reduces the shine, this scheme does not have an adverse effect on the final result. The artifact removal process stops if an iteration does not change the total amount of shine in the image, or if the number of iterations exceeds 20. In our tests, all the test images converged in fewer than 8 iterations.

Our method using this model gives visually pleasing results with minimal change to the input image. Fig. 5 shows the compressed left and processed right images for the detections in Fig. 3(c). For more results of the detection and the removal algorithms, please refer to http://videoprocessing.ucsd.edu/~canbal.

Fig. 5. Compressed stereo pair, processed for shine removal: (a) right processed view; (b) left compressed view; (c) right compressed view with shine detection. Figs. 5(a)–5(b) are set up for cross-eyed viewing.

5. CONCLUSION

In this paper, we present an algorithm for detecting and removing the shine artifact from JPEG compressed stereo images without introducing additional bitrate. To this end, we focused on calculating shine along the block boundaries and removing the artifact by adjusting the luminances of the blocks. Our method removes the artifacts with significant shine without a major effect on the bitrate of the encoded JPEG images.

The shine artifact is not limited to compression and may appear as a result of processing, enhancement, or view synthesis. In these cases, it is not trivial to detect shine automatically, as the effect of texture and color on binocular luster is still not well understood. Since it is not known how the eye interprets contrast across a nonuniform background, a better model of contrast would improve detection and processing results. Therefore, one of our plans for the future is to conduct low-level psychological tests to develop a better model of binocular luster. We aim to extend this work to color images and H.264 compressed videos, and to investigate how motion affects the visibility of luster.
In addition, we plan to generate an image/video dataset that includes various types and levels of shine. This database will be paired with subjective assessments indicating which blocks or regions of the images exhibit shine. We also plan to use our findings in 3D metric development and in developing a model for saliency in stereo.

6. REFERENCES

[1] I. P. Howard, Seeing in Depth. I. Porteous, 2002.
[2] C. Tyler, “Binocular vision,” in Duane’s Foundations of Clinical Ophthalmology, vol. 2, W. Tasman and E. Jaeger, Eds. Philadelphia: J.B. Lippincott Co., 2004.
[3] M. Formankiewicz and J. Mollon, “The psychophysics of detecting binocular discrepancies of luminance,” Vision Research, vol. 49, no. 15, pp. 1929–1938, 2009.
[4] I. Ludwig, W. Pieper, and H. Lachnit, “Temporal integration of monocular images separated in time: Stereopsis, stereoacuity, and binocular luster,” Perception & Psychophysics, vol. 69, pp. 92–102, 2007.
[5] F. L. Kooi and A. Toet, “Visual comfort of binocular and 3-D displays,” in Encyclopedia of Optical Engineering, 2004, pp. 1–14.
[6] P. Seuntiens, L. Meesters, and W. Ijsselsteijn, “Perceived quality of compressed stereoscopic images: Effects of symmetric and asymmetric JPEG coding and camera separation,” ACM Trans. Appl. Percept., vol. 3, no. 2, pp. 95–109, 2009.
[7] C. Hewage, S. Worrall, S. Dogan, S. Villette, and A. Kondoz, “Quality evaluation of color plus depth map-based stereoscopic video,” IEEE Journal of Selected Topics in Signal Processing, vol. 3, no. 2, pp. 304–318, Apr. 2009.
[8] K. Klimaszewski, K. Wegner, and M. Domanski, “Distortions of synthesized views caused by compression of views and depth maps,” in 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2009, 4-6 2009, pp. 1–4.
[9] P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Muller, P. de With, and T. Wiegand, “The effect of depth compression on multiview rendering quality,” in 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2008, 28-30 2008, pp. 245–248.
[10] D. Scharstein and C. Pal, “Learning conditional random fields for stereo,” in IEEE Conference on Computer Vision and Pattern Recognition, 2007, June 2007, pp. 1–8.