F-MAD: A Feature-Based Extension of the Most Apparent Distortion Algorithm for Image Quality Assessment

Punit Singh and Damon M. Chandler

Laboratory of Computational Perception and Image Quality, School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078 USA

P. S.: E-mail: [email protected]; D.M.C.: E-mail: [email protected]

ABSTRACT

In this paper, we describe the results of a study designed to investigate the effectiveness of peak signal-to-noise ratio (PSNR) as a quality estimator when measured in various feature domains. Although PSNR is well known to be a poor predictor of image quality, PSNR has been shown to be quite effective for additive, pixel-based distortions. We hypothesized that PSNR might also be effective for other types of distortions which induce changes to other visual features, as long as PSNR is measured between local measures of such features. Given a reference and distorted image, five feature maps are measured for each image (lightness distance, color distance, contrast, edge strength, and sharpness). We describe a variant of PSNR in which quality is estimated based on the extent to which these feature maps for the reference image differ from the corresponding maps for the distorted image. We demonstrate how this feature-based approach can lead to improved estimators of image quality.

1. INTRODUCTION

A crucial requirement for any system that processes images is a means of assessing the impact of such processing on the visual quality of the resulting images. Over the last several decades, numerous algorithms for image quality assessment (IQA) have been developed to meet this requirement. IQA algorithms aim to predict the quality of an image in a manner that agrees with quality as judged by human subjects. Here, we specifically focus on full-reference IQA algorithms, which require both a reference image and a distorted image.

The simplest approach to full-reference IQA is to measure local pixelwise differences, and then to collapse these local measurements into a scalar which represents the overall quality. The mean-squared error (MSE) and its log-based counterpart, peak signal-to-noise ratio (PSNR), were the earliest and simplest measures of local pixelwise differences. To improve predictive performance, variants of MSE/PSNR have been measured in the luminance domain,1 with frequency weighting based on the human contrast sensitivity function (see, e.g., Ref. 2), and with further adjustments for other low-level properties of the human visual system (HVS) (e.g., Ref. 3).

More recent and complete IQA algorithms have employed a wide variety of approaches. Numerous IQA algorithms have been designed based on computational models of the HVS (e.g., Refs. 2, 4–9). Numerous IQA algorithms have also been designed based on structural similarity (e.g., Refs. 10, 11). Other IQA algorithms have been designed based on various statistical and information-theoretic approaches (e.g., Refs. 12, 13), based on machine learning (e.g., Refs. 14, 15), and based on many other techniques (see Ref. 16 for a review).

All of the aforementioned IQA approaches have been shown to outperform PSNR when tested across images and distortion types from various IQA databases. However, one important observation when examining the performances of these IQA algorithms vs. PSNR is that the latter is still quite competitive (and can even outperform most IQA algorithms) on certain types of distortions, most notably additive noise.
For example, on the TID database,17 PSNR outperforms the vast majority of IQA algorithms on most additive noise types (white grayscale noise, white color noise, correlated noise, impulse noise, high-frequency noise, etc.). Thus, PSNR, which is measured between the pixel values of the reference vs. distorted images, appears to be quite effective at capturing quality differences when the changes are perceived as pixel-based degradation. Following this argument, it would seem that PSNR might also be effective when measured between feature values of the reference vs. distorted images, when the changes are perceived as degradations to the corresponding features (e.g., degradation of perceived contrast, perceived sharpness, perceived edge clarity, etc.).

In this paper, we describe the results of a study designed to investigate the effectiveness of PSNR as a quality estimator when PSNR is measured in various feature domains. We specifically investigated measuring PSNR between the same feature maps used in our algorithm for detecting main subjects in images.18 Given a reference and distorted image, we measure, for each block in each image, five low-level features: (1) lightness distance, (2) color distance, (3) contrast, (4) edge strength, and (5) sharpness. These block-based measures thus result in five feature maps for the reference image, and five feature maps for the distorted image. From these feature maps, quality is estimated based on the extent to which the feature maps for the reference image differ from the corresponding maps for the distorted image. We specifically present a measure of quality, F-PSNR, in which the differences between the feature maps are quantified based on a combination of the average PSNR and the average Pearson correlation coefficient. We also describe a straightforward technique of integrating F-PSNR into the MAD (Most Apparent Distortion) IQA algorithm,9 resulting in what we have termed F-MAD. As we will demonstrate, this feature-based approach can lead to improved estimates of quality.

This paper is organized as follows. Section 2 describes the feature maps and how they are used to estimate quality (F-PSNR and F-MAD). Section 3 analyzes the performances of these feature-based IQA measures in predicting subjective ratings of quality. General conclusions are provided in Section 4.

2. ALGORITHM

In this section, we first provide details of the feature maps (Section 2.1), and then we describe how these maps are used in the F-PSNR and F-MAD measures of quality (Sections 2.2 and 2.3).

2.1. Feature Maps

Given a reference and distorted image, five feature maps are computed for each image: (1) a map of local lightness distance (the distance between the average lightness of a region and the average lightness of the entire image); (2) a map of local color distance; (3) a map of local luminance contrast; (4) a map of local edge strength; and (5) a map of local sharpness. We have previously shown that these feature maps are effective for detecting main subjects in images.18 Here, we argue that these maps can also be effective for quality assessment.

Let X denote a (reference or distorted) image, and let x denote an 8x8 block of X with 50% overlap between neighboring blocks. Let fi(x), i ∈ [1, 5], denote the ith feature value measured for x. From all fi(x), x ∈ X, we form the ith feature map, which we denote as fi(X).
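To make the block-based notation concrete, the following minimal Python sketch (our own illustration, not code from Ref. 18) shows how a feature map fi(X) can be assembled by sliding an 8x8 window with 50% overlap over an image; `feature_map` and the placeholder argument `feature_fn` are names we introduce here, standing in for any of the five per-block features defined in the subsections that follow.

```python
import numpy as np

def feature_map(img, feature_fn, block=8, overlap=0.5):
    """Build a block-based feature map f_i(X).

    Slides a block x block window over `img` with the stated overlap
    (50% overlap -> stride of block/2) and stores one feature value per
    block position.  `feature_fn` maps a single block to a scalar.
    """
    stride = int(block * (1.0 - overlap))
    h, w = img.shape[:2]
    rows = (h - block) // stride + 1
    cols = (w - block) // stride + 1
    fmap = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            y, x = r * stride, c * stride
            fmap[r, c] = feature_fn(img[y:y + block, x:x + block])
    return fmap
```

For instance, passing `lambda blk: blk.std()` as `feature_fn` would produce a map of local standard deviations; the actual features f1 through f5 are defined next.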
2.1.1. Lightness and Color Distance

Let f1(x) denote the Euclidean distance between the average lightness of block x and the average lightness of the entire image. Let f2(x) denote the Euclidean distance between the average color of block x and the average color of the entire image. These two features are given by

f_1(x) = \left| \bar{L}^*_x - \bar{L}^*_X \right|,   (1)

f_2(x) = \sqrt{ \left( \bar{a}^*_x - \bar{a}^*_X \right)^2 + \left( \bar{b}^*_x - \bar{b}^*_X \right)^2 },   (2)

where \bar{L}^*, \bar{a}^*, \bar{b}^* denote the average L*, a*, b* values measured in the CIE 1976 (L*, a*, b*) color space (CIELAB). See Ref. 18 for details on how RGB values are converted to (L*, a*, b*) values. The second and third rows of Figure 1 show lightness-distance maps f1(X) and color-distance maps f2(X), respectively, for various images.

[Figure 1. Example images and their feature maps. Images in the first row are select reference images from the CSIQ database19 (Monument, Fisher, Sparrow, Swarm, NativeUS). The second through sixth rows show maps of lightness distance, color distance, contrast, edge strength, and sharpness.]

2.1.2. Contrast

Local contrast can also be an important factor which influences an image's visual appearance. To measure this, we first convert the image into the luminance domain. Then, the root mean square (RMS) contrast of each block is given by the ratio of the standard deviation of the block's luminances to its mean luminance. The result is a map in which each value represents local RMS contrast.

Specifically, let f3(x) denote the RMS contrast of block x. To compute f3(x), we first convert the image X into a grayscale image Xg via Xg = 0.299R + 0.587G + 0.114B. Let xg denote the corresponding block in Xg, and let l(x) = (b + k xg)^γ denote the luminance-valued block, with b = 0.7297, k = 0.0376, and γ = 2.2 assuming sRGB display conditions. The quantity f3(x) is then computed via

f_3(x) = \begin{cases} \sigma_{l(x)} / \mu_{l(x)}, & \mu_{l(x)} > 0 \\ 0, & \mu_{l(x)} = 0 \end{cases}   (3)

where σ_{l(x)} and μ_{l(x)} denote the standard deviation and the mean of l(x), respectively. The fourth row of Figure 1 shows contrast maps f3(X) for various images.

2.1.3. Edge Strength

To quantify similarity between object boundaries, we use maps of local edge strength. First, edges are detected by using the Roberts edge detector.20 Then, the edge strength of each block is computed as the fraction of detected edge pixels within that block. The result is a map in which each value represents local edge strength.

Specifically, let f4(x) denote the edge strength of block x, and let E(X) denote the binary edge map computed by running the Roberts edge detector on X. The feature f4(x) is then given by

f_4(x) = \mu_{E(x)} = \frac{1}{m^2} \sum_j e_j,   (4)

where E(x) is the block of E(X) corresponding to x, e_j is a pixel of E(x), and m^2 is the number of pixels in the block. The fifth row of Figure 1 shows edge-strength maps f4(X) for various images.
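The per-block computations for f1 through f4 can be sketched as follows. This is an illustrative reading of Equations (1)-(4), not the authors' implementation: it assumes the CIELAB planes have already been computed (e.g., following Ref. 18), that RGB pixel values are 8-bit, and that the binary Roberts edge map E(X) has been computed beforehand; the function names and argument conventions are ours.

```python
import numpy as np

def f1_lightness_distance(lab_block, lab_image_means):
    """Eq. (1): |mean L* of the block - mean L* of the whole image|.

    lab_block is an 8x8x3 block of the CIELAB image; lab_image_means
    holds the image-wide means (L*_X, a*_X, b*_X).
    """
    return abs(lab_block[..., 0].mean() - lab_image_means[0])

def f2_color_distance(lab_block, lab_image_means):
    """Eq. (2): Euclidean distance between block and image means in (a*, b*)."""
    return np.hypot(lab_block[..., 1].mean() - lab_image_means[1],
                    lab_block[..., 2].mean() - lab_image_means[2])

def f3_rms_contrast(rgb_block, b=0.7297, k=0.0376, gamma=2.2):
    """Eq. (3): RMS contrast of the luminance-valued block.

    Grayscale via 0.299R + 0.587G + 0.114B, then l = (b + k * x_g)**gamma
    under the stated sRGB display assumptions; pixel values are assumed
    to lie in [0, 255].
    """
    xg = (0.299 * rgb_block[..., 0] + 0.587 * rgb_block[..., 1]
          + 0.114 * rgb_block[..., 2])
    lum = (b + k * xg) ** gamma
    mu = lum.mean()
    return lum.std() / mu if mu > 0 else 0.0

def f4_edge_strength(edge_block):
    """Eq. (4): fraction of edge pixels in this block of the binary
    Roberts edge map E(X), i.e., the block mean (1/m^2) * sum of e_j."""
    return edge_block.mean()
```

Each of these functions can be paired with the block-iteration sketch of Section 2.1 (passing the CIELAB image, the RGB image, or the edge map, respectively); the sharpness feature f5 is obtained from the S3 estimator of Ref. 21 and is not sketched here.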
2.1.4. Sharpness

In general, the sharper an image, the better its quality. If an image is blurred, it is difficult to distinguish between neighboring objects; blurring also reduces the ability to visually recognize objects. Thus, sharpness can potentially be a useful feature for estimating image quality.

Let f5(X) denote the sharpness map for image X. For measuring local sharpness, we employ our own S3 sharpness estimator,21 in which local sharpness is measured in both the frequency domain and the spatial domain. In the frequency domain, the image is divided into 32x32-pixel blocks with 75% overlap, and the slope of the power spectrum averaged across all orientations serves as the spectral sharpness measure. In the spatial domain, the image is divided into 8x8-pixel blocks, and a measure of the local total variation serves as the spatial sharpness measure. The two sharpness measures are then combined via a geometric mean. The result is a map in which each value represents local sharpness. The sixth row of Figure 1 shows sharpness maps f5(X) for various images.

2.2. PSNR and Correlation Between Feature Maps

Given the five feature maps, we estimate quality based on the extent to which the feature maps of the distorted image differ from the feature maps of the reference image. We employ the PSNR and the Pearson correlation coefficient to quantify the overall difference between each pair of maps (distorted image's map vs. reference image's map). Let F-PSNR denote this feature-based quality measure. A block diagram of the F-PSNR computation is shown in Figure 2.

Let Xr and Xd denote the reference and distorted images, respectively. The PSNR between each pair of feature maps is given by

PSNR(f_i(X_r), f_i(X_d)) = 10 \log_{10} \frac{R^2}{MSE},   (5)

where fi(Xr) and fi(Xd) denote the ith feature map for images Xr and Xd, respectively; R denotes the peak value of the signal; and MSE denotes the mean-squared error between fi(Xr) and fi(Xd).

[Figure 2. Block diagram of the F-PSNR quality measure.]

We also compute the linear correlation coefficient between the corresponding maps from the two images, given by

CORR(f_i(X_r), f_i(X_d)) = \frac{ \sum_{n_1,n_2} \left( f_i(X_r)_{n_1,n_2} - \overline{f_i(X_r)} \right) \left( f_i(X_d)_{n_1,n_2} - \overline{f_i(X_d)} \right) }{ \sqrt{ \sum_{n_1,n_2} \left( f_i(X_r)_{n_1,n_2} - \overline{f_i(X_r)} \right)^2 } \, \sqrt{ \sum_{n_1,n_2} \left( f_i(X_d)_{n_1,n_2} - \overline{f_i(X_d)} \right)^2 } },   (6)

where fi(Xr)_{n1,n2} and fi(Xd)_{n1,n2} denote the (n1, n2) element of fi(Xr) and fi(Xd), respectively; and where \overline{f_i(X_r)} and \overline{f_i(X_d)} denote the mean of fi(Xr) and fi(Xd), respectively.

Finally, F-PSNR is computed by multiplying each correlation coefficient with the corresponding PSNR, and then averaging the products:

F\text{-}PSNR = \frac{1}{5} \sum_{i=1}^{5} PSNR(f_i(X_r), f_i(X_d)) \times CORR(f_i(X_r), f_i(X_d)).   (7)

As we will demonstrate in Section 3, F-PSNR on its own performs quite competitively with current state-of-the-art IQA algorithms in predicting quality. However, additional improvements in predictive performance can potentially be gained by combining F-PSNR with an existing IQA algorithm. In the following section, we describe a combination of F-PSNR and the MAD IQA algorithm.9
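Equations (5)-(7) translate into a fairly direct computation, sketched below in Python. This is our own hedged reading rather than the authors' code: in particular, the paper does not state how the peak value R is chosen for feature maps (for pixel-domain PSNR it would be 255), so the sketch defaults to the maximum value observed over the pair of maps, which is an assumption.

```python
import numpy as np

def f_psnr(ref_maps, dst_maps, peak=None):
    """Compute F-PSNR (Eqs. 5-7) from two lists of five feature maps.

    ref_maps[i] and dst_maps[i] are the i-th feature maps f_i(X_r) and
    f_i(X_d) as 2-D arrays of equal size.  `peak` is the peak value R
    in Eq. (5); if None, the maximum over the pair of maps is used
    (an assumption of this sketch).
    """
    terms = []
    for fr, fd in zip(ref_maps, dst_maps):
        mse = np.mean((fr - fd) ** 2)
        r = peak if peak is not None else max(fr.max(), fd.max())
        psnr = 10.0 * np.log10(r ** 2 / mse) if mse > 0 else np.inf  # Eq. (5)
        corr = np.corrcoef(fr.ravel(), fd.ravel())[0, 1]             # Eq. (6)
        terms.append(psnr * corr)
    return np.mean(terms)                                            # Eq. (7)
```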
2.3. F-MAD: Augmenting MAD with F-PSNR

To investigate the effectiveness of F-PSNR as a supplement to existing IQA algorithms, we augmented the MAD (Most Apparent Distortion)9 algorithm with F-PSNR. MAD was one of the first algorithms to demonstrate that quality can be predicted by modeling two strategies employed by the HVS, and by adapting these strategies based on the amount of distortion. For high-quality images, in which the distortion is less noticeable, the image is most apparent, and thus the HVS attempts to look past the image and look for the distortion (a detection-based strategy). For low-quality images, the distortion is most apparent, and thus the HVS attempts to look past the distortion and look for the image's subject matter (an appearance-based strategy).

In MAD, two main stages are employed: (1) a detection-based stage, which computes the perceived distortion due to visual detection of distortions, d_detect; and (2) an appearance-based stage, which computes the perceived distortion due to visual appearance changes, d_appear. The detection-based stage of MAD computes d_detect by using a masking-weighted block-based mean-squared error which is computed in the lightness domain. The appearance-based stage of MAD computes d_appear by computing the average difference between the block-based log-Gabor statistics of the original image and those of the distorted image.

To augment MAD with F-PSNR, we employ the following weighted geometric mean:

F\text{-}MAD = (d_{detect})^{\alpha} (d_{appear})^{\beta} (F\text{-}PSNR)^{-\gamma},   (8)

where d_detect and d_appear denote the outputs of MAD's detection-based and appearance-based stages, respectively. The parameters β and γ are given by β = (1 − α)/2 and γ = 1 − α − β, where α is the blending parameter computed in the original MAD algorithm:

\alpha = \frac{1}{1 + \beta_1 (d_{detect})^{\beta_2}},   (9)

where β1 = 0.32 and β2 = 0.132. As argued in Ref. 9, Equation (9) was designed to give greater weight to d_detect for high-quality images and greater weight to d_appear for low-quality images. Here, because F-PSNR does not take into account visual masking, we chose β = (1 − α)/2 and γ = 1 − α − β so that F-PSNR supplements MAD's appearance-based stage rather than its detection-based stage.
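Given MAD's two stage outputs and F-PSNR, the combination in Equations (8) and (9) can be sketched as follows. Here d_detect and d_appear are assumed to come from an existing MAD implementation, which is not reproduced.

```python
def f_mad(d_detect, d_appear, fpsnr, beta1=0.32, beta2=0.132):
    """Combine MAD's stage outputs with F-PSNR (Eqs. 8-9).

    d_detect, d_appear: outputs of MAD's detection- and appearance-based
    stages (from an existing MAD implementation).
    fpsnr: the F-PSNR value from Eq. (7).
    """
    alpha = 1.0 / (1.0 + beta1 * d_detect ** beta2)  # Eq. (9)
    beta = (1.0 - alpha) / 2.0                       # weight on d_appear
    gamma = 1.0 - alpha - beta                       # = (1 - alpha) / 2, weight on F-PSNR
    # Eq. (8): weighted geometric mean.  F-PSNR enters with a negative
    # exponent because larger F-PSNR indicates higher quality, whereas
    # larger d_detect / d_appear indicate more visible distortion.
    return (d_detect ** alpha) * (d_appear ** beta) * (fpsnr ** -gamma)
```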
3. RESULTS

We applied F-PSNR and F-MAD to two publicly available databases of subjective image quality: LIVE22 and CSIQ.19 We compared F-PSNR and F-MAD with conventional PSNR and five other modern full-reference IQA algorithms for which code is publicly available: SSIM,10 MS-SSIM,11 VIF,12 VSNR,7 and MAD.9 Four measures of performance were employed: Pearson correlation coefficient (CC), Spearman rank-order correlation coefficient (SROCC), outlier ratio (OR), and outlier distance (OD). For all IQA algorithms, a four-parameter sigmoid was applied before computing CC, OR, and OD to compensate for nonlinear relations between the predictions and subjective scores (a sketch of this fitting step is given after Table 1).

Table 1 lists the resulting CC, SROCC, OR, and OD of each algorithm on each database. Notice from Table 1 that F-PSNR outperforms PSNR, SSIM, MS-SSIM, and VSNR. In terms of CC, F-PSNR yields values of 0.949 and 0.931 on LIVE and CSIQ, respectively. This finding suggests that changes to the feature maps caused by the distortions can be an effective proxy for estimating quality.

For F-MAD, the results in Table 1 demonstrate that the combination of F-PSNR and MAD may or may not lead to improved predictions over MAD alone. In terms of CC, F-MAD yields values of 0.970 and 0.962 on LIVE and CSIQ, respectively, whereas MAD alone yields 0.968 and 0.950 on these databases. The improvement on the LIVE database is negligible; however, the improvement on the CSIQ database is significant. (For comparison, the next overall best performer, VIF, yields CC values of 0.960 and 0.925 on the respective databases.) Although F-PSNR on its own shows promise, there is clearly a need for further research into proper techniques of combining F-PSNR with existing IQA algorithms.

Table 1. Performances of F-MAD and other quality assessment algorithms on images from the LIVE and CSIQ databases. The results in the Average rows denote averages weighted by the number of images in the databases. The best performance for each measure and database is achieved by F-MAD.

                   PSNR    SSIM    MS-SSIM  VSNR    VIF     MAD     F-PSNR  F-MAD
CC      LIVE       0.871   0.938   0.933    0.923   0.960   0.968   0.949   0.970
        CSIQ       0.800   0.815   0.897    0.800   0.925   0.950   0.931   0.962
        Average    0.835   0.876   0.915    0.862   0.942   0.959   0.940   0.966
SROCC   LIVE       0.876   0.947   0.944    0.928   0.963   0.968   0.953   0.970
        CSIQ       0.806   0.837   0.914    0.811   0.919   0.947   0.929   0.956
        Average    0.841   0.892   0.929    0.869   0.941   0.957   0.941   0.963
OR      LIVE       0.682   0.592   0.619    0.588   0.546   0.415   0.557   0.398
        CSIQ       0.343   0.335   0.245    0.311   0.226   0.180   0.216   0.170
        Average    0.512   0.463   0.432    0.449   0.386   0.297   0.387   0.284
OD      LIVE       4943    2814    2960     3247    1890    1370    2331    1282
        CSIQ       3178    2896    1528     3325    1218    626     936     579
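For completeness, the nonlinear mapping applied before computing CC can be sketched as below. The paper does not give its exact four-parameter sigmoid, so the particular logistic form, the initial parameter guesses, and the use of SciPy's curve_fit are all assumptions of this sketch; SROCC, being rank-based, is unaffected by a monotonic fit, and OR/OD are omitted here because they additionally require the per-image standard deviations of the subjective scores.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr

def logistic4(x, t1, t2, t3, t4):
    # A common four-parameter logistic; the exact sigmoid used in the
    # paper is not specified, so this form is an assumption.
    return (t1 - t2) / (1.0 + np.exp(-(x - t3) / abs(t4))) + t2

def correlation_performance(predictions, dmos):
    """Fit the sigmoid to the raw predictions, then report CC and SROCC.

    predictions, dmos: 1-D numpy arrays of algorithm outputs and
    subjective (DMOS) scores for the same set of images.
    """
    p0 = [dmos.max(), dmos.min(), predictions.mean(), predictions.std() + 1e-6]
    params, _ = curve_fit(logistic4, predictions, dmos, p0=p0, maxfev=20000)
    fitted = logistic4(predictions, *params)
    cc = pearsonr(fitted, dmos)[0]           # Pearson CC after the nonlinear fit
    srocc = spearmanr(predictions, dmos)[0]  # rank-based, unaffected by the fit
    return cc, srocc
```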
4. CONCLUSIONS

This paper described the results of a study designed to investigate the effectiveness of PSNR as a quality estimator when measured between feature maps for the reference and distorted images. Given a reference and distorted image, five feature maps are measured for each image (lightness distance, color distance, contrast, edge strength, and sharpness). Quality is then estimated based on the extent to which the feature maps for the reference image differ from the corresponding maps for the distorted image. We demonstrated how this feature-map-based approach (F-PSNR) can yield a competitive IQA strategy, and how it can be used to augment and improve an existing IQA algorithm (F-MAD).

5. ACKNOWLEDGMENTS

This material is based upon work supported by, or in part by, the National Science Foundation under Award 0917014.

REFERENCES

1. B. Moulden, F. A. A. Kingdom, and L. F. Gatley, "The standard deviation of luminance as a metric for contrast in random-dot images," Perception 19, pp. 79-101, 1990.
2. N. Damera-Venkata, T. D. Kite, W. S. Geisler, B. L. Evans, and A. C. Bovik, "Image quality assessment based on a degradation model," IEEE Transactions on Image Processing 9, 2000.
3. K. Egiazarian, J. Astola, N. Ponomarenko, V. Lukin, F. Battisti, and M. Carli, "New full-reference quality metrics based on HVS," in Proceedings of the Second International Workshop on Video Processing and Quality Metrics, Scottsdale, AZ, USA.
4. P. Le Callet, A. Saadane, and D. Barba, "Frequency and spatial pooling of visual differences for still image quality assessment," Proc. SPIE Human Vision and Electronic Imaging V 3959, pp. 595-603, 2000.
5. Sarnoff Corporation, "JNDMetrix technology."
6. A. Ninassi, O. Le Meur, P. Le Callet, and D. Barba, "Which semi-local visual masking model for wavelet based image quality metric?," in Image Processing, 2008. ICIP 2008. 15th IEEE International Conference on, pp. 1180-1183, Oct. 2008.
7. D. M. Chandler and S. S. Hemami, "VSNR: A wavelet-based visual signal-to-noise ratio for natural images," IEEE Transactions on Image Processing 16(9), pp. 2284-2298, 2007.
8. V. Laparra, J. Muñoz-Marí, and J. Malo, "Divisive normalization image quality metric revisited," J. Opt. Soc. Am. A 27, pp. 852-864, Apr. 2010.
9. E. C. Larson and D. M. Chandler, "Most apparent distortion: full-reference image quality assessment and the role of strategy," Journal of Electronic Imaging 19(1), p. 011006, 2010.
10. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing 13, pp. 600-612, 2004.
11. Z. Wang, E. Simoncelli, and A. Bovik, "Multiscale structural similarity for image quality assessment," in Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers 2, pp. 1398-1402, Nov. 2003.
12. H. R. Sheikh and A. C. Bovik, "Image information and visual quality," IEEE Transactions on Image Processing 15(2), pp. 430-444, 2006.
13. A. Shnayderman, A. Gusev, and A. M. Eskicioglu, "An SVD-based grayscale image quality measure for local and global assessment," IEEE Transactions on Image Processing 15(2), pp. 422-429, 2006.
14. M. Liu and X. Yang, "A new image quality approach based on decision fusion," in Fuzzy Systems and Knowledge Discovery, Fourth International Conference on 4, pp. 10-14, 2008.
15. P. Peng and Z. Li, "Image quality assessment based on distortion-aware decision fusion," in Proceedings of the Second Sino-foreign-interchange Conference on Intelligent Science and Intelligent Data Engineering, pp. 644-651, 2012.
16. D. M. Chandler, "Seven challenges in image quality assessment: Past, present, and future research," ISRN Signal Processing, 2012, in press.
17. N. Ponomarenko, V. Lukin, A. Zelensky, K. Egiazarian, M. Carli, and F. Battisti, "TID2008 - a database for evaluation of full-reference visual quality assessment metrics," Advances of Modern Radioelectronics 10, pp. 30-45, 2009.
18. C. Vu and D. M. Chandler, "Main subject detection via adaptive feature refinement," Journal of Electronic Imaging 20, Mar. 2011.
19. E. C. Larson and D. M. Chandler, "Categorical subjective image quality CSIQ database," 2009.
20. L. G. Roberts, "Machine perception of three-dimensional solids," Optical and Electro-Optical Information Processing, MIT Press, Cambridge, MA, 1965.
21. C. T. Vu, T. D. Phan, and D. M. Chandler, "S3: A spectral and spatial measure of local perceived sharpness in natural images," IEEE Transactions on Image Processing 21, pp. 934-945, Mar. 2012.
22. H. Sheikh, Z. Wang, L. Cormack, and A. Bovik, "LIVE image quality assessment database Release 2."