IEEE SIGNAL PROCESSING LETTERS, VOL. XX, NO. XX, XXXX 200X

Optimization of the Deblocking filter in H.264 codec for real time implementation

Hitesh Yadav, Student Member, IEEE, and K. R. Rao, Member, IEEE

Abstract: Blocking artifacts are visible in the decoded frames of most video coding standards at low bit rates. The latest video coding standard, H.264/AVC, uses an in-loop deblocking filter to remove them. The main drawback of this filter is its high implementation complexity. In this paper, we propose an alternative in-loop deblocking filter of lower complexity. In the proposed method, the maximum and minimum values among the six pixels across an edge are computed to decide whether the pixels of the block should be filtered. For intra frames, the block is further classified as a smooth or a mildly textured region, and the appropriate filter is applied according to that classification. The main advantage of the proposed method is its low complexity compared to JM 9.8 (H.264 software). A performance comparison of the proposed method and the current method is presented.

Index Terms: Deblocking filter, post filter, loop filter, H.264 standard.

EDICS Category: IMD-CODE

I. INTRODUCTION

Signal source compression methods and coding bit rates normally influence the perceptual quality of compressed images and video [1]. In general, the lower the bit rate, the more severe the coding artifacts that appear in the reconstructed video. Lower bit rates are nevertheless desirable in many applications, such as video streaming, because of channel bandwidth constraints. Coding schemes based on the block discrete cosine transform (BDCT) introduce blocking artifacts in flat regions and ringing artifacts along object edges at low bit rates [1]. Deblocking filters are used to remove the blocking artifacts in the decoded video.
Although deblocking filters improve the objective and subjective quality of the output video frames, they are usually computationally intensive.

A number of deblocking algorithms have been proposed for reducing blocking artifacts in BDCT-based compressed images with minimal smoothing of true edges. They can be classified into three key categories: projection onto convex sets (POCS), weighted sums of pixels across block boundaries, and adaptive filters. The POCS-based algorithm [2] iteratively projects back and forth between two sets on the entire picture. Its computation and implementation complexity is high compared to the other two categories, but it gives the best visual quality for most video. The weighted-sum algorithms [3] have higher computational complexity than the adaptive algorithms. Because the adaptive algorithms [4] have low computational complexity, they are preferred for real-time implementation.

Manuscript received July xx, 2006; revised Xxxxx xx, 20xx. Hitesh Yadav and Dr. K. R. Rao are with the Electrical Engineering Department, University of Texas at Arlington, TX 76010 USA (e-mail: [email protected], [email protected]).

H.264/AVC uses an adaptive in-loop deblocking filter to remove the blocking artifacts visible in decoded frames at low bit rates. The main drawback of this deblocking filter is its implementation complexity. Analysis of run-time profiles of decoder sub-functions indicates that the deblocking filter process is the most computationally intensive part of H.264 decoding [5]. Although efficient techniques for reducing the implementation complexity have recently been proposed [6]-[8], the complexity cannot be reduced significantly because of the flow of the algorithm itself. The program code [9] includes extensive conditional branching, which makes it unsuitable for deeply pipelined processors and ASIC implementation. In addition, this program code exposes little parallelism.
Hence this code is unsuitable for VLIW processors, which are otherwise well suited to video encoding and decoding applications.

As the above shows, the H.264/AVC deblocking filter has high implementation complexity. In this paper, we present a simpler deblocking algorithm that reduces the implementation complexity while maintaining the perceptual quality of the existing deblocking algorithm. Section II presents the algorithm for both intra and inter frames. Section III discusses the results obtained with the proposed algorithm and with the JM reference software [9].

II. PROPOSED ALGORITHM

Blocking artifacts are visible in both intra and inter frames. The basic operation of the deblocking filter is as follows. The filter is applied to all edges of the 4x4 pixel blocks in each macroblock, except edges on the boundary of a frame or a slice. For each block, vertical edges are filtered from left to right first, and then horizontal edges are filtered from top to bottom. This process is repeated for all macroblocks in a frame. The one-dimensional view of a 4x4 block edge is shown in Fig. 1. Here q0, q1, q2, q3 represent the pixel values loaded from the current 4x4 block, and p0, p1, p2, p3 represent those of the 4x4 block adjacent to the current block.

A. INTRA FRAMES

Intra frames are more susceptible to blocking artifacts than inter frames [10], and the smooth blocks in an intra frame have more severe blocking artifacts than other blocks [10]. The proposed method as applied to intra frames is shown in Fig. 2. The first three blocks in Fig. 2 check the conditions at the slice boundaries, which are set by the user. These three blocks are the same as those used by the existing deblocking filter in H.264.

The next step in Fig. 2 is to compute the maximum and minimum values among the six pixels across an edge (p2, p1, p0, q0, q1, q2) and then calculate the difference between the maximum and the minimum. If this difference is greater than the QP of the current block, the pixels most likely represent a true edge in the image and therefore should not be filtered. If the difference is less than the QP of the current block, the block most likely represents a smooth or mildly textured area, and filtering should be applied to remove the blocking artifacts.

The next step in Fig. 2 is to find the differences between adjacent pixels of a 4x4 block edge. For example, the absolute difference between p3 and p2 is calculated, and if that difference is less than a fixed threshold (here set to two), the variable diff is assigned the value one. This is repeated for all adjacent pixels across the edge in Fig. 1 (seven pairs for the eight pixels p3 to q3, as in Fig. 2). The variable strength denotes the sum of diff over all these pairs. If strength is greater than a fixed threshold (here set to four), the block is most likely a smooth region and strong filtering is applied to it. If strength is less than this threshold, the block most likely represents a mildly textured area or high-activity region and weak filtering is applied. Several thresholds were tried before these values were chosen.

Here, strong filtering means applying a low-pass filter to the three pixels nearest the boundary on either side (p2, p1, p0, q0, q1, q2), and weak filtering means applying a low-pass filter only to the pixels immediately on either side of the boundary (p0, q0).

B. INTER FRAMES

The exploitation of interframe redundancies relies on the transfer of previously coded information from motion-compensated reference frames to the current predictive-coded picture. Having filtered reference frames available for motion compensation reduces the blocking effect, but some artifacts remain because the reference frames may not fit exactly at block boundaries. The proposed method is applied to inter frames as shown in Fig. 2. After the slice-boundary checks, the maximum and minimum values among the six pixels across an edge (p2, p1, p0, q0, q1, q2) are computed, and the difference between them is calculated. If this difference is less than QP, the current block most likely represents a smooth or mildly textured area; if the MB is inter, a low-pass filter is then applied to the pixels adjacent to the boundary on either side of the block or MB (the weak filtering branch of Fig. 2). Otherwise, if the difference between the maximum and minimum values is greater than QP, the current block most likely represents an edge and is therefore not filtered.

Fig. 1. 4x4 block edge (vertical or horizontal)

III. RESULTS FOR INTRA AND INTER FRAMES

The simulations are done using the JM 9.8/FRExt software [9] with the High profile. The Y, Cb, Cr sampling format used in the simulations is 4:2:0. The PSNR values for different test sequences [11] using the proposed method, JM 9.8 (H.264 software) [9], and reconstruction without a loop filter are given in Table 1. Reconstruction of intra frames with the proposed method gives better PSNR values than reconstruction without a loop filter, and PSNR values similar to those of reconstruction with the JM 9.8 loop filter (Table 1). Figures 3 and 4 show a reconstructed I-frame without a loop filter and with the proposed method, respectively, illustrating the removal of blocking artifacts. Figure 5 shows the same reconstruction using the JM 9.8 loop filter.
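As a reading aid for the comparisons in this section, the edge decision of Section II (Fig. 2) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation or the JM 9.8 code: the function name `choose_filter` and the constant names are hypothetical, the slice-boundary checks are omitted, the case where the difference equals QP is treated as "no filtering" (following the "diff < QP" test in Fig. 2), and the low-pass filter taps themselves are not shown because the text does not specify them.

```python
from typing import Sequence

# Thresholds quoted in Section II-A; the constant names are illustrative.
DIFF_THRESHOLD = 2      # an adjacent pair counts as "flat" if |a - b| < 2
STRENGTH_THRESHOLD = 4  # more than 4 flat pairs selects strong filtering


def choose_filter(p: Sequence[int], q: Sequence[int], qp: int, is_intra: bool) -> str:
    """Return "none", "weak" or "strong" for one line of pixels across an edge.

    p = (p0, p1, p2, p3) and q = (q0, q1, q2, q3); index 0 is nearest the edge.
    Slice-boundary checks (first three boxes of Fig. 2) are assumed done already.
    """
    # Range of the six pixels nearest the edge.
    six = (p[2], p[1], p[0], q[0], q[1], q[2])
    if max(six) - min(six) >= qp:
        # Likely a true image edge: leave it untouched.
        return "none"

    if not is_intra:
        # Inter macroblocks go straight to weak filtering in Fig. 2.
        return "weak"

    # Intra macroblocks: count flat adjacent pairs over p3..q3 (seven pairs).
    line = (p[3], p[2], p[1], p[0], q[0], q[1], q[2], q[3])
    strength = sum(1 for a, b in zip(line, line[1:]) if abs(a - b) < DIFF_THRESHOLD)
    return "strong" if strength > STRENGTH_THRESHOLD else "weak"
```

For example, `choose_filter((100, 100, 100, 100), (102, 102, 102, 102), qp=30, is_intra=True)` selects strong filtering, because the six-pixel range (2) is below QP and six of the seven adjacent differences are below two. The decision involves only one range test and one counting loop per line of pixels, which is the property the text relies on when it describes the flow as simple.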
Table 2 shows that the proposed loop filter gives bit savings compared to coding without a loop filter. Although a loop filter increases the visual quality of the decoded frames, for a few sequences its PSNR value is lower. Motion estimation is based on the sum of absolute differences (SAD). Because the filtered reference frames are used in motion compensation, a reference frame that is visually superior may not be superior in terms of SAD. For these reasons, a few sequences may require more bits when a loop filter is included in the codec. The PSNR values for different test sequences [11] using the proposed method, JM 9.8 (H.264 software) [9], and reconstruction without a loop filter are given in Table 3. The PSNR values are better for most sequences using the proposed method than using the JM method (H.264 software).

[Fig. 2 flowchart, boxes in order: (a) slice-level checks: "Deblocking disabled for all edges of the slice?", "Deblocking disabled for all edges of the slice boundaries?", "Is the edge at the slice boundary?"; (b) find the max. and min. values among 6 pixels across the vertical or horizontal edge and compute diff = abs(max - min); (c) "Is diff < QP?" If no, no filtering; (d) "Is MB intra?" If no, apply weak filtering across the edge; (e) if yes, calculate the offsets for 7 pixel pairs in the horizontal or vertical edge; for each pair, var = 1 if offset < 2, otherwise var = 0, and strength = strength + var; (f) "Is strength > 4?" If yes, apply strong filtering; if no, apply weak filtering.]

Fig. 2. Decision flow of filtering of pixels at the block edges

IV. CONCLUSIONS

The proposed method reduces the blocking artifacts in the reconstructed video. It gives visual quality almost similar to that obtained with the JM 9.8 (H.264 software) loop filter. It gives better PSNR values for most sequences, especially in the case of inter frames.
At the same time, the proposed method has lower implementation complexity than the JM 9.8 (H.264 software) loop filter, because its algorithmic flow is simpler. The proposed deblocking filter can be implemented in a real-time system, which would allow its exact reduction in implementation complexity compared to JM 9.8 to be determined. The deringing filter [10] can also be incorporated in the in-loop filter to see the visual improvement of the reconstructed video. The proposed method uses image enhancement techniques to reduce the artifacts in the reconstructed video. Image recovery techniques can also be explored to reduce the artifacts in H.264 decoded video. Transforms that do not produce blocking artifacts while still providing the benefits of the integer DCT can be explored as well.

TABLE 1. COMPARISON OF PSNR VALUES (dB) FOR DIFFERENT TEST SEQUENCES.

Test clip (QCIF)   QP   Proposed method   Without loop filter (JM 9.8)   With loop filter (JM 9.8)
Foreman            37   31.615            31.363                         31.637
Car phone          45   26.671            26.291                         26.612
Silent             39   29.489            29.269                         29.330
Container          39   29.539            29.383                         29.493
Bridge             37   29.886            29.738                         29.828
News               35   32.204            32.040                         32.043
Container          45   25.718            25.553                         25.710

TABLE 2. COMPARISON OF TOTAL NUMBER OF BITS USED FOR ENCODING A P OR B FRAME IN H.264 COMPRESSED SEQUENCES.

Test clip (QCIF) - type of frame   QP   Proposed method   Without loop filter (JM 9.8)   With loop filter (JM 9.8)
Foreman-P                          39   2085              2131                           2107
News-P                             39   4119              4074                           4235
Car phone-P                        39   817               897                            720
Foreman-B                          39   489               515                            499
News-B                             39   1424              1802                           1687
Car phone-B                        39   197               203                            193

TABLE 3. COMPARISON OF PSNR VALUES (dB) FOR DIFFERENT TEST SEQUENCES FOR P AND B FRAMES.

Test clip (QCIF) - type of frame   QP   Proposed method   Without loop filter (JM 9.8)   With loop filter (JM 9.8)
Foreman-P                          39   29.883            29.834                         29.692
News-P                             39   27.846            27.528                         27.771
Car phone-P                        39   30.271            30.024                         30.171
Foreman-B                          39   28.925            28.879                         29.054
News-B                             39   28.085            28.307                         27.982
Car phone-B                        39   29.637            29.438                         29.491

Fig. 3. A reconstructed I-frame from H.264 decoder, QP = 45, without using a loop filter

Fig. 4. A reconstructed I-frame from H.264 decoder, QP = 45, with proposed method

Fig. 5. A reconstructed I-frame from H.264 decoder, QP = 45, with JM (H.264 software) method

REFERENCES

[1] M.-Y. Shen and C.-C. Jay Kuo, “Review of postprocessing techniques for compression artifact removal,” Journal of Visual Communication and Image Representation, vol. 9, pp. 2-14, Mar. 1998.
[2] A. Zakhor, “Iterative procedures for reduction of blocking effects in transform image coding,” IEEE Trans. CSVT, vol. 2, pp. 91-95, Mar. 1992.
[3] A. Z. Averbuch, A. Schclar, and D. L. Donoho, “Deblocking of block-transform compressed images using weighted sums of symmetrically aligned pixels,” IEEE Trans. Image Processing, vol. 14, pp. 200-212, Feb. 2005.
[4] P. List et al., “Adaptive deblocking filter,” IEEE Trans. CSVT, vol. 13, pp. 614-619, July 2003.
[5] V. Lappalainen, A. Hallapuro, and T. D. Hamalainen, “Complexity of optimized H.26L video decoder implementation,” IEEE Trans. CSVT, vol. 13, pp. 717-725, July 2003.
[6] M. N. Bojnordi, O. Fatemi, and M. R. Hashemi, “An efficient deblocking filter with self-transposing memory architecture for H.264/AVC,” ICASSP, vol. II, pp. 925-928, May 2006.
[7] G. Khurana et al., “A pipeline hardware implementation of in-loop deblocking filter in H.264/AVC,” IEEE Trans. on Consumer Electronics, vol. 52, no. 2, May 2006.
[8] S. C. Chang et al., “A platform based bus-interleaved architecture for deblocking filter in H.264/MPEG-4 AVC,” IEEE Int’l Conf. on Consumer Electronics, 2005.
[9] H.264 software (JM9.8/FRExt) from http://iphome.hhi.de/suehring/tml/download/jm98.zip.
[10] H. R. Wu and K. R. Rao, “Digital video image quality and perceptual coding,” Taylor and Francis, 2006.
[11] Test sequences are obtained from: http://trace.eas.asu.edu/yuv/qcif.html