FAST MODE DECISION ALGORITHM FOR INTRA PREDICTION IN HEVC
INTERIM REPORT

LANKA NAGA VENKATA SAI SURYA TEJA
STUDENT ID: 1000916473
MAIL ID [email protected]
DATE: 04/24/2014

UNDER THE GUIDANCE OF DR. K. R. RAO
EE 5359 MULTIMEDIA PROCESSING
UNIVERSITY OF TEXAS AT ARLINGTON

ACRONYMS:
BD-Bitrate - Bjøntegaard Delta Bitrate
BD-PSNR - Bjøntegaard Delta Peak Signal-to-Noise Ratio
CU - Coding Unit
DCT - Discrete Cosine Transform
DST - Discrete Sine Transform
HEVC - High Efficiency Video Coding
JCT-VC - Joint Collaborative Team on Video Coding
LCU - Largest Coding Unit
MPM - Most Probable Mode
PSNR - Peak Signal-to-Noise Ratio
PU - Prediction Unit
QP - Quantization Parameter
RDOQ - Rate-Distortion Optimized Quantization
RDO - Rate-Distortion Optimization
RMD - Rough Mode Decision
SSIM - Structural Similarity Index
TU - Transform Unit

PROPOSAL:
To improve the coding efficiency of intra frame coding, up to 34 intra prediction modes are defined in High Efficiency Video Coding (HEVC) [1]. For each block, the best of these pre-defined intra prediction modes is selected by rate-distortion optimization (RDO). Testing every direction in the RDO process is time-consuming, so this project proposes a method that reduces the number of candidates passed to RDO. On average, the method provides time savings of 20% and 28% in the intra high-efficiency and low-complexity cases, respectively, compared to the default encoding scheme in HM 13.0 [5], with almost the same coding efficiency. The fast mode decision algorithm for intra prediction [] in HEVC can also be evaluated against the default encoding scheme in HM 13.0 [5] in terms of PSNR, BD-PSNR and BD-Bitrate.

Index Terms: video coding, HEVC, intra prediction

INTRODUCTION:
The HEVC standard [2] provides a highly flexible hierarchy of unit representation which consists of three block concepts: coding unit (CU), prediction unit (PU), and transform unit (TU).
This separation of the block structure allows each unit to be optimized for its own role. The CU is a macroblock-like unit for region splitting; it is always square, and its size ranges from 8x8 luma samples up to the size of the largest coding unit (LCU). This concept allows recursive splitting into four equal-sized blocks, starting from the LCU. The process gives a content-adaptive coding tree structure composed of CU blocks. Figure 4 shows the coding tree structure [3]. The PU is defined only for a CU that is a leaf node in the quadtree structure, and its two possible sizes are 2Nx2N and NxN. The size of the third block concept, the TU, cannot exceed that of the CU. Figure 1 shows the block diagram of the H.264 encoder [20].

Figure 1: Block diagram of H.264 encoder [20]

Figure 2: Block diagram of HEVC encoder [15]

Figure 2 shows the block diagram of the HEVC encoder [15], in which each picture is partitioned into blocks of different sizes and the partitioning is conveyed to the decoder. Intra prediction, which exploits the spatial redundancy within a picture, is applied to the first picture of the sequence, while temporal redundancy is exploited for the remaining frames using inter prediction. Since the encoder needs to exhaust all combinations of CU, PU and TU to find the optimal solution, encoding is very time-consuming. Testing every prediction direction in the rate-distortion optimization process is impractical for the encoder. To reduce the computational complexity of the encoder, a fast intra mode decision [4] was adopted in HM 13.0 [5]. Figure 3 shows the block diagram of the HEVC decoder [15].

Figure 3: Block diagram of HEVC decoder [15]

Figure 4: Coding tree structure [3]

OVERVIEW OF INTRA PREDICTION IN HEVC:
In H.264, intra prediction [6][7][8][9] is based on spatial extrapolation of samples from previously decoded image blocks, followed by integer discrete cosine transform (DCT) [10] based coding.
HEVC utilizes the same principle, but extends it to efficiently represent a wider range of textural and structural information in images. HEVC contains several elements that improve the efficiency of intra prediction over earlier solutions. The introduced methods can accurately model different structures as well as smooth regions with gradually changing sample values. Figure 5 shows the intra prediction modes of HEVC [7] and Figure 6 shows the intra prediction modes of H.264 [21].

Figure 5: HEVC intra prediction modes [7]

Figure 6: H.264 intra prediction modes [21]

                   Total intra angular modes
Prediction size    HEVC/H.265 (64x64)           H.264/AVC (16x16)
64x64              5                            NA
32x32              34                           NA
16x16              34                           4
8x8                34                           9
4x4                17                           9
Total modes        64x(5+34+34+34+17)=7936      16x(16x9+4x9+4)=2944

Table 1: Comparing HEVC intra luma prediction modes for a 64x64 LCU with H.264/AVC intra modes for a 64x64 image region [11]

METHOD PROPOSED FOR FAST MODE DECISION ALGORITHM FOR INTRA PREDICTION:
The fast intra prediction consists of three steps.
1 - Hadamard transformed coefficients of residual signal [13]
2 - Progressive mode search [13]
3 - Early RDOQ termination [13]
Combining these three steps gives the fast mode decision algorithm.

The unified intra prediction in HM 13.0 first selects the N best candidate modes through a rough mode decision (RMD) process, in which every mode is evaluated by a cost that combines the sum of absolute Hadamard-transformed coefficients of the residual signal with the mode bits. RD optimization is then applied only to these N candidates instead of to all intra prediction modes. However, the computational load of the encoder is still very high. In addition, the intra prediction modes of neighboring blocks are generally correlated, which is not considered in HM 13.0. Therefore, there is still room for further reducing the encoder complexity.
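The rough mode decision described above can be illustrated with a small sketch. This is not the HM 13.0 implementation: the function names, the unnormalised Hadamard transform, the Lagrange-style weight `lam` on the mode bits, and the `forced_modes` hook are all assumptions made for illustration, and the candidate predictions are simply supplied by the caller rather than derived from reference samples.

```python
def hadamard(n):
    """Build the n x n Hadamard matrix (n is assumed to be a power of two)."""
    h = [[1]]
    while len(h) < n:
        # Sylvester construction: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def satd(orig, pred):
    """Sum of absolute Hadamard-transformed residual coefficients (unnormalised)."""
    residual = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(orig, pred)]
    h = hadamard(len(orig))
    # H is symmetric, so H * R * H^T equals H * R * H.
    coeffs = matmul(matmul(h, residual), h)
    return sum(abs(c) for row in coeffs for c in row)


def rough_mode_decision(orig, predictions, mode_bits, lam, n_best, forced_modes=()):
    """Rank modes by SATD + lam * bits and keep the n_best cheapest.

    predictions: dict mode -> predicted block (hypothetical input format).
    forced_modes: hypothetical hook for modes that must stay in the list.
    """
    costs = {m: satd(orig, p) + lam * mode_bits[m] for m, p in predictions.items()}
    ranked = sorted(costs, key=lambda m: (costs[m], m))
    candidates = ranked[:n_best]
    for m in forced_modes:
        if m not in candidates:
            candidates.append(m)
    return candidates, costs


def flat_block(value, n=4):
    """Helper for the toy example: an n x n block filled with one value."""
    return [[value] * n for _ in range(n)]


# Toy example: three made-up "modes" predicting a flat 4x4 block of value 10.
example_candidates, example_costs = rough_mode_decision(
    orig=flat_block(10),
    predictions={0: flat_block(9), 1: flat_block(10), 26: flat_block(12)},
    mode_bits={0: 1, 1: 3, 26: 3},
    lam=2,
    n_best=2,
)
print(example_candidates, example_costs)
```

Only the modes returned in the candidate list would go on to full RD optimization. Shrinking `n_best` reduces the number of RDO evaluations, while `forced_modes` lets specific modes (for example ones suggested by neighboring blocks) stay in the list regardless of their rank.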
To further relieve the computational load of the encoder, it is important to reduce the number of candidates for the RDO process and to make full use of the information from neighboring blocks. In this project, fewer of the best RMD modes are passed to RDO, and the most probable mode (MPM) is always included in the candidates for RDO.

TEST SEQUENCES USED:
[1] BQSquare_416x240_60 [16]
[2] BQMall_832x480_60 [16]
[3] KristenAndSara_1280x720_60 [16]

EXPERIMENTAL RESULTS OF BQMall_832x480_60:
Tables 2 and 3 show the implementation results for the test sequence BQMall_832x480_60 with 30 frames.

Analysis of results for the default method with number of frames 30:

QP    BITRATE (kbps)    PSNR (avg) (dB)    ENCODING TIME (sec)
24    23836.5280        41.1352            519.561
28    16075.4080        38.6716            515.1675
32    10704.5280        36.2715            426.831
34    8506.4960         34.9803            407.187

Table 2: Implementation results for default method with number of frames 30 for BQMall_832x480_60

Analysis of results for HM 13.0 using the proposed algorithm with number of frames 30:

QP    BITRATE (kbps)    PSNR (avg) (dB)    ENCODING TIME (sec)
24    24074.8932        40.9944            396.944
28    16236.1620        38.5300            383.799
32    10811.5732        36.1225            312.44
34    8591.5609         34.8425            297.246

Table 3: Implementation results for modified method with number of frames 30 for BQMall_832x480_60

Tables 4 and 5 show the implementation results for the test sequence BQMall_832x480_60 with 10 frames.
Analysis of results for the default method with number of frames 10:

QP    BITRATE (kbps)    PSNR (avg) (dB)    ENCODING TIME (sec)
24    23754.8160        41.137             176.624
28    16010.5920        38.6744            161.116
32    10612.8000        36.2830            151.866
34    8447.7600         34.9856            134.619

Table 4: Implementation results for default method with number of frames 10 for BQMall_832x480_60

Analysis of results for HM 13.0 using the proposed algorithm with number of frames 10:

QP    BITRATE (kbps)    PSNR (avg) (dB)    ENCODING TIME (sec)
24    24046.1849        40.9975            134.234
28    16227.4525        38.5344            120.837
32    10718.928         36.1320            112.380
34    8532.2376         34.8473            98.271

Table 5: Implementation results for modified method with number of frames 10 for BQMall_832x480_60

Encoding Time Vs QP Graphs:
Figures 7 and 8 plot encoding time against QP for the test sequence BQMall_832x480_60 with 10 and 30 frames, respectively.

Figure 7: Encoding time (sec) vs. QP for BQMall_832x480_60 with number of frames 10 (curves: unmodified method, modified method)

Figure 8: Encoding time (sec) vs. QP for BQMall_832x480_60 with number of frames 30 (curves: unmodified method, modified method)

Graph Between PSNR (avg) vs Bitrate:
Figures 9 and 10 show the bitrate-PSNR plots for the test sequence BQMall_832x480_60 with 10 and 30 frames, respectively.
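The encoding times in Tables 2 to 5 can be converted into relative time savings with a few lines of arithmetic. The sketch below copies the times from those tables; the dictionary layout and the function names are assumptions made for illustration and are not part of the HM software.

```python
# Encoding times in seconds, copied from Tables 2-5: {frames: {QP: time}}.
DEFAULT_TIME = {
    30: {24: 519.561, 28: 515.1675, 32: 426.831, 34: 407.187},
    10: {24: 176.624, 28: 161.116, 32: 151.866, 34: 134.619},
}
MODIFIED_TIME = {
    30: {24: 396.944, 28: 383.799, 32: 312.44, 34: 297.246},
    10: {24: 134.234, 28: 120.837, 32: 112.380, 34: 98.271},
}


def time_saving_percent(t_default, t_modified):
    """Relative reduction in encoding time, as a percentage of the default time."""
    return 100.0 * (t_default - t_modified) / t_default


def savings_table(default, modified):
    """Apply time_saving_percent to every (frames, QP) pair."""
    return {
        frames: {
            qp: time_saving_percent(t, modified[frames][qp])
            for qp, t in per_qp.items()
        }
        for frames, per_qp in default.items()
    }


SAVINGS = savings_table(DEFAULT_TIME, MODIFIED_TIME)

for frames in sorted(SAVINGS):
    for qp in sorted(SAVINGS[frames]):
        print(f"{frames} frames, QP {qp}: {SAVINGS[frames][qp]:.1f}% time saving")
```

With the table values, the savings fall between roughly 24% and 27% for every QP and frame count, which sits inside the 20-28% range quoted in this report.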
Figure 9: RD plot (PSNR (avg) in dB vs. bitrate in kbps) for BQMall_832x480_60 with number of frames 10 (curves: default method, proposed method)

Figure 10: RD plot (PSNR (avg) in dB vs. bitrate in kbps) for BQMall_832x480_60 with number of frames 30 (curves: default method, proposed method)

CONCLUSION:
For the modified method compared to unmodified HM 13.0:
• 0.1-0.5 dB loss in PSNR
• about 1.0-1.4% (roughly 84-291 kbps) increase in bit-rate, according to Tables 2 to 5
• 20-28% reduction in encoding time
WQVGA to SD sequences of 10 frames and 30 frames each were tested at QPs 24, 28, 32 and 34. Visual quality was maintained.

FUTURE WORK:
BD-PSNR and BD-bitrate can be compared for the default method and the modified method. The modified method can also be tested on the sequences BQSquare_416x240_60 and KristenAndSara_1280x720_60 for different QP values with 10 and 30 frames.

REFERENCES:
[1] G. J. Sullivan et al, "Overview of the high efficiency video coding (HEVC) standard", IEEE Trans. circuits and systems for video technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[2] JCT-VC, "WD1: Working Draft 1 of High-Efficiency Video Coding", JCTVC-C403, JCT-VC Meeting, Guangzhou, October 2010.
[3] Coding tree structure - https://www.google.com/search?q=coding+tree+structure+in+hevc
[4] Y. Piao et al, "Encoder improvement of unified intra prediction", JCTVC-C207, Guangzhou, October 2010.
[5] Software for HEVC: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware
[6] T. L. Silva et al, "HEVC intra coding acceleration based on tree inter-level mode correlation", SPA 2013, Poznan, Poland, Sep. 2013.
[7] H. Zhang and Z. Ma, "Fast intra prediction for high efficiency video coding", Pacific Rim Conf. on Multimedia, PCM 2012, Singapore, Dec. 2012.
[8] M. Zhang et al, "An adaptive fast intra mode decision in HEVC", IEEE ICIP 2012, pp. 221-224, Orlando, FL, Sept.-Oct. 2012.
[9] Y. Kim et al, "A fast intra-prediction method in HEVC using rate-distortion estimation based on Hadamard transform", ETRI Journal, vol. 35, no. 2, pp. 270-280, Apr. 2013.
[10] A. Saxena and F. Fernandes, "Mode dependent DCT/DST for intra prediction in block based image/video coding", IEEE ICIP, pp. 1685-1688, Sept. 2011.
[11] M. Khan et al, "An adaptive complexity reduction scheme with fast prediction unit decision for HEVC Intra encoding", IEEE ICIP, pp. 1578-1582, Sept. 2013.
[12] P. Mehta, "Complexity reduction for intra mode selection in HEVC using OpenMP", course website: http://www-ee.uta.edu/Dip/Courses/EE5359/ Section: previous projects, Sub section: Projects (Spring 2014).
[13] S. Vasudevan, "Fast intra prediction and fast residual quadtree encoding implementation in HEVC", course website: http://www-ee.uta.edu/Dip/Courses/EE5359/ Section: previous projects, Sub section: Projects (Spring 2014).
[14] K. R. Rao, D. N. Kim and J. J. Hwang, "Video coding standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1", Springer, 2014.
[15] G. Sullivan et al, "Standardized Extensions of the High Efficiency Video Coding (HEVC) Standard", IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.
[16] Test Sequences: ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/testsequences/
[17] F. Bossen et al, "HM Software Manual", JCT-VC of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, AHG chairs, January 2014.
[18] B. Bross et al, "High Efficiency Video Coding (HEVC) Text Specification Draft 10", Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Mar. 2013, available on http://phenix.it-sudparis.eu/jct/doc_end_user/current_document.php?id=7243
[19] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050, available on http://ip.hhi.de/imagecom_G1/assets/pdfs/JVT-G050.pdf
[20] I. E. G. Richardson, "The H.264 advanced video compression standard", 2nd Edition, Hoboken, NJ, Wiley, 2010.
[21] Intra Prediction Modes of H.264 - https://www.google.com/search?q=intra+prediction+modes+h.264
[22] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 22, pp. 1646-1909, Dec. 2012.
[23] Special issue on emerging research and standards in next generation video coding, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 23, pp. 2009-2142, Dec. 2013.
[24] IEEE Journal of Selected Topics in Signal Processing, vol. 7, pp. 931-1151, Dec. 2013.