Transcoding of an MPEG-2 bit stream to an H.264 bit stream

What is transcoding?
- Transcoding is the operation of converting video from one format to another.
- Need: compatibility between MPEG-2 and H.264 devices.
- Applications: adapting the bit rate of a compressed stream to the channel bandwidth, changing the spatial or temporal resolution of a compressed stream, etc.

Criteria considered in heterogeneous transcoding
- The quality of the transcoded stream should be comparable both to that obtained by complete decoding and re-encoding with full motion search and to that of the initial input stream.
- The information in the input bit stream should be reused as much as possible to reduce multigenerational degradation.
- The computational cost and complexity should be kept minimal.

MPEG-2 decoder
[Block diagram: MPEG-2 bit stream -> Variable Length Decoding -> Inverse Scan -> Inverse Quantization -> Inverse DCT -> adder (+) -> Decoded Pels. Frame Store Memory and Motion Compensation supply the prediction to the adder.]

H.264 encoder

Transcoding algorithm

Intra frame coding

| Feature | MPEG-2 | H.264 |
|---|---|---|
| Macroblock modes supported | 8x8 | 16x16 with 4 directional modes, 4x4 with 9 directional modes |
| Type of intra prediction | Fixed prediction of DC coefficient | Adaptive directional prediction of 4x4 or 16x16 pixel blocks |
| Transform | 8x8 DCT | 4x4 integer transform |

Intra frame transcoding
[Block diagram: MPEG-2 bit stream -> VLC Decode -> Inverse Quantize -> IDCT -> subtractor (+/-) -> 4x4 Integer Transform -> Quantize -> Entropy Coding -> H.264 bit stream. Mode Decision and Spatial Prediction supply the prediction to the subtractor.]

Complexity of applying mode decisions in the transform domain
Example: vertical prediction. Predicted block = 

Intra modes in H.264/AVC
- Directional modes for an intra 4x4 block. [Figure: directional prediction, with directions labeled 0, 1, 3, 4, 5, 6, 7 and 8.]
- Directional modes for an intra 16x16 macroblock.

Mode decision algorithm

Why use standard deviation?
- It is a simple metric.
- It can be computed easily because the transform domain coefficients are already available.
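The remark that the standard deviation is cheap once the transform coefficients are available can be illustrated with Parseval's relation. For an orthonormal N x N DCT, the energy of the AC coefficients equals N*N times the pixel variance, so the standard deviation needs no inverse transform. The sketch below is only an illustration under that assumption. The function names are hypothetical, and it does not show how the slides' mode decision algorithm actually uses the metric, since that figure is not preserved.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix; row u is the u-th basis vector."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def dct2(block):
    """2-D orthonormal DCT of a square block, computed as C * block * C^T."""
    n = len(block)
    c = dct_matrix(n)
    tmp = [[sum(c[u][x] * block[x][y] for x in range(n)) for y in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][y] * c[v][y] for y in range(n)) for v in range(n)]
            for u in range(n)]

def std_from_dct(coeffs):
    """Pixel standard deviation from DCT coefficients only (assumed helper).

    The DC coefficient carries the mean, so the variance is the AC energy
    divided by the number of pixels (n * n).
    """
    n = len(coeffs)
    ac_energy = sum(coeffs[u][v] ** 2
                    for u in range(n) for v in range(n)
                    if (u, v) != (0, 0))
    return math.sqrt(ac_energy) / n

def std_from_pixels(block):
    """Reference computation in the pixel domain, for comparison."""
    values = [p for row in block for p in row]
    mean = sum(values) / len(values)
    return math.sqrt(sum((p - mean) ** 2 for p in values) / len(values))
```

In a transcoder the coefficients would come from the MPEG-2 inverse quantizer, so only `std_from_dct` would run per block. `dct2` and `std_from_pixels` are included here only to check the identity on synthetic data.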
Post mode decision

Intra frame transcoding results

Subjective quality of intra frames (test clip: Akiyo)
- MPEG-2 input stream: bit rate 1 Mbps, spatial resolution 352x240.
- H.264 bit stream obtained by the proposed method: bit rate 768 kbps.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream: bit rate 670 kbps.

Subjective quality of intra frames (test clip: Foreman)
- MPEG-2 input stream: bit rate 1 Mbps, spatial resolution 352x240.
- H.264 bit stream obtained by the proposed method: bit rate 1 Mbps.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream: bit rate 1 Mbps.

Inter frame coding

| Feature | MPEG-2 | H.264 |
|---|---|---|
| MC prediction with ¼ pel accuracy | No, only ½ pel accuracy | Yes |
| MC modes | 16x16 | 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 |
| Multiple reference prediction | No | Yes |
| Direct modes in B frames | No | Yes |
| Use of B frames as reference frames | No | Allowed, can be selected by the user |

Inter frame transcoding
[Block diagram. Blocks: MPEG-2 bit stream, VLD, pass parameters, inverse quantise, IDCT, MV refinement, sum residuals, hierarchical mode decisions, inter prediction, rate control, 4x4 integer transform, quantise, VLC/CABAC, H.264 bit stream, inverse VLC/CABAC, inverse quantise, motion compensate, store as reference frame.]

Inter frame transcoding features
- Motion vector extraction
- Motion vector refinement
- Motion vector reuse
- Hierarchical mode decision

Motion vector extraction
- Motion vectors can be extracted from the MPEG-2 bit stream after variable length decoding.

Need for motion vector refinement
- The quantization parameters of the incoming bit stream may differ from those selected for the output stream. When the differences are large, quality degrades.
- MPEG-2 supports certain modes in which no motion information is coded. Because H.264 supports finer motion estimation block sizes, refinement may reveal a small amount of motion.
- The decision to intra code macroblocks in a P frame is re-evaluated.
- Refinement improves the accuracy of the motion vectors and helps achieve compatibility between the ½ pel MV accuracy of MPEG-2 and the ¼ pel MV accuracy of H.264.
- It compensates for changes from field coding to frame coding and vice versa.

Motion vector refinement
- MPEG-2 motion vectors are refined over a one pixel window, i.e. dx = dy = 3 pixels, in the most recent reference frame in List 0.
- Half pixel and quarter pixel refinement is performed within the defined window.

Search window size (dx, dy) selection
- Before the window size was selected, increasing window sizes were tested to measure the effect of the search window size on the PSNR. The graph for one such test clip, Akiyo, is shown below.
- The PSNR obtained for a one pixel window closely approximated the steady state value, so a one pixel window provided a good tradeoff between complexity and PSNR.
[Figure: Frame window size test (clip Akiyo). x-axis: search window size (dx, dy), 0 to 12; y-axis: PSNR (dB), 42.575 to 42.615.]

Motion vector reuse

Hierarchical mode decisions
- Coding modes are compared and selected based on the sum of absolute differences (SAD).
- In the full mode decision method, every coding mode is evaluated, its SAD is computed, and the mode with the minimum SAD is selected as the best mode. Although this method gives the best results, it is very computationally intensive. For instance, each macroblock in a P frame would have to be evaluated for one 16x16, two 16x8, two 8x16, four 8x8, eight 8x4, eight 4x8 and sixteen 4x4 partitions, as well as the intra and skip modes.
- The hierarchical mode decision process exploits the fact that, once a mode and the next level of sub-partitioned modes have been evaluated, further sub-partitioning need not be evaluated if the sub-partitioning did not reduce the SAD.
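The early-termination idea can be sketched as follows. This is a simplified illustration, not the method from the slides. It assumes integer-pel refinement only (no half or quarter pel step), tiles the whole macroblock uniformly with one partition size per mode, and leaves out the intra and skip modes. All function names and the `HIERARCHY` table are hypothetical.

```python
def sad(cur, ref, x, y, w, h, mv):
    """SAD between the w x h block of `cur` at (x, y) and `ref` displaced by mv."""
    dx, dy = mv
    return sum(abs(cur[y + j][x + i] - ref[y + j + dy][x + i + dx])
               for j in range(h) for i in range(w))

def refine(cur, ref, x, y, w, h, seed, radius=1):
    """Best (SAD, mv) within +/- radius integer pixels of the seed vector."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = float("inf"), seed
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mvx, mvy = seed[0] + dx, seed[1] + dy
            # Skip candidates that would read outside the reference frame.
            if not (0 <= x + mvx and x + mvx + w <= width and
                    0 <= y + mvy and y + mvy + h <= height):
                continue
            cost = sad(cur, ref, x, y, w, h, (mvx, mvy))
            if cost < best_cost:
                best_cost, best_mv = cost, (mvx, mvy)
    return best_cost, best_mv

def mode_cost(cur, ref, mbx, mby, w, h, seed):
    """Total SAD when the 16x16 macroblock is tiled with w x h partitions."""
    return sum(refine(cur, ref, x, y, w, h, seed)[0]
               for y in range(mby, mby + 16, h)
               for x in range(mbx, mbx + 16, w))

# Assumed top-down order: each entry is one level of finer partitioning.
HIERARCHY = [[(16, 16)], [(16, 8), (8, 16)], [(8, 8)], [(8, 4), (4, 8)], [(4, 4)]]

def hierarchical_mode_decision(cur, ref, mbx, mby, seed):
    """Return (partition size, SAD, number of modes evaluated)."""
    best_mode, best_cost, evaluated = None, float("inf"), 0
    for level in HIERARCHY:
        costs = [(mode_cost(cur, ref, mbx, mby, w, h, seed), (w, h))
                 for (w, h) in level]
        evaluated += len(level)
        level_cost, level_mode = min(costs, key=lambda item: item[0])
        if level_cost >= best_cost:
            break  # splitting did not reduce the SAD: stop descending
        best_mode, best_cost = level_mode, level_cost
    return best_mode, best_cost, evaluated
```

Here `seed` stands for the motion vector taken from the MPEG-2 macroblock, in integer pixels. A macroblock that moves as a whole stops after three evaluated modes instead of seven, which is where the time saving comes from.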
Hierarchical mode decisions
The top-down splitting approach is shown below.
[Figure: top-down splitting approach.]

P frame transcoding results
[Figure: P frame overall motion estimation time (MET) comparison. x-axis: clip number (1 to 14); y-axis: MET (ms). Series: MET with motion vector reuse and hierarchical mode decisions; MET with full motion search.]

Motion vectors in P frames (test clip: Akiyo)
- MPEG-2 input stream (bit rate 1 Mbps, spatial resolution 352x240). Panel: MPEG-2 motion vectors.
- H.264 bit stream obtained by the proposed method (bit rate 768 kbps). Panel: H.264 motion vectors after transcoding.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream (bit rate 670 kbps). Panel: H.264 motion vectors after full motion search.

Mode decisions in P frames (test clip: Akiyo)
- MPEG-2 input stream (bit rate 1 Mbps, spatial resolution 352x240). Panel: 16x16 modes in the MPEG-2 bit stream.
- H.264 bit stream obtained by the proposed method (bit rate 768 kbps). Panel: 16x16 and sub-macroblock modes in the H.264 transcoded bit stream.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream (bit rate 670 kbps). Panel: 16x16 and sub-macroblock modes in the re-encoded H.264 bit stream.

B frame transcoding results
[Figure: Comparison of PSNR values for transcoding with and without the hierarchical mode decision for the test clip Akiyo (spatial resolution 352x240). x-axis: bit rate (kbps); y-axis: PSNR (dB). Series: proposed method; proposed method without hierarchical mode decision.]
[Figure: Comparison of execution time for the proposed method with and without the hierarchical mode decision for the test clip Akiyo (spatial resolution 352x240). x-axis: bit rate (kbps); y-axis: execution time (ms). Series: proposed method; proposed method without hierarchical mode decision.]
B frame transcoding results

Motion vectors in B frames (test clip: Akiyo)
- MPEG-2 input stream (bit rate 1 Mbps, spatial resolution 352x240). Panels: forward motion vectors in MPEG-2; backward motion vectors in MPEG-2.
- H.264 bit stream obtained by the proposed method (bit rate 768 kbps). Panels: backward motion vectors in the H.264 transcoded bit stream; forward motion vectors in the H.264 transcoded bit stream.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream (bit rate 670 kbps). Panels: forward motion vectors in the H.264 transcoded bit stream; backward motion vectors in the H.264 transcoded bit stream.

Mode decisions in B frames (test clip: Akiyo)
- MPEG-2 input stream (bit rate 1 Mbps, spatial resolution 352x240). Panel: 16x16 modes in the MPEG-2 bit stream.
- H.264 bit stream obtained by the proposed method (bit rate 768 kbps). Panel: 16x16 and sub 16x16 modes in the H.264 transcoded bit stream.
- H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream (bit rate 670 kbps). Panel: 16x16 and sub 16x16 modes in the H.264 re-encoded bit stream.

B frame transcoding results
[Figure: PSNR comparison for B frames. x-axis: test clip number; y-axis: PSNR (dB), 36 to 44. Series: proposed method PSNR; complete decoding and re-encoding PSNR.]
[Figure: Comparison of motion estimation time (MET) for B frames. x-axis: test clip number; y-axis: MET (ms), 0 to 9000. Series: proposed method MET (ms); complete decoding and re-encoding execution time.]

Comparison of the input MPEG-2 bit stream vs. the transcoded H.264 bit stream
The table below compares the PSNR of the input MPEG-2 bit stream with the PSNR of the transcoded H.264 bit stream obtained by transcoding 35 frames at 1 Mbps with the IBBPBBP… GOP structure.

Comparison of the proposed method with the DCT domain transcoder proposed by Chang and Messerschmitt [23]
The graph compares the proposed method with DCT domain transcoding [23] and with complete decoding and re-encoding of a 1 Mbps MPEG-2 bit stream (test clip Foreman) to an H.264 bit stream with an IBBPBBP… GOP structure at a constant bit rate.
[Figure: Comparison of the proposed method (PM), complete decoding and re-encoding (CDRE) and DCT domain transcoding (DDT). x-axis: frame number, 0 to 30; y-axis: PSNR (dB), 30 to 44.]
[Figure panels: proposed method transcoded stream; proposed method; full re-encoding.]

References
[1] J. Youn and M-T. Sun, “Motion Vector Refinement for high-performance transcoding”, in IEEE Int. Conf. Consumer Electronics, Los Angeles, CA, Vol. 1, Issue 1, pp. 30-40, March 1999.
[2] J. Xin, C-W. Lin and M-T. Sun, “Digital Video Transcoding”, Proceedings of the IEEE, Vol. 93, pp. 84-97, Jan. 2005.
[3] T. Wiegand et al., “Overview of the H.264/AVC Video Coding Standard”, IEEE Trans. CSVT, Vol. 13, pp. 560-576, July 2003.
[4] A. Vetro, C. Christopoulos and H. Sun, “Video transcoding architectures and techniques: an overview”, IEEE Signal Processing Magazine, Vol. 20, pp. 18-29, March 2003.
[5] H. Kalva, “Issues in H.264/MPEG-2 Video Transcoding”, IEEE Consumer Communications and Networking Conf., CCNC 2004, pp. 657-659, Jan. 2004.
[6] Information Technology-Generic coding of moving pictures and associated audio information: Video, ITU-T Rec. H.262 (2000 E).
[7] B. Haskell, A. Puri and A. Netravali, “Digital Video: an introduction to MPEG-2”, N.Y. Chapman and Hall, International Thomson Pub., 1997.
[8] G. Chen et al., “Efficient block size selection for MPEG-2 to H.264 transcoding”, Proceedings of the 12th annual ACM International Conference on Multimedia, pp. 300-303, Oct. 2004.
[9] MPEG-2 software (version 12) from MPEG software simulation group, http://www.mpeg.org/MPEG/MSSG/#source
[10] H.264 Software (JM9.5) from http://iphome.hhi.de/suehring/tml/download/jm94.zip
[11] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, Vol. 19, pp. 793-849, Oct. 2004.
[12] B. Shen and I. Sethi, “Direct feature extraction from compressed images”, SPIE: Vol. 2670 Storage and Retrieval for Image Databases IV, pp. 404-414, 1996.
[13] Commercially available transcoders, PSP Video 9, http://www.pspvideo9.com
[14] K.R. Rao and J. J. Hwang, “Techniques and Standards for Image, Video and Audio coding”, Upper Saddle River, N.J.: Prentice Hall, 1996.
[15] M. Ghanbari, “Video Coding: an introduction to standard codecs”, London, U.K.: Institution of Electrical Engineers, 1999.
[16] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next generation multimedia”, Chichester: Wiley, 2003.
[17] Test streams obtained from ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/525/ and http://www.cipr.rpi.edu/resource/sequences/sif.html
[18] Y-J. Chuang, Y-C. Huang and J-L. Wu, “An efficient block algorithm for splitting an 8x8 DCT into four 4x4 modified DCT used in AVC/H.264”, EURASIP 2005, pp. 311-316.
[19] P. Assunção and M. Ghanbari, “Post Processing of MPEG-2 coded video for transmission at lower bit rates”, Proc. IEEE ICASSP, pp. 1998-2001, Atlanta, GA, 1996.
[20] T. Shanableh and M. Ghanbari, “Transcoding Architectures for DCT domain heterogeneous video transcoding”, Proc. IEEE ICIP, Vol. 1, pp. 433-436, Thessaloniki, Greece, Sept. 2001.
[21] J. Xin, M.T. Sun and K. Chun, “Motion re-estimation for MPEG-2 to MPEG-4 simple profile transcoding”, Proc. Int. Workshop Packet Video, Pittsburgh, PA, Apr. 2002.
[22] D-Y. Chan, S-J. Lin and C-Y. Chang, “A rate control scheme using Kalman filtering for H.263”, Journal of Visual Communication and Image Representation, Vol. 16, pp. 734-748, Dec. 2005.
[23] S. Liu and A. Bovik, “Foveated embedded DCT domain video transcoding”, Journal of Visual Communication and Image Representation, Vol. 16, pp. 643-667, Dec. 2005.
[24] I. E. G. Richardson, “Video codec design: developing image and video compression systems”, Chichester: Wiley, 2002.
[25] G. Sullivan, T. Wiegand and A. Luthra, “Draft of Version 4 of H.264/AVC (ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 part 10) Advanced Video Coding)”, JVT Doc., 14th Meeting: Hong Kong, China, 18-21 Jan. 2005.
[26] G. F-Escribano et al., “Computational complexity reduction of intra frame prediction in MPEG2/H.264 video transcoders”, ICME, pp. 707-710, July 2005.
[27] I. Ahmad et al., “Video transcoding: an overview of various techniques and research issues”, IEEE Trans. on Multimedia, Vol. 7, pp. 793-804, Oct. 2005.
[28] S. Benyaminovich, O. Hadar and E. Kaminsky, “Optimal transrating via DCT coefficients modification and dropping”, ITRE, pp. 100-104, June 2005.
[29] J-R. Ohm, “Advances in scalable video coding”, Proc. IEEE, Vol. 93, pp. 42-56, Jan. 2005.
[30] J. Wang et al., “An AVS to MPEG-2 transcoding system”, Proc. of ISIMP, pp. 302-305, Oct. 2004.
[31] J. McVeigh et al., “A software based real-time MPEG-2 video encoder”, IEEE Trans. CSVT, Vol. 10, pp. 1178-1184, Oct. 2000.