
Transcoding of an
MPEG-2 bit stream to an
H.264 bit stream
What is Transcoding?

Transcoding is the operation of converting video in one format to
another format.
Need: Compatibility between MPEG-2 and H.264
devices.
Applications: Adapting the bit rate of a compressed
stream to the channel bandwidth, changing the spatial
or temporal resolution of a compressed stream, etc.
Criteria considered in Heterogeneous
Transcoding



Quality of the transcoded stream should be comparable
to that obtained by complete decoding and re-encoding
with full motion search and to that of the initial input
stream.
The information in the input bit stream should be reused as much as possible to reduce multigenerational
degradation.
The computational cost and complexity should be kept
minimal.
MPEG-2 Decoder
[Block diagram: MPEG-2 bit stream -> Variable length decoding -> Inverse scan -> Inverse quantization -> Inverse DCT -> adder (+) -> Decoded pels. Other blocks: Frame store memory, Motion compensation.]
H.264 encoder
Transcoding Algorithm
Intra frame coding (MPEG-2 vs. H.264)
Macroblock modes supported
  MPEG-2: 8x8
  H.264: 16x16 with 4 directional modes, 4x4 with 9 directional modes
Type of intra prediction
  MPEG-2: Fixed prediction of the DC coefficient
  H.264: Adaptive directional prediction of 4x4 or 16x16 pixel blocks
Transform
  MPEG-2: 8x8 DCT
  H.264: 4x4 integer transform
Intra Frame Transcoding
[Block diagram. Blocks: MPEG-2 bit stream, VLC decode, Inverse quantize, IDCT, Mode decision, Spatial prediction, subtraction of the prediction (+/-), 4x4 integer transform, Quantize, Entropy coding, H.264 bit stream.]
Complexity of applying mode decisions in
transform domain

Example: Vertical
prediction
Predicted block =
Intra modes in H.264/AVC
Directional modes for an intra 4x4 macroblock
[Figure: directional prediction; directions labeled with mode numbers 0, 1, 3, 4, 5, 6, 7 and 8]
Directional modes for an intra 16x16 macroblock
Mode decision algorithm
Why use standard deviation?


Simple metric
Can be easily computed as the transform domain coefficients are already
available.
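The point about reusing transform coefficients can be illustrated with a minimal sketch. It assumes an orthonormal 2-D DCT, for which the block energy is preserved, so the pixel variance equals the energy of the AC coefficients divided by the number of pixels. The function names (`dct2`, `std_from_dct`, `std_from_pixels`) are hypothetical, and the sketch only shows how the metric can be obtained; it does not reproduce how the mode decision algorithm uses it.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def std_from_dct(coeffs):
    """Standard deviation of the pixel block, computed from its DCT coefficients.

    Only the AC coefficients are used: the DC coefficient carries the mean.
    """
    n = len(coeffs)
    ac_energy = sum(coeffs[u][v] ** 2
                    for u in range(n) for v in range(n)
                    if (u, v) != (0, 0))
    return math.sqrt(ac_energy / (n * n))


def std_from_pixels(block):
    """Reference computation of the standard deviation in the pixel domain."""
    values = [p for row in block for p in row]
    mean = sum(values) / len(values)
    return math.sqrt(sum((p - mean) ** 2 for p in values) / len(values))
```

Both functions return the same value for the same block, but `std_from_dct` needs no inverse transform when the coefficients have already been decoded.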
Post mode decision
Intra Frame transcoding results
Subjective Quality of Intra frames
MPEG-2 input stream
Test clip: Akiyo
Bit rate: 1 Mbps
Spatial resolution: 352x240
H.264 bit stream obtained by the proposed method
Bit rate: 768 Kbps
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 670 Kbps
Subjective Quality of Intra frames
MPEG-2 input stream
Test clip: Foreman
Bit rate: 1 Mbps
Spatial resolution: 352x240
H.264 bit stream obtained by the proposed method
Bit rate: 1 Mbps
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 1 Mbps
Inter frame coding (MPEG-2 vs. H.264)
MC prediction with ¼ pel accuracy
  MPEG-2: No, only ½ pel accuracy
  H.264: Yes
MC modes
  MPEG-2: 16x16
  H.264: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
Multiple reference prediction
  MPEG-2: No
  H.264: Yes
Direct modes in B frames
  MPEG-2: No
  H.264: Yes
Use of B frames as reference frames
  MPEG-2: No
  H.264: Allowed, can be selected by the user
Inter frame transcoding
[Block diagram. Blocks: MPEG-2 bit stream, VLD, Pass parameters, Inverse quantise, IDCT, MV refinement, Sum residuals, Hierarchical mode decisions, Inter prediction, 4x4 integer transform, Quantise, Rate control, VLC/CABAC, H.264 bit stream; reconstruction path: Inverse VLC/CABAC, Inverse quantise, Motion compensate, Store as reference frame.]
Inter frame transcoding
Features:
• Motion vector extraction
• Motion vector refinement
• Motion vector reuse
• Hierarchical mode decision
Motion vector extraction

Motion vectors can be extracted from the MPEG-2 bit stream
after variable length decoding.
Need for motion vector refinement
• The quantization parameters of the incoming bit stream may differ from those
selected for the output stream. When these differences are large, quality
degrades.
• MPEG-2 supports certain modes in which no motion information is coded.
However, since H.264 supports finer motion estimation block sizes, a
small amount of motion may be found upon refinement.
• Refinement allows re-evaluation of the decision to intra code macroblocks in a P frame.
• Refinement improves the accuracy of the motion vectors and helps achieve compatibility
between the ½ pel MV accuracy of MPEG-2 and the ¼ pel MV accuracy of H.264.
• Refinement compensates for field coding to frame coding changes and vice versa.
Motion vector refinement

MPEG-2 motion vectors are refined over a one pixel window, i.e. dx = dy = 3 pixels,
in the most recent reference frame in List 0.
Half pixel and quarter pixel refinement is performed within the defined
window.
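The refinement step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results: frames are plain lists of pixel rows, the one pixel window is read as offsets of -1, 0 and +1 around the MPEG-2 vector, only the integer-pel stage is shown (half and quarter pixel refinement needs interpolated samples and is omitted), and the names `sad`, `refine_mv` and `half_to_quarter_pel` are hypothetical.

```python
def half_to_quarter_pel(mv):
    """Express a motion vector given in 1/2 pel units in 1/4 pel units."""
    return (2 * mv[0], 2 * mv[1])


def sad(cur, ref, bx, by, mvx, mvy, size):
    """Sum of absolute differences between the block at (bx, by) in the
    current frame and the block displaced by (mvx, mvy) in the reference."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
    return total


def refine_mv(cur, ref, bx, by, mv, size=16, window=1):
    """Search the integer-pel positions within +/- window of the starting
    vector mv and return (best_vector, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = mv, None
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            mvx, mvy = mv[0] + dx, mv[1] + dy
            # Skip candidates whose reference block falls outside the frame.
            if bx + mvx < 0 or by + mvy < 0:
                continue
            if bx + mvx + size > width or by + mvy + size > height:
                continue
            cost = sad(cur, ref, bx, by, mvx, mvy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

With `window=1` only nine candidate positions are evaluated per block, which is what keeps the refinement much cheaper than a full motion search.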
Search window size (dx,dy) selection


Before window size selection, different
increasing window sizes were tested to
verify the effect of varying the search
window size on the PSNR.
The graph for one such test clip, Akiyo,
is as follows.
It was observed that the PSNR
obtained for a one pixel window closely
approximated the steady state value
and using a one pixel window provided
a good tradeoff between complexity
and the PSNR.
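For reference, the PSNR used as the quality measure in these tests can be computed as in the short sketch below. It assumes 8-bit samples (peak value 255) and frames given as lists of pixel rows; the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(peak * peak / mse)
```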
[Figure: Frame window size test (clip Akiyo). x-axis: search window size (dx, dy), 0 to 12; y-axis: PSNR (dB), 42.575 to 42.615]
Motion vector reuse
Hierarchical mode decisions

Coding modes are compared and selected based on the sum of absolute
differences (SAD) value. In the full mode decision method, every coding
mode is evaluated, its SAD value is computed, and the mode with the
minimum SAD value is selected as the best mode. Although this
method gives the best results, it is computationally very intensive. For
instance, each macroblock in a P frame would have to be evaluated for
one 16x16, two 16x8, two 8x16, four 8x8, eight 8x4, eight 4x8 and sixteen 4x4
partitions, as well as the intra and skip modes.
The hierarchical mode decision process makes use of the fact that, after evaluating
a mode and the next level of sub-partitioned modes, if sub-partitioning does
not reduce the SAD value then further sub-partitioning need not be
evaluated.
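The early-termination idea can be sketched as below. This is an illustration under assumptions rather than the exact procedure: the grouping of partition sizes into three levels (`LEVELS`) is chosen here for illustration, the SAD of each mode is supplied by a caller-provided function `sad_of` (hypothetical), and the intra and skip modes are left out.

```python
# Assumed top-down ordering of partition sizes, coarsest level first.
LEVELS = [
    ("16x16",),
    ("16x8", "8x16", "8x8"),
    ("8x4", "4x8", "4x4"),
]


def hierarchical_mode_decision(sad_of, levels=LEVELS):
    """Return (best_mode, best_sad, evaluated_modes).

    sad_of(mode) must return the SAD obtained when the macroblock is coded
    with the given partition mode. Levels are visited top down, and the
    search stops as soon as a level fails to reduce the best SAD so far.
    """
    evaluated = []
    best_mode, best_sad = None, None
    for level in levels:
        level_mode, level_sad = None, None
        for mode in level:
            cost = sad_of(mode)
            evaluated.append(mode)
            if level_sad is None or cost < level_sad:
                level_mode, level_sad = mode, cost
        if best_sad is not None and level_sad >= best_sad:
            break  # sub-partitioning did not help: skip finer levels
        best_mode, best_sad = level_mode, level_sad
    return best_mode, best_sad, evaluated
```

For example, calling `hierarchical_mode_decision(sads.get)` with a dictionary `sads` of per-mode SAD values never evaluates the 8x4, 4x8 and 4x4 modes when none of the second-level modes improves on 16x16.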
Hierarchical mode decisions

The top down splitting approach is shown
below
P frame transcoding Results
P frame results
[Figure: P frame overall motion estimation time (MET) comparison. x-axis: clip number, 1 to 14; y-axis: MET (ms), 0 to 5000. Series: MET with motion vector reuse and hierarchical mode decisions; MET with full motion search]
Motion vectors in P frames
MPEG-2 input stream
Test clip: Akiyo
Bit rate: 1 Mbps
Spatial resolution: 352x240
MPEG-2 motion vectors
H.264 bit stream obtained by the proposed method
Bit rate: 768 Kbps
H.264 motion vectors after transcoding
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 670 Kbps
H.264 motion vectors after full motion search
Mode decisions in P frames
MPEG-2 input stream
Test clip: Akiyo
Bit rate: 1 Mbps
Spatial resolution: 352x240
16x16 modes in the MPEG-2 bit stream
H.264 bit stream obtained by the proposed method
Bit rate: 768 Kbps
16x16 and sub-macroblock modes in the H.264 transcoded bit stream
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 670 Kbps
16x16 and sub-macroblock modes in the re-encoded H.264 bit stream
B frame transcoding results
[Figure: Comparison of PSNR values for transcoding with and without the hierarchical mode decision for the test clip Akiyo (spatial resolution: 352x240). x-axis: bit rate (kbps), 0 to 3000; y-axis: PSNR (dB), 0 to 60. Series: proposed method PSNR; proposed method without hierarchical mode decision PSNR]
[Figure: Comparison of execution time for the proposed method with and without hierarchical mode decision for the test clip Akiyo (spatial resolution: 352x240). x-axis: bit rate (kbps), 0 to 3000; y-axis: execution time (ms), 0 to 7000. Series: proposed method execution time; proposed method without hierarchical mode decision]
Motion vectors in B frames
MPEG-2 input stream
Test clip: Akiyo
Bit rate: 1 Mbps
Spatial resolution: 352x240
Forward motion vectors in MPEG-2
Backward motion vectors in MPEG-2
H.264 bit stream obtained by the proposed method
Bit rate: 768 Kbps
Forward motion vectors in the H.264 transcoded bit stream
Backward motion vectors in the H.264 transcoded bit stream
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 670 Kbps
Forward motion vectors in the H.264 re-encoded bit stream
Backward motion vectors in the H.264 re-encoded bit stream
Mode decisions in B frames
MPEG-2 input stream
Test clip: Akiyo
Bit rate: 1 Mbps
Spatial resolution: 352x240
16x16 modes in the MPEG-2 bit stream
H.264 bit stream obtained by the proposed method
Bit rate: 768 Kbps
16x16 and sub 16x16 modes in the H.264 transcoded bit stream
H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream
Bit rate: 670 Kbps
16x16 and sub 16x16 modes in the H.264 re-encoded bit stream
B frame transcoding results
[Figure: PSNR comparison for B frames. x-axis: test clip number, 0 to 8; y-axis: PSNR (dB), 36 to 44. Series: proposed method PSNR; complete decoding and re-encoding PSNR]
[Figure: Comparison of motion estimation time (MET) for B frames. x-axis: test clip number, 0 to 8; y-axis: MET (ms), 0 to 9000. Series: proposed method MET (ms); complete decoding and re-encoding execution time]
Comparison of the Input MPEG-2 bit stream
vs. the transcoded H.264 bit stream

The table below compares the PSNR of the input MPEG-2 bit stream
with the PSNR of the transcoded H.264 bit stream, obtained by transcoding 35 frames at 1 Mbps
with the IBBPBBP… GOP structure.
Comparison of the proposed method with the
DCT domain transcoder proposed by Chang
and Messerschmitt [23]
The graph shown compares the proposed method with DCT domain transcoding [23]
and complete decoding and re-encoding of a 1 Mbps MPEG-2 bit stream (test clip
Foreman) to an H.264 bit stream with an IBBPBBP… GOP structure at a constant bit
rate.
[Figure: Comparison of the proposed method (PM), complete decoding and re-encoding (CDRE) and DCT domain transcoding (DDT). x-axis: frame number, 0 to 30; y-axis: PSNR (dB), 30 to 44. Series: PM, CDRE, DDT]
Proposed method transcoded stream

Proposed method

Full re-encoding
References
[1] J. Youn and M-T. Sun , “Motion Vector Refinement for high-performance transcoding”, in IEEE Int. Conf.
Consumer Electronics, Los Angeles, CA, Vol. 1, Issue 1, pp. 30-40, March 1999.
[2] J. Xin, C-W. Lin and M-T. Sun, “Digital Video Transcoding” , Proceedings of the IEEE, Vol. 93, pp. 84-97, Jan.
2005.
[3] T. Wiegand et al., “Overview of the H.264/AVC Video Coding Standard”, IEEE Trans. CSVT, Vol. 13, pp. 560-576, July 2003.
[4] A. Vetro, C. Christopoulos and H. Sun, “Video transcoding architectures and techniques: an overview”, IEEE
Signal Processing Magazine, Vol. 20, pp. 18-29, March 2003.
[5] H. Kalva, “Issues in H.264/MPEG-2 Video Transcoding”, IEEE Consumer Communications and Networking
Conf., CCNC 2004, pp 657-659, Jan 2004.
[6] Information Technology-Generic coding of moving pictures and associated audio information: Video, ITU-T
Rec. H.262 (2000 E).
[7] B. Haskell, A. Puri and A. Netravali, “Digital Video: an introduction to MPEG-2”, N.Y. Chapman and Hall,
International Thomson Pub., 1997.
[8] G. Chen et al., “Efficient block size selection for MPEG-2 to H.264 transcoding”, Proceedings of the 12th
annual ACM International Conference on Multimedia, pp. 300-303, Oct. 2004.
[9] MPEG-2 software (version 12) from MPEG software simulation group,
http://www.mpeg.org/MPEG/MSSG/#source
[10] H.264 Software (JM9.5) from http://iphome.hhi.de/suehring/tml/download/jm94.zip
[11] A. Puri, X. Chen and A. Luthra, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal
processing: Image communication, Vol. 19, pp. 793-849, Oct. 2004.
[12] B. Shen and I. Sethi, “Direct feature extraction from compressed images”, SPIE: Vol. 2670 Storage and
Retrieval for Image Databases IV, pp. 404-414, 1996.
[13] Commercially available transcoders, PSP Video 9, http://www.pspvideo9.com
[14] K.R. Rao and J. J. Hwang, “Techniques and Standards for Image, Video and Audio coding”, Upper Saddle
River, N.J.: Prentice Hall, 1996.
[15] M. Ghanbari, “Video Coding: an introduction to standard codecs”, London, U.K.: Institution of Electrical
Engineers, 1999.
[16] I. E. G. Richardson, “H.264 and MPEG-4 video compression: video coding for next generation multimedia”,
Chichester: Wiley, 2003.
[17] Test streams obtained from ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/525/ and
http://www.cipr.rpi.edu/resource/sequences/sif.html
[18] Y-J. Chuang, Y-C. Huang and J-L Wu, “An efficient block algorithm for splitting an 8x8 DCT into four 4x4
modified DCT used in AVC/H.264”, EURASIP 2005, pp. 311-316.
[19] P. Assuncao and M. Ghanbari, “Post Processing of MPEG-2 coded video for transmission at lower bit rates”,
Proc. IEEE ICASSP, pp. 1998-2001, Atlanta, GA, 1996.
[20] T. Shanableh and M. Ghanbari, “Transcoding Architectures for DCT domain heterogeneous video
transcoding”, Proc. IEEE ICIP, Vol. 1, pp. 433-436, Thessaloniki, Greece, Sept. 2001.
[21] J. Xin, M.T. Sun and K. Chun, “Motion re-estimation for MPEG-2 to MPEG-4 simple profile transcoding”,
Proc. Int. Workshop Packet Video, Pittsburgh, PA, Apr. 2002.
[22] D-Y. Chan, S-J. Lin and C-Y. Chang, “A rate control scheme using Kalman filtering for H.263”, Journal of
Visual Communication and Image Representation, Vol. 16, pp. 734-748, Dec. 2005.
[23] S. Liu and A. Bovik, “Foveated embedded DCT domain video transcoding”, Journal of Visual Communication
and Image Representation, Vol. 16, pp. 643-667, Dec. 2005.
[24] I. E. G. Richardson, “Video codec design: developing image and video compression systems”, Chichester:
Wiley, 2002.
[25] G. Sullivan, T. Wiegand and A. Luthra, “Draft of Version 4 of H.264/AVC (ITU-T Recommendation H.264
and ISO/IEC 14496-10 (MPEG-4 part 10) Advanced Video Coding)”, JVT Doc., 14th Meeting: Hong Kong,
China 18-21 Jan. 2005.
[26] G. F-Escribano et al., “Computational complexity reduction of intra frame prediction in MPEG2/H.264 video
transcoders”, ICME, pp. 707-710, July 2005.
[27] I. Ahmad et al., “Video transcoding: an overview of various techniques and research issues”, IEEE Trans. on
multimedia, vol. 7, pp. 793-804, Oct. 2005.
[28] S. Benyaminovich, O. Hadar and E. Kaminsky, “Optimal transrating via DCT coefficients modification and
dropping”, ITRE, pp. 100-104, June 2005.
[29] J-R. Ohm, “Advances in scalable video coding”, Proc. IEEE, Vol. 93, pp. 42-56, Jan. 2005.
[30] J. Wang et al., “An AVS to MPEG-2 transcoding system”, Proc. of ISIMP, pp. 302-305, Oct. 2004.
[31] J. McVeigh et al., “A software based real-time MPEG-2 video encoder”, IEEE Trans. CSVT, Vol. 10, pp. 1178-1184, Oct. 2000.