Lec13, Video III (Video Coding Standards), v1.00.pdf

Course Presentation
Multimedia Systems
Video III
(Video Coding Standards)
Mahdi Amiri
April 2011
Sharif University of Technology
Video Coding Standards
Standardization Organizations
Two organizations have dominated video compression
standardization
ITU-T Video Coding Experts Group (VCEG)
International Telecommunication Union, Telecommunication
Standardization Sector (ITU-T, a United Nations organization,
formerly CCITT), Study Group 16, Question 6
ISO/IEC Moving Picture Experts Group (MPEG)
International Organization for Standardization and International
Electrotechnical Commission, Joint Technical Committee Number
1, Subcommittee 29, Working Group 11
Page 1
Multimedia Systems, Spring 2011, Mahdi Amiri, Video III
Video Coding Standards
Dynamics
VCEG is older and more focused on conventional (esp. low-delay) video
coding goals (e.g. good compression and packet-loss/error resilience)
MPEG is larger and takes on more ambitious goals (e.g. “object oriented
video”, “synthetic-natural hybrid coding”, and digital cinema)
Sometimes the major organizations team up (e.g. ISO, IEC and ITU teamed
up for both MPEG-2 and JPEG)
Relatively little industry consortium activity (DV and organizations that
tweak the video coding standards in minor ways, such as DVD, 3GPP, 3GPP2,
SMPTE, IETF, etc.)
Growing activity for internet streaming media outside of formal
standardization (e.g., Microsoft, Real Networks, Quicktime)
Gary J. Sullivan, Ph.D.
ITU-T VCEG Rapporteur | Chair
ISO/IEC MPEG Video Rapporteur | Co-Chair
ITU/ISO/IEC JVT Rapporteur | Co-Chair
Video Coding Standards
Video Coding Standards
Moving Picture Experts Group (MPEG)
A working group of ISO/IEC in charge of the
development of standards for coded
representation of digital audio and video and
related data.
Established in 1988
23 years of activity
The number of independent standards: more than
125
Video Coding Standards
MPEG-1
The standard on which such products as Video CD and MP3 are based;
MPEG-2
The standard on which such products as Digital Television set top boxes and DVD are based;
MPEG-4
The standard for multimedia for the fixed and mobile web;
MPEG-7
The standard for description and search of audio and visual content;
MPEG-21
The Multimedia Framework;
MPEG-A
The standard providing application-specific formats by integrating multiple MPEG
technologies;
MPEG-B
A collection of Systems specific standards
MPEG-C
A collection of Video specific standards
MPEG-D
A collection of Audio specific standards
MPEG-E
A standard (M3W) providing support to download and execution of multimedia
applications
MPEG-H
A standard (HEVC) providing a significantly increased video compression performance
MPEG-M
A standard (MXM) for packaging and reusability of MPEG technologies
MPEG-U
A standard for rich-media user interface
MPEG-V
A standard for interchange with virtual worlds
Video Coding Standards
Video Coding Experts Group (VCEG)
Part of study group 16 (Multimedia coding, systems and
applications) of the ITU-T. Established in 1984
H.120
The first digital video coding standard
H.261
Was the first practical digital video coding standard.
H.262
It is identical in content to the video part of the ISO/IEC MPEG-2 standard.
H.263
Provided a suitable replacement for H.261 at all bitrates.
H.263v2
Also known as H.263+. Adds enhanced robustness against data loss in the transmission channel.
H.264
The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are technically identical.
H.265
Not yet developed; expected 2012 or later.
H.271
Video back channel messages for conveyance of status information and requests from a video
receiver to a video sender.
Video Coding Standards
The Scope of Picture and Video Coding Standardization
Video standards specifically do not define an encoder; rather, they
define the output that an encoder should produce.
A decoding method is defined in each standard (only the Bitstream
Syntax and Decoding Process are standardized):
e.g. use IDCT, but not how to implement the IDCT
Permits optimization beyond the obvious
Permits complexity reduction for implementability
Provides no guarantees of Quality - only interoperability
Ensuring interoperability:
Enabling communication between
devices made by different
manufacturers
Video Quality Evaluation
Subjective
A human “subject” rates the video on a scale
Double Stimulus Continuous Quality Scale (DSCQS) Method
Hidden scale of 0-100
The difference between the ratings given to the two sequences is used as
the actual rating
Video Quality Evaluation
Objective
A computer algorithm judges the distortion between
videos
Attempts to model a human observer
There is currently no standard method
Video Quality Evaluation
Objective Metrics: PSNR
Peak Signal-To-Noise Ratio (PSNR)
Used widely in evaluating coding performance
Purely mathematical difference
Can be tricked quite easily
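PSNR = 20 log10(255 / RMSE) [dB]
RMSE: Root Mean Squared Error between the original and the decoded samples
255 = 2^n - 1 (peak sample value)
n: the number of bits per image sample (n = 8 here)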
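The PSNR definition can be tried out on a few samples. The following is a minimal sketch, assuming images are given as flat lists of sample values; the function name psnr and the sample data are illustrative and not part of the lecture.

```python
import math

def psnr(original, decoded, bits_per_sample=8):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    peak = 2 ** bits_per_sample - 1              # 255 for 8-bit samples
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf                          # identical signals
    rmse = math.sqrt(mse)
    return 20 * math.log10(peak / rmse)

# Illustrative 8-bit samples (made-up values)
original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [53, 54, 61, 68, 69, 60, 65, 72]
print(round(psnr(original, decoded), 2), "dB")
```

Because only the sample-wise error enters the formula, two decoded images with the same RMSE get the same PSNR regardless of where in the picture the errors occur.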
Video Quality Evaluation
PSNR, Example
[Figure: original image compared with versions at PSNR 35.4 dB and 29.0 dB]
[Figure: original image compared with versions at PSNR 45.53 dB, 36.81 dB, and 31.45 dB]
Video Quality Evaluation
Reminder: CIF-size image, 352×288
Video Quality Evaluation
R-D Performance of MPEG Codecs
[Figure: rate-distortion curves for MPEG-1, MPEG-2, MPEG-4, and H.264; x-axis: Bit rate (kbps), 350 to 1050; y-axis: PSNR (Y), 32 to 50]
Video Quality Evaluation
Objective Metrics: PSNR
How to trick PSNR
Take a natural image
Give more bits to areas you look at more
Give fewer bits to areas you look at less
Subjective rating will be high, but PSNR will be low
[Figure: original image, attention map example, and test image (high subjective rating, low PSNR)]
Video Coding Standards
H.120
The First Digital Video Coding Standard
ITU-T (ex-CCITT) Rec. H.120: 1984
v1 (1984) had conditional replenishment, DPCM, scalar
quantization, variable-length coding, switch for quincunx sampling
v2 (1988) added motion compensation and background prediction
Operated at 1.544 (NTSC) and 2.048 (PAL) Mbits/s
Few units made, essentially not in use today
Conditional Replenishment: Can signal to leave a block area of the image
unchanged, or replace it with new data (using a threshold value).
Quincunx sampling: In a digital video system, a sampling structure with
an array of samples where alternate rows of pixel samples are displaced
horizontally in the grid by half of the pitch of the pixel samples along the
remaining rows.
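Conditional replenishment can be illustrated with a few lines of code. This is a minimal sketch, not the H.120 procedure itself: the block size, the threshold, and the use of the mean absolute difference as the change measure are all assumptions made for the example.

```python
def conditional_replenishment(previous, current, block=2, threshold=4.0):
    """Update only the blocks whose change exceeds the threshold.

    previous, current: frames as lists of rows of sample values.
    Returns (updated_frame, list of (row, col) corners of replaced blocks).
    """
    height, width = len(current), len(current[0])
    output = [row[:] for row in previous]        # start from the old picture
    replaced = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            coords = [(y, x)
                      for y in range(by, min(by + block, height))
                      for x in range(bx, min(bx + block, width))]
            # Assumed change measure: mean absolute difference of the block
            mad = sum(abs(current[y][x] - previous[y][x])
                      for y, x in coords) / len(coords)
            if mad > threshold:                  # "replace with new data"
                for y, x in coords:
                    output[y][x] = current[y][x]
                replaced.append((by, bx))
            # otherwise: "leave the block unchanged" (nothing is sent)
    return output, replaced

previous = [[10] * 4 for _ in range(4)]
current = [row[:] for row in previous]
current[0][0] = 12                               # small change, below threshold
for y in (2, 3):
    for x in (2, 3):
        current[y][x] = 50                       # large change in one block
updated, replaced = conditional_replenishment(previous, current)
print(replaced)
```

Only the bottom-right block is replenished in this example; the small change in the top-left block stays below the threshold and is ignored.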
Video Coding Standards
H.261
ITU-T, completed in 1990, The first widespread
practical success
Video telephony and teleconferencing over ISDN
(Integrated Services Digital Network)
Embodying typical structure that dominates today
Combination of DPCM and DCT
Motion Compensation
p x 64kbps (64-2048 kbps)
Still in use, although mostly as a backward-compatibility
feature – overtaken by H.263
Video Coding Standards
MPEG-1
MPEG-1: “Coding of moving pictures and associated audio for
digital storage media” (1992)
Group of Pictures, Motion Estimation and Motion Compensation,
Differential Coding, DCT, Quantization, Entropy Coding
Video on digital storage media (CD-ROM)
Target was VHS quality at 1.5 Mbit/s (at 352×240 resolution)
Basis of Video-CD
MP3 (MPEG-1 Layer 3)
16 bits
Sampling rate - 32, 44.1, or 48 kHz
Bitrate – 32 to 320 kbps
De facto - 44.1 kHz sample rate, 192 kbps bitrate
Video Coding Standards
MPEG-1
Only supports progressive pictures
Adds bi-directional motion prediction to H.261 design
Adds half-pixel motion estimation (See next slide)
Slice-structured coding
DC-only “D” pictures
Superior quality to H.261 when operated at higher bit
rates (> 1 Mbps for CIF 352×288 resolution)
Now mostly overtaken by MPEG-2
Video Coding Standards
MPEG-1, Half-Pixel ME
Half-Pixel (coarse-fine) Motion Estimation Algorithm
1) Coarse step: Perform integer motion estimation on blocks; find best integer-pixel MV
2) Fine step: Refine estimate to find best half-pixel MV
a) Spatially interpolate the selected region in reference frame
b) Compare current block to interpolated reference frame block
c) Choose the integer or half-pixel offset that provides best match
Typically, bilinear
interpolation is used for
spatial interpolation
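The coarse-fine steps above can be sketched as follows. This is an illustrative implementation under stated assumptions: frames are lists of rows, blocks are square, the matching criterion is the sum of absolute differences (SAD), and the names sample_half, sad, and half_pel_search are hypothetical. The test image is synthetic and chosen so that the true displacement is unambiguous.

```python
import math

def sample_half(ref, y2, x2):
    """Bilinear sample of ref at (y2/2, x2/2); coordinates in half-pixel units."""
    y0, fy = divmod(y2, 2)
    x0, fx = divmod(x2, 2)
    return (ref[y0][x0] + ref[y0][x0 + fx]
            + ref[y0 + fy][x0] + ref[y0 + fy][x0 + fx]) / 4.0

def sad(block, ref, top2, left2):
    """SAD between a square block and the reference region at (top2/2, left2/2)."""
    n = len(block)
    last_y2, last_x2 = top2 + 2 * (n - 1), left2 + 2 * (n - 1)
    if top2 < 0 or left2 < 0:
        return math.inf
    if (last_y2 + 1) // 2 >= len(ref) or (last_x2 + 1) // 2 >= len(ref[0]):
        return math.inf                          # candidate leaves the frame
    return sum(abs(block[i][j] - sample_half(ref, top2 + 2 * i, left2 + 2 * j))
               for i in range(n) for j in range(n))

def half_pel_search(block, ref, top, left, search=2):
    """Return ((dy, dx) in pixels, SAD) for the block located at (top, left)."""
    # 1) Coarse step: best integer-pixel MV within +/- search pixels
    best = (math.inf, 0, 0)                      # (cost, dy, dx) in half-pel units
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = sad(block, ref, 2 * (top + dy), 2 * (left + dx))
            if cost < best[0]:
                best = (cost, 2 * dy, 2 * dx)
    # 2) Fine step: test the half-pel offsets around the integer winner
    _, cy2, cx2 = best
    for hy in (-1, 0, 1):
        for hx in (-1, 0, 1):
            cost = sad(block, ref, 2 * top + cy2 + hy, 2 * left + cx2 + hx)
            if cost < best[0]:
                best = (cost, cy2 + hy, cx2 + hx)
    cost, my2, mx2 = best
    return (my2 / 2, mx2 / 2), cost

# Synthetic reference frame; the current block is the reference content
# displaced by 0 pixels vertically and 1.5 pixels horizontally.
ref = [[100 * y + x * x for x in range(12)] for y in range(12)]
block = [[sample_half(ref, 2 * (4 + i), 2 * (4 + j) + 3) for j in range(4)]
         for i in range(4)]
mv, cost = half_pel_search(block, ref, top=4, left=4)
print(mv, cost)
```

The fine step only examines the neighbourhood of the integer winner, so it costs 9 extra block comparisons instead of a full search on the interpolated frame.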
Video Coding Standards
MPEG-2
MPEG-2: “Generic coding of Moving Pictures and
Associated Audio”
Broadcasting and storage
Satellite TV, DVD, Digital TV
Ubiquity in hardware implies that it will be here for a
long time
Transition to HDTV has taken over 10 years and is not finished yet
Different profiles and levels allow for quality control
Bitrates: Typ. 4-9 MBits/s (Not especially useful below 4 Mbps,
normal range of use 5-30 Mbps)
Video Coding Standards
MPEG-2
Support for interlaced scan, various picture
sampling formats, user defined quantization
matrix
Essentially the same as MPEG-1 for progressive-scan pictures
Various forms of scalability (SNR, Spatial,
Temporal and hybrid)
Base Layer: Basic quality requirement, For SDTV
Enhanced Layer: High quality service, For HDTV
Video Coding Standards
MPEG-2 Profiles and Levels
Goal: To enable more efficient implementations for different
applications (interoperability points)
Profile: Subset of the tools applicable for a family of applications
Level: Bounds on the complexity for any profile
Video Coding Standards
Bitrate allocation
CBR – Constant BitRate
Streaming media uses this
Easier to implement
VBR – Variable BitRate
DVDs use this
Usually requires 2-pass coding
Allocate more bits for complex scenes
This is worth it, because you assume that you encode
once, decode many times
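The difference between the two allocation strategies can be sketched in a few lines. This is a simplified illustration, assuming a first pass has already produced one complexity score per scene; the scores, the bit budget, and the function names are hypothetical, and real rate control also has to respect buffer and peak-rate limits.

```python
def allocate_bits_cbr(scene_count, total_bits):
    """Constant bitrate: every scene gets the same share of the budget."""
    return [total_bits / scene_count] * scene_count

def allocate_bits_vbr(complexities, total_bits):
    """Two-pass VBR: split the budget in proportion to first-pass complexity."""
    total_complexity = sum(complexities)
    if total_complexity <= 0:
        raise ValueError("complexities must sum to a positive value")
    return [total_bits * c / total_complexity for c in complexities]

# Pass 1 (assumed already done): one complexity score per scene
complexities = [1.0, 3.0, 1.0, 5.0]
total_bits = 10_000

cbr = allocate_bits_cbr(len(complexities), total_bits)
vbr = allocate_bits_vbr(complexities, total_bits)   # pass 2 uses these targets
print(cbr)
print(vbr)
```

Both strategies spend the same total, but VBR moves bits from the simple scenes to the complex ones.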
Video Coding Standards
MPEG Container Format
Container format is a file format that can
contain data compressed by standard codecs
2 types for MPEG
Program Stream (PS) – Designed for
reasonably reliable media, such as disks
Transport Stream (TS) – Designed for lossy
links, such as networks or broadcast antennas
Video Coding Standards
MPEG-3 ?
Originally developed for HDTV, but abandoned when
MPEG-2 was determined to be sufficient
Video Coding Standards
H.263
ITU-T Rec. H.263 (v1: 1995): The next generation of
video coding performance, developed by ITU-T – the
current premier ITU-T video standard (has overtaken
H.261 as dominant videoconferencing codec)
Video telephony over PSTN (public switched telephone
network)
Wins by a factor of two at very low rates
Version 2 (late 1997 / early 1998) & version 3 (2000)
later developed with a large number of new features
H.263+ & H.263++ (Extensions to H.263)
Video Coding Standards
MPEG-4
MPEG-4: “Coding of audio-visual objects”
Started as very low-bitrate project
Contains the H.263 baseline design and adds many
creative new extras:
Coding of media objects (Segmented coding of shapes)
Bitrate: variable
Synthetic/Semi-synthetic objects
XMT: Like HTML, but to build videos
First standard with Intellectual Property Management
Video Coding Standards
MPEG-4
Part | Number | Title | Description
Part 1 | ISO/IEC 14496-1 | Systems | Describes synchronization and multiplexing of video and audio. For example Transport stream.
Part 2 | ISO/IEC 14496-2 | Visual | A compression codec for visual data (video, still textures, synthetic images, etc.). One of the many "profiles" in Part 2 is the Advanced Simple Profile (ASP).
Part 3 | ISO/IEC 14496-3 | Audio | A set of compression codecs for perceptual coding of audio signals, including some variations of Advanced Audio Coding (AAC) as well as other audio/speech coding tools.
Part 4 | ISO/IEC 14496-4 | Conformance | Describes procedures for testing conformance to other parts of the standard.
Part 5 | ISO/IEC 14496-5 | Reference Software | Provides software for demonstrating and clarifying the other parts of the standard.
Part 6 | ISO/IEC 14496-6 | Delivery Multimedia Integration Framework (DMIF) |
Part 7 | ISO/IEC 14496-7 | Optimized Reference Software | Provides examples of how to make improved implementations (e.g., in relation to Part 5).
Part 8 | ISO/IEC 14496-8 | Carriage on IP networks | Specifies a method to carry MPEG-4 content on IP networks.
Part 9 | ISO/IEC 14496-9 | Reference Hardware | Provides hardware designs for demonstrating how to implement the other parts of the standard.
Part 10 | ISO/IEC 14496-10 | Advanced Video Coding (AVC) | A codec for video signals which is technically identical to the ITU-T H.264 standard.
Source: http://en.wikipedia.org/wiki/MPEG-4
Video Coding Standards
MPEG-4, Object Based Coding
Extension of MPEG-1/2-type algorithms to code arbitrarily shaped objects
Basic Idea: Extend Block-DCT and Block-ME/MC-prediction to code arbitrarily shaped objects
[MPEG Committee]
Video Coding Standards
MPEG-4, Sprite Coding
Sprite: Large background image
Hypothesis: Same background exists for many
frames, changes resulting from camera motion and
occlusions
One possible coding strategy:
1. Code & transmit entire sprite once
2. Only transmit camera motion parameters for each
subsequent frame
Significant coding gain for some scenes
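The coding strategy above can be sketched for the simplest possible camera model. This is an illustration only: it assumes pure translation (a pan), so the per-frame motion parameters are just a window offset into the sprite, whereas real sprite coding uses more general warping. The function name and the data are hypothetical.

```python
def frame_from_sprite(sprite, offset_y, offset_x, height, width):
    """Cut the visible window out of the sprite for a translation-only camera."""
    if (offset_y < 0 or offset_x < 0
            or offset_y + height > len(sprite)
            or offset_x + width > len(sprite[0])):
        raise ValueError("window falls outside the sprite")
    return [row[offset_x:offset_x + width]
            for row in sprite[offset_y:offset_y + height]]

# 1. The entire sprite (large background image) is transmitted once
sprite = [[10 * y + x for x in range(8)] for y in range(4)]

# 2. Per frame, only the camera motion parameters are transmitted
camera_path = [(0, 0), (0, 2), (1, 4)]          # (offset_y, offset_x) per frame

frames = [frame_from_sprite(sprite, oy, ox, height=2, width=3)
          for oy, ox in camera_path]
for frame in frames:
    print(frame)
```

After the sprite has been sent, each background frame costs two numbers here instead of a full coded picture, which is where the coding gain comes from.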
Video Coding Standards
MPEG-4, Sprite Coding
[MPEG Committee]
Video Coding Standards
H.264 or MPEG-4 Part 10 (AVC)
MPEG-4 Part 10: Advanced Video Coding / H.264
Designed by a Joint MPEG and ITU-T group
Claims 50% bitrate savings over MPEG-2 and 30% over MPEG-4
Bitrate: tens to hundreds of kb/s
Variable Block Size, Multiple Reference Frames, Integer Transform,
Intra Prediction, In-loop Deblocking Filtering, 1/4-pel Resolution
Motion Estimation, ASO (Arbitrary Slice Ordering), FMO (Flexible
Macroblock Ordering)
Enhanced entropy coding
CAVLC (Context Adaptive Variable Length Coding)
CABAC (Context Adaptive Binary Arithmetic Coding)
Increased complexity relative to prior standards
Video Coding Standards, H.264
Integer Transform
MPEG-4 AVC
MPEG-2, MPEG-4
Video Coding Standards, H.264
Variable Block Size
The fixed block size may not be suitable for all motion objects
Improve the flexibility of comparison
Reduce the error of comparison
7 types of blocks for selection
[Figure: macroblock partitions 16×16, 16×8, 8×16, 8×8; sub-macroblock partitions 8×8, 8×4, 4×8, 4×4]
Video Coding Standards, H.264
Variable Block Size
Residual (without MC) showing optimum choice of partitions
Video Coding Standards, H.264
Multiple Reference Frames
The neighboring frames are not the most similar in some cases
A B-frame can be used as a reference frame
B-frame is close to the target frame in many situations
Video Coding Standards, H.264
Deblocking Filter
4×4 transforms and block-based motion compensation can cause
severe blocking artifacts
The deblocking filter results in bit rate savings of around 6~9%
Improves subjective quality and PSNR of the decoded picture
[Figure: decoded picture without filter and with AVC deblocking filter]
Video Coding Standards, H.264
FMO (Flexible Macroblock Ordering)
Slices composed with FMO enhance robustness to data loss
[Figure: subdivision of a picture into slices when not using FMO, and subdivision of a QCIF frame into slices when utilizing FMO]
H.264, Profiles
http://en.wikipedia.org/wiki/MPEG-4_AVC.htm
Scalable Video Coding
Motivation
Basic situation:
1. Diverse receivers may request the same video
Different bandwidths, spatial resolutions, frame rates, computational capabilities
2. Heterogeneous networks and a priori unknown network conditions
Wired and wireless links, time-varying bandwidths
When you originally code the video you don’t know which client or network situation
will exist in the future
Probably have multiple different situations, each requiring a different compressed
bitstream
Need a different compressed video matched to each situation
Possible solutions:
1. Compress & store MANY different versions of the same video
2. Real-time transcoding (e.g. decode/re-encode)
3. Scalable coding
Scalable Video Coding
Type of Scalability
The basic types of scalability in video coding
Scalable Video Coding
Temporal Scalability
Based on the use of B-frames to refine the temporal resolution
B-frames are dependent on other frames
However, no other frame depends on a B-frame
Each B-frame may be discarded without affecting other frames
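Since no other frame depends on a B-frame, the temporal base layer is obtained simply by dropping them. A minimal sketch, assuming an illustrative frame pattern and a 30 frames/s source (both are made up for the example):

```python
def temporal_base_layer(frame_types):
    """Return the indices of the frames that remain after dropping all B-frames."""
    return [i for i, t in enumerate(frame_types) if t != "B"]

frame_types = list("IBBPBBPBBP")                 # illustrative frame pattern
kept = temporal_base_layer(frame_types)

full_rate = 30.0                                 # frames/s, assumed
base_rate = full_rate * len(kept) / len(frame_types)
print(kept, base_rate)
```

The I- and P-frames that remain still decode correctly, because none of them used a discarded B-frame as a reference.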
Scalable Video Coding
Spatial Scalability
Based on refining the spatial resolution
Base layer is low resolution version of video
Enhancement layer (Enh1) contains the coded difference between the
upsampled base layer and the original video
Also called: Pyramid coding
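The two-layer structure can be sketched on a one-dimensional signal. This is an illustration under simplifying assumptions: downsampling averages sample pairs, upsampling repeats samples, and neither layer is quantized, so base plus enhancement reconstructs the original exactly. All function names are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring sample pairs."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample(signal):
    """Double the resolution by repeating every sample."""
    return [v for v in signal for _ in range(2)]

def encode_spatial_layers(original):
    """Base layer: low-resolution version. Enh1: original minus upsampled base."""
    base = downsample(original)
    enhancement = [o - u for o, u in zip(original, upsample(base))]
    return base, enhancement

def decode_spatial_layers(base, enhancement=None):
    """Base only gives low resolution; base + Enh1 gives full resolution."""
    if enhancement is None:
        return base
    return [u + e for u, e in zip(upsample(base), enhancement)]

original = [10, 12, 20, 24, 30, 30, 8, 4]
base, enh1 = encode_spatial_layers(original)
print(decode_spatial_layers(base))               # low-resolution receiver
print(decode_spatial_layers(base, enh1))         # full-resolution receiver
```

A receiver with limited bandwidth or a small display decodes only the base layer; the others add Enh1 on top of the same base bitstream.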
Scalable Video Coding
Quality Scalability
Based on refining the amplitude resolution
Base layer uses a coarse quantizer
Enh1 applies a finer quantizer to the difference between the original DCT
coefficients and the coarsely quantized base layer coefficients
Also called: SNR Scalability
Note: Base & enhancement layers
are at the same spatial resolution
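The two quantizers can be sketched on a handful of coefficients. This is an illustration only: the uniform quantizer, the step sizes 16 and 4, and the coefficient values are assumptions for the example, not values from any standard.

```python
def quantize(value, step):
    """Uniform quantizer: map a value to the nearest multiple of step (as a level)."""
    return round(value / step)

def encode_snr_layers(coefficients, coarse_step=16, fine_step=4):
    """Base layer: coarse levels. Enh1: finely quantized base-layer error."""
    base_levels = [quantize(c, coarse_step) for c in coefficients]
    base_recon = [level * coarse_step for level in base_levels]
    enh_levels = [quantize(c - r, fine_step)
                  for c, r in zip(coefficients, base_recon)]
    return base_levels, enh_levels

def decode_snr_layers(base_levels, enh_levels=None, coarse_step=16, fine_step=4):
    """Reconstruct from the base layer alone or from base + Enh1."""
    recon = [level * coarse_step for level in base_levels]
    if enh_levels is not None:
        recon = [r + e * fine_step for r, e in zip(recon, enh_levels)]
    return recon

coefficients = [100, -37, 12, 5, -3, 0]          # made-up DCT coefficients
base_levels, enh_levels = encode_snr_layers(coefficients)
base_only = decode_snr_layers(base_levels)
refined = decode_snr_layers(base_levels, enh_levels)

def total_error(recon):
    return sum(abs(c - r) for c, r in zip(coefficients, recon))

print(base_only, total_error(base_only))
print(refined, total_error(refined))
```

Both reconstructions have the same number of coefficients, so the spatial resolution is unchanged; only the amplitude error shrinks when Enh1 is added.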
Multimedia Systems
Video III (Video Coding Standards)
Thank You
Next Session: Multimedia Networks I
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/