Course Presentation
Multimedia Systems
Video III (Video Coding Standards)
Mahdi Amiri
April 2011
Sharif University of Technology

Video Coding Standards: Standardization Organizations
  Two organizations have dominated video compression standardization:
  ITU-T Video Coding Experts Group (VCEG)
    International Telecommunication Union, Telecommunication Standardization Sector (ITU-T, a United Nations organization, formerly CCITT), Study Group 16, Question 6
  ISO/IEC Moving Picture Experts Group (MPEG)
    International Organization for Standardization and International Electrotechnical Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11

Multimedia Systems, Spring 2011, Mahdi Amiri, Video III

Video Coding Standards: Dynamics
  VCEG is older and more focused on conventional (esp. low-delay) video coding goals (e.g. good compression and packet-loss/error resilience)
  MPEG is larger and takes on more ambitious goals (e.g. "object oriented video", "synthetic-natural hybrid coding", and digital cinema)
  Sometimes the major organizations team up (e.g. ISO, IEC and ITU teamed up for both MPEG-2 and JPEG)
  Relatively little industry consortium activity (DV, and organizations that tweak the video coding standards in minor ways, such as DVD, 3GPP, 3GPP2, SMPTE, IETF, etc.)
  Growing activity for internet streaming media outside of formal standardization (e.g. Microsoft, Real Networks, QuickTime)

Video Coding Standards
  Gary J. Sullivan, Ph.D.
  ITU-T VCEG Rapporteur | Chair
  ISO/IEC MPEG Video Rapporteur | Co-Chair
  ITU/ISO/IEC JVT Rapporteur | Co-Chair

Video Coding Standards: Moving Picture Experts Group (MPEG)
  A working group of ISO/IEC in charge of the development of standards for coded representation of digital audio and video and related data.
  Established in 1988 (23 years of activity)
  Number of independent standards: more than 125

Video Coding Standards
  MPEG-1: The standard on which such products as Video CD and MP3 are based
  MPEG-2: The standard on which such products as digital television set-top boxes and DVD are based
  MPEG-4: The standard for multimedia for the fixed and mobile web
  MPEG-7: The standard for description and search of audio and visual content
  MPEG-21: The Multimedia Framework
  MPEG-A: The standard providing application-specific formats by integrating multiple MPEG technologies
  MPEG-B: A collection of Systems-specific standards
  MPEG-C: A collection of Video-specific standards
  MPEG-D: A collection of Audio-specific standards
  MPEG-E: A standard (M3W) providing support for download and execution of multimedia applications
  MPEG-H: A standard (HEVC) providing significantly increased video compression performance
  MPEG-M: A standard (MXM) for packaging and reusability of MPEG technologies
  MPEG-U: A standard for rich-media user interfaces
  MPEG-V: A standard for interchange with virtual worlds

Video Coding Standards: Video Coding Experts Group (VCEG)
  Part of Study Group 16 (Multimedia coding, systems and applications) of the ITU-T.
  Established in 1984
  H.120: The first digital video coding standard.
  H.261: The first practical digital video coding standard.
  H.262: Identical in content to the video part of the ISO/IEC MPEG-2 standard.
  H.263: Provided a suitable replacement for H.261 at all bitrates.
  H.263v2: Also known as H.263+; enhanced robustness against data loss in the transmission channel.
  H.264: The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are technically identical.
  H.265: Not yet developed; expected 2012 or later.
  H.271: Video back channel messages for conveyance of status information and requests from a video receiver to a video sender.

Video Coding Standards: The Scope of Picture and Video Coding Standardization
  Video standards specifically do not define an encoder; rather, they define the output that an encoder should produce.
  A decoding method is defined in each standard (only the bitstream syntax and decoding process are standardized), e.g. use the IDCT, but not how to implement the IDCT
    Permits optimization beyond the obvious
    Permits complexity reduction for implementability
    Provides no guarantees of quality, only interoperability
  Ensuring interoperability: enabling communication between devices made by different manufacturers

Video Quality Evaluation: Subjective
  A human "subject" rates the video on a scale
  Double Stimulus Continuous Quality Scale (DSCQS) method
    Hidden scale of 0-100
    The difference between the ratings of the two stimuli is used as the actual rating

Video Quality Evaluation: Objective
  A computer algorithm judges the distortion between videos
  Attempts to model a human observer
  There is currently no standard method

Video Quality Evaluation: Objective Metrics: PSNR
  Peak Signal-to-Noise Ratio (PSNR)
  Used widely in evaluating coding performance
  Purely mathematical difference
  Can be tricked quite easily
  PSNR = 20 log10(255 / RMSE) [dB]
    RMSE: Root Mean Squared Error
    255 = 2^n - 1, where n is the number of bits per image sample (here n = 8)

Video Quality Evaluation: PSNR, Example
  [Figure: original image and coded versions at PSNR 35.4 dB and 29.0 dB]
  [Figure: original image and coded versions at PSNR 45.53 dB, 36.81 dB and 31.45 dB]
  Reminder: a CIF-size image is 352×288
  [Figure: R-D Performance of MPEG Codecs; PSNR (Y), 32 to 50, versus bit rate, 350 to 1050 kbps, for MPEG-1, MPEG-2, MPEG-4 and H.264]

Video Quality Evaluation: Objective Metrics: PSNR
  How to trick PSNR
    Take a natural image
    Give more bits to areas you look at more
    Give fewer bits to areas you look at less
    Subjective rating will be high, PSNR low
  [Figure: Original, Attention Map, Example Test (high subjective rating, low PSNR)]

Video Coding Standards: H.120, The First Digital Video Coding Standard
  ITU-T (ex-CCITT) Rec. H.120: 1984
  v1 (1984) had conditional replenishment, DPCM, scalar quantization, variable-length coding, switch for quincunx sampling
  v2 (1988) added motion compensation and background prediction
  Operated at 1.544 (NTSC) and 2.048 (PAL) Mbit/s
  Few units made, essentially not in use today
  Conditional replenishment: Can signal to leave a block area of the image unchanged, or replace it with new data (using a threshold value).
  Quincunx sampling: In a digital video system, a sampling structure in which alternate rows of pixel samples are displaced horizontally by half of the sample pitch of the remaining rows.
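
Conditional replenishment, as defined on the H.120 slide above, can be illustrated with a minimal sketch. This is not the H.120 bitstream or decision rule: the function name, the mean-absolute-difference test, the block size and the threshold value are all assumptions chosen only to show the idea of "leave the block unchanged, or replace it with new data".

```python
def conditional_replenishment(prev_frame, curr_frame, block=8, threshold=4.0):
    """Rebuild a frame by updating only the blocks that changed noticeably.

    Frames are lists of rows of luma samples. Returns the reconstructed
    frame and one flag per block (raster order): True means "block replaced".
    The change measure (mean absolute difference) is an illustrative choice.
    """
    height, width = len(curr_frame), len(curr_frame[0])
    recon = [row[:] for row in prev_frame]  # start from what the decoder already holds
    flags = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            ys = range(by, min(by + block, height))
            xs = range(bx, min(bx + block, width))
            diff = sum(abs(curr_frame[y][x] - prev_frame[y][x]) for y in ys for x in xs)
            changed = diff / (len(ys) * len(xs)) > threshold
            flags.append(changed)
            if changed:
                # Replenish: send new data for this block.
                for y in ys:
                    for x in xs:
                        recon[y][x] = curr_frame[y][x]
            # Otherwise only a "no change" signal is needed and the old block is kept.
    return recon, flags
```

With a static background, most flags come out False, so only a few blocks have to be transmitted per frame; small changes below the threshold are deliberately ignored.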

Video Coding Standards: H.261
  ITU-T, completed in 1990; the first widespread practical success
  Video telephony and teleconferencing over ISDN (Integrated Services Digital Network)
  Embodies the typical structure that dominates today
    Combination of DPCM and DCT
    Motion compensation
  p x 64 kbps (64-2048 kbps)
  Still in use, although mostly as a backward-compatibility feature; overtaken by H.263

Video Coding Standards: MPEG-1
  MPEG-1: "Coding of moving pictures and associated audio for digital storage media" (1992)
  Group of Pictures, Motion Estimation and Motion Compensation, Differential Coding, DCT, Quantization, Entropy Coding
  Video on digital storage media (CD-ROM)
  Target was VHS quality at 1.5 Mbit/s (at 352x240 resolution)
  Basis of Video CD
  MP3 (MPEG-1 Audio Layer 3)
    16 bits
    Sampling rate: 32, 44.1, or 48 kHz
    Bitrate: 32 to 320 kbps
    De facto: 44.1 kHz sample rate, 192 kbps bitrate

Video Coding Standards: MPEG-1
  Only supports progressive pictures
  Adds bi-directional motion prediction to the H.261 design
  Adds half-pixel motion estimation (see next slide)
  Slice-structured coding
  DC-only "D" pictures
  Superior quality to H.261 when operated at higher bit rates (> 1 Mbps for CIF 352x288 resolution)
  Now mostly overtaken by MPEG-2

Video Coding Standards: MPEG-1, Half-Pixel ME
  Half-pixel (coarse-fine) motion estimation algorithm
  1) Coarse step: Perform integer motion estimation on blocks; find the best integer-pixel MV
  2) Fine step: Refine the estimate to find the best half-pixel MV
     a) Spatially interpolate the selected region in the reference frame
     b) Compare the current block to the interpolated reference frame block
     c) Choose the integer or half-pixel offset that provides the best match
  Typically, bilinear interpolation is used for spatial interpolation
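
The coarse-fine steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MPEG-1 reference encoder: the function names, the SAD matching criterion, the full-search coarse step, the tie-break toward shorter vectors and the default block and search sizes are all choices made here. Motion vectors are kept in half-pixel units internally so that even values are integer-pixel offsets.

```python
def sample_half_pel(ref, y2, x2):
    """Reference sample at (y2/2, x2/2), bilinearly interpolated.

    Coordinates are in half-pixel units, so even values hit integer pixels.
    """
    y0, x0 = y2 // 2, x2 // 2
    y1, x1 = y0 + (y2 % 2), x0 + (x2 % 2)
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4.0


def block_sad(cur, ref, top, left, size, mv2):
    """Sum of absolute differences for a motion vector given in half-pixel units."""
    dy2, dx2 = mv2
    return sum(
        abs(cur[top + y][left + x]
            - sample_half_pel(ref, 2 * (top + y) + dy2, 2 * (left + x) + dx2))
        for y in range(size) for x in range(size)
    )


def inside(ref, top, left, size, mv2):
    """True if the displaced block, including interpolation neighbours, stays in the frame."""
    dy2, dx2 = mv2
    max_y2, max_x2 = 2 * (len(ref) - 1), 2 * (len(ref[0]) - 1)
    return (0 <= 2 * top + dy2 and 2 * (top + size - 1) + dy2 <= max_y2
            and 0 <= 2 * left + dx2 and 2 * (left + size - 1) + dx2 <= max_x2)


def best_vector(cur, ref, top, left, size, candidates):
    """Pick the candidate with the lowest SAD; ties go to the shorter vector."""
    valid = [mv for mv in candidates if inside(ref, top, left, size, mv)]
    return min(valid, key=lambda mv: (block_sad(cur, ref, top, left, size, mv),
                                      abs(mv[0]) + abs(mv[1])))


def half_pel_motion_search(cur, ref, top, left, size=4, search=2):
    """Return the (dy, dx) motion vector in pixels, with half-pixel precision."""
    # 1) Coarse step: full search over integer-pixel offsets (even half-pel values).
    coarse = [(2 * dy, 2 * dx)
              for dy in range(-search, search + 1)
              for dx in range(-search, search + 1)]
    by2, bx2 = best_vector(cur, ref, top, left, size, coarse)
    # 2) Fine step: compare the integer winner with its eight half-pixel neighbours.
    fine = [(by2 + dy2, bx2 + dx2) for dy2 in (-1, 0, 1) for dx2 in (-1, 0, 1)]
    fy2, fx2 = best_vector(cur, ref, top, left, size, fine)
    return fy2 / 2.0, fx2 / 2.0
```

Interpolating on the fly inside `sample_half_pel` keeps the sketch short; a practical encoder would more likely interpolate the selected reference region once, as step a) describes, and reuse it for all nine fine-step candidates.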

Video Coding Standards: MPEG-2
  MPEG-2: "Generic coding of moving pictures and associated audio"
  Broadcasting and storage
    Satellite TV, DVD, digital TV
  Ubiquity in hardware implies that it will be here for a long time
    Transition to HDTV has taken over 10 years and is not finished yet
  Different profiles and levels allow for quality control
  Bitrates: typically 4-9 Mbit/s (not especially useful below 4 Mbps; normal range of use 5-30 Mbps)

Video Coding Standards: MPEG-2
  Support for interlaced scan, various picture sampling formats, user-defined quantization matrix
  Essentially the same as MPEG-1 for progressive-scan pictures
  Various forms of scalability (SNR, spatial, temporal and hybrid)
    Base layer: basic quality requirement, for SDTV
    Enhanced layer: high quality service, for HDTV

Video Coding Standards: MPEG-2 Profiles and Levels
  Goal: To enable more efficient implementations for different applications (interoperability points)
  Profile: Subset of the tools applicable for a family of applications
  Level: Bounds on the complexity for any profile

Video Coding Standards: Bitrate allocation
  CBR (Constant Bit Rate)
    Streaming media uses this
    Easier to implement
  VBR (Variable Bit Rate)
    DVDs use this
    Usually requires 2-pass coding
    Allocates more bits to complex scenes
    This is worth it, because you assume that you encode once and decode many times

Video Coding Standards: MPEG Container Format
  A container format is a file format that can contain data compressed by standard codecs
  2 types for MPEG
    Program Stream (PS): Designed for reasonably reliable media, such as disks
    Transport Stream (TS): Designed for lossy links, such as networks or broadcast antennas

Video Coding Standards: MPEG-3 ?
  Originally developed for HDTV, but abandoned when MPEG-2 was determined to be sufficient

Video Coding Standards: H.263
  ITU-T Rec. H.263 (v1: 1995): The next generation of video coding performance, developed by ITU-T; the current premier ITU-T video standard (has overtaken H.261 as the dominant videoconferencing codec)
  Video telephony over PSTN (public switched telephone network)
  Wins by a factor of two at very low rates
  Version 2 (late 1997 / early 1998) and version 3 (2000) later developed with a large number of new features
    H.263+ and H.263++ (extensions to H.263)

Video Coding Standards: MPEG-4
  MPEG-4: "Coding of audio-visual objects"
  Started as a very low-bitrate project
  Contains the H.263 baseline design and adds many creative new extras:
    Coding of media objects (segmented coding of shapes)
    Bitrate: variable
    Synthetic/semi-synthetic objects
    XMT: like HTML, but to build videos
  First standard with Intellectual Property Management

Video Coding Standards: MPEG-4
  Part | Number | Title | Description
  Part 1 | ISO/IEC 14496-1 | Systems | Describes synchronization and multiplexing of video and audio, for example the transport stream.
  Part 2 | ISO/IEC 14496-2 | Visual | A compression codec for visual data (video, still textures, synthetic images, etc.). One of the many "profiles" in Part 2 is the Advanced Simple Profile (ASP).
  Part 3 | ISO/IEC 14496-3 | Audio | A set of compression codecs for perceptual coding of audio signals, including some variations of Advanced Audio Coding (AAC) as well as other audio/speech coding tools.
  Part 4 | ISO/IEC 14496-4 | Conformance | Describes procedures for testing conformance to other parts of the standard.
  Part 5 | ISO/IEC 14496-5 | Reference Software | Provides software for demonstrating and clarifying the other parts of the standard.
  Part 6 | ISO/IEC 14496-6 | Delivery Multimedia Integration Framework (DMIF) |
  Part 7 | ISO/IEC 14496-7 | Optimized Reference Software | Provides examples of how to make improved implementations (e.g., in relation to Part 5).
  Part 8 | ISO/IEC 14496-8 | Carriage on IP networks | Specifies a method to carry MPEG-4 content on IP networks.
  Part 9 | ISO/IEC 14496-9 | Reference Hardware | Provides hardware designs for demonstrating how to implement the other parts of the standard.
  Part 10 | ISO/IEC 14496-10 | Advanced Video Coding (AVC) | A codec for video signals which is technically identical to the ITU-T H.264 standard.
  Source: http://en.wikipedia.org/wiki/MPEG-4

Video Coding Standards: MPEG-4, Object Based Coding
  Extension of MPEG-1/2-type algorithms to code arbitrarily shaped objects
  Basic idea: Extend block-DCT and block-ME/MC prediction to code arbitrarily shaped objects
  [MPEG Committee]

Video Coding Standards: MPEG-4, Sprite Coding
  Sprite: Large background image
  Hypothesis: The same background exists for many frames; changes result from camera motion and occlusions
  One possible coding strategy:
    1. Code and transmit the entire sprite once
    2. Only transmit camera motion parameters for each subsequent frame
  Significant coding gain for some scenes
  [MPEG Committee]

Video Coding Standards: H.264 or MPEG-4 Part 10 (AVC)
  MPEG-4 Part 10: Advanced Video Coding / H.264
  Designed by a joint MPEG and ITU-T group
  Claims 50% bitrate savings over MPEG-2, 30% over MPEG-4!
  Bitrate: tens to hundreds of kb/s
  Variable Block Size, Multiple Reference Frames, Integer Transform, Intra Prediction, In-loop Deblocking Filtering, 1/4-pel Resolution Motion Estimation, ASO (Arbitrary Slice Ordering), FMO (Flexible Macroblock Ordering)
  Enhanced entropy coding
    CAVLC (Context Adaptive Variable Length Coding)
    CABAC (Context Adaptive Binary Arithmetic Coding)
  Increased complexity relative to prior standards

Video Coding Standards, H.264: Integer Transform
  [Figure: transform comparison; panels labeled "MPEG-4 AVC" and "MPEG-2, MPEG-4"]

Video Coding Standards, H.264: Variable Block Size
  A fixed block size may not be suitable for all moving objects
    Improves the flexibility of comparison
    Reduces the error of comparison
  7 types of blocks for selection
  [Figure: macroblock partitions 16x16, 16x8, 8x16, 8x8 and sub-macroblock partitions 8x8, 8x4, 4x8, 4x4]

Video Coding Standards, H.264: Variable Block Size
  [Figure: Residual (without MC) showing optimum choice of partitions]

Video Coding Standards, H.264: Multiple Reference Frames
  The neighboring frames are not the most similar in some cases
  A B-frame can be a reference frame
    A B-frame is close to the target frame in many situations

Video Coding Standards, H.264: Deblocking Filter
  There are severe blocking artifacts, caused by
    4x4 transforms and block-based motion compensation
  Results in bit rate savings of around 6-9%
  Improves subjective quality and PSNR of the decoded picture
  [Figure: Without Filter; With AVC Deblocking Filter]

Video Coding Standards, H.264: FMO (Flexible Macroblock Ordering)
  Slice (composed in FMO)
  Enhances robustness to data loss
  [Figure: Subdivision of a picture into slices when not using FMO]
  [Figure: Subdivision of a QCIF frame into slices when utilizing FMO]

H.264, Profiles
  [Figure; source: http://en.wikipedia.org/wiki/MPEG-4_AVC.htm]

Scalable Video Coding: Motivation
  Basic situation:
    1. Diverse receivers may request the same video
       Different bandwidths, spatial resolutions, frame rates, computational capabilities
    2. Heterogeneous networks and a priori unknown network conditions
       Wired and wireless links, time-varying bandwidths
  When you originally code the video you don't know which client or network situation will exist in the future
    There will probably be multiple different situations, each requiring a different compressed bitstream
    Need a different compressed video matched to each situation
  Possible solutions:
    1. Compress and store MANY different versions of the same video
    2. Real-time transcoding (e.g. decode/re-encode)
    3. Scalable coding

Scalable Video Coding: Type of Scalability
  [Figure: The basic types of scalability in video coding]

Scalable Video Coding: Temporal Scalability
  Based on the use of B-frames to refine the temporal resolution
    B-frames are dependent on other frames
    However, no other frame depends on a B-frame
    Each B-frame may be discarded without affecting other frames

Scalable Video Coding: Spatial Scalability
  Based on refining the spatial resolution
    Base layer is a low-resolution version of the video
    Enhancement layer (Enh1) contains the coded difference between the upsampled base layer and the original video
  Also called: pyramid coding

Scalable Video Coding: Quality Scalability
  Based on refining the amplitude resolution
    Base layer uses a coarse quantizer
    Enh1 applies a finer quantizer to the difference between the original DCT coefficients and the coarsely quantized base layer coefficients
  Also called: SNR scalability
  Note: Base and enhancement layers are at the same spatial resolution

Multimedia Systems
Video III (Video Coding Standards)
Thank You
Next Session: Multimedia Networks I
FIND OUT MORE AT...
  1. http://ce.sharif.edu/~m_amiri/
  2. http://www.dml.ir/