Multi-view Video Coding
Hoda Roodaki

Multi-view/3D Video
• Multi-view and 3D video representations require multiple synchronized video signals that show the same scene from different viewpoints.
• The result is a huge amount of data that needs to be compressed efficiently.

Multi-View Video Coding (MVC)
• The Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardized an extension of H.264/MPEG-4 AVC that is referred to as Multi-view Video Coding (MVC).
• MVC provides a compact representation for multiple views of a video scene, such as stereo-paired video for 3-D viewing.
• The Stereo High profile of the MVC extension was introduced as the coding format for 3-D video with high-definition resolution.
• MVC is more challenging than single-view coding because the decoder output may contain more than one view and can consist of any combination of the views.

Multiview Scenarios and Applications
• Free-viewpoint video: the viewpoint can be changed interactively
    • Several candidate views exist for the viewer; one of them is selected as the target view
    • The decoder focuses on decoding the target view
    • Requires efficient switching between different views
• 3-D TV: more than one view is decoded and displayed simultaneously
    • Stereoscopic video
        • Classic stereo systems that require special-purpose glasses
        • Auto-stereoscopic displays that do not require glasses
    • 3-D video
        • Multiple actual or rendered views of the scene are presented to the viewer, e.g. using ‘virtual reality’ glasses or an advanced auto-stereoscopic display, so that the view changes with head movements and the viewer has the feeling of immersion in the 3-D scene
    • Requires parallel processing of different views and flexible stream adaptation
• Teleconference applications
    • Both interactivity and virtual reality
• Rendering of 3-D TV content or view synthesis
    • Depth information is needed
• 2-D TV and HDTV applications still dominate the market
    • MVC content should provide a way for those 2-D decoders to generate a display from an MVC bitstream

Standardization Requirements
• High compression efficiency
    • Huge amount of data in MVC
    • Enable inter-view prediction
    • Efficient memory management of decoded pictures
    • Significant gain compared to independent compression of each view
• Random access
    • Ensure that any image can be accessed, decoded, and displayed by starting the decoder at a random access point and decoding a relatively small quantity of data on which that image may depend
    • Insertion of intra-coded pictures
    • View-switching random access
• Typical MVC prediction structure [figure; labels: IDR, anchor]
• Scalability
    • The ability of a decoder to access only a portion of a bitstream while still being able to generate effective video output, e.g. at reduced temporal or spatial resolution
    • View scalability
    • Adaptation to user preference, network bandwidth, and decoder complexity
• Decoder resource consumption
    • A number of views are to be decoded and displayed
    • A decoder that is optimal in terms of memory and complexity is very important to make real-time decoding possible
• Parallel processing
    • In 3-D TV, multiple views need to be decoded simultaneously
    • Reduce computation time to achieve real-time decoding

Extending H.264/MPEG-4 AVC for Multiview
• Enabling Inter-View Prediction
    • Exploit both temporal and spatial (between-view) redundancy
    • Builds on the flexible reference picture management capabilities that had already been designed into H.264/MPEG-4 AVC
    • Decoded pictures from other views are made available in the reference picture lists for use by the inter-picture prediction processing
    • The MVC design does not allow the prediction of a picture in one view at a given time using a picture from another view at a different time
    • Inter-view prediction may be used for encoding the non-base views
• Profiles and Levels
    • Profiles
        • Determine the subset of coding tools that must be supported by conforming decoders
        • Based on the High profile of H.264/MPEG-4 AVC
        • Multiview High profile: supports multiple views
        • Stereo High profile: two views
    • Levels
        • Constraints on the bitstreams produced by MVC encoders, to establish bounds on the necessary decoder resources and complexity
            • Limit on the amount of frame memory required for the decoding of a bitstream
            • The maximum throughput in terms of macroblocks per second
            • Maximum picture size
            • Overall bitrate

3D Technologies Types
• Types of 3D Displays
    • Stereoscopic: provides a different image to the viewer's left and right eyes (generally the user has to wear special spectacles).
    • Autostereoscopic: uses optical components in the display, rather than worn by the user, to enable each eye to see a different image.

Stereoscopic
• Stereoscope
    • A stereoscope is composed of two pictures mounted next to each other and a set of lenses through which to view them. Each picture is taken from a slightly different viewpoint, with a separation that corresponds closely to the spacing of the eyes. The left picture represents what the left eye would see, and likewise for the right picture. When the pictures are observed through a special viewer, the pair of two-dimensional pictures merges into a single three-dimensional image.

Stereo Video
• Stereo video requires two views, one for each eye.
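
As a rough illustration of the inter-view prediction rule from the "Enabling Inter-View Prediction" slide, the sketch below lists which already-decoded pictures could serve as references for a given picture. The `Picture` type and the `candidate_references` function are hypothetical names chosen for this example, and treating view 0 as the base view is an assumption; real MVC reference picture list construction involves considerably more detail than is shown here.

```python
from typing import List, NamedTuple


class Picture(NamedTuple):
    view: int  # view index; view 0 is assumed to be the base view
    time: int  # time instance; equal values mean the same capture instant


def candidate_references(current: Picture, decoded: List[Picture]) -> List[Picture]:
    """Return decoded pictures that may be used to predict `current`.

    Temporal references: same view, different time instance.
    Inter-view references: different view, SAME time instance only,
    and only when `current` is not in the base view.
    """
    temporal = [
        p for p in decoded
        if p.view == current.view and p.time != current.time
    ]
    if current.view == 0:
        # Assumption: the base view uses no inter-view prediction.
        inter_view = []
    else:
        inter_view = [
            p for p in decoded
            if p.view != current.view and p.time == current.time
        ]
    return temporal + inter_view


# Three pictures are already decoded: (view 0, t=0), (view 1, t=0), (view 0, t=1).
decoded = [Picture(0, 0), Picture(1, 0), Picture(0, 1)]
refs = candidate_references(Picture(1, 1), decoded)
print(refs)
```

In this usage example, the picture from view 0 at time 0 is excluded for the current picture (view 1, time 1), because it belongs to another view at a different time instance.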

Multi-view Video Coding
[figure]

New Research Area
• Quality Assessment
    • Objective Quality Assessment
        • Example images: PSNR = 24.9 dB for all the images, but MSSIM = 0.9168, 0.7052, and 0.6949
        • Random “analog” noise: MSE = 27.10
        • Blocky “digital” noise: MSE = 21.26
        • Measures like MSE that are suitable for analog noise no longer work for digital noise
• Subjective Quality Assessment
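
The objective measures quoted in the quality-assessment slides, MSE and PSNR, can be sketched as follows. This is a minimal example assuming 8-bit grayscale images stored as lists of rows; the function names and the two tiny synthetic test images are made up for illustration and are not the images from the slides. It shows how a distortion spread over the whole image and a distortion concentrated in one block can receive identical MSE and PSNR scores, which is the limitation the slides point to.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized grayscale images."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum pixel value."""
    error = mse(ref, test)
    if error == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / error)


SIZE = 8
reference = [[128] * SIZE for _ in range(SIZE)]

# Distortion spread over every pixel: +4 / -4 in a checkerboard pattern.
spread = [
    [128 + (4 if (x + y) % 2 == 0 else -4) for x in range(SIZE)]
    for y in range(SIZE)
]

# Distortion concentrated in one block: +8 in the top-left 4x4 corner only.
blocky = [
    [128 + (8 if x < 4 and y < 4 else 0) for x in range(SIZE)]
    for y in range(SIZE)
]

print(mse(reference, spread), mse(reference, blocky))
print(psnr(reference, spread), psnr(reference, blocky))
```

Both distorted images have the same squared error per pixel on average, so MSE and PSNR cannot tell them apart even though the errors are distributed very differently.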