Multi-view Video Coding

Hoda Roodaki
Multi-view/3D Video
• Multi-view and 3D video representations require multiple
synchronized video signals that show the same scenery from different
viewpoints.
• This results in a huge amount of data that needs to be compressed efficiently.
Multi-View Video Coding (MVC)
• The Joint Video Team (JVT) of the ITU-T Video Coding Experts Group (VCEG) and
the ISO/IEC Moving Picture Experts Group (MPEG) standardized an
extension of H.264/MPEG-4 AVC that is referred to as Multi-view Video
Coding (MVC)
• MVC provides a compact representation for multiple views of a video scene,
such as stereo-paired video for 3-D viewing.
• The stereo high profile of the MVC extension was introduced as the coding
format for 3-D video with high-definition resolution
• MVC decoding is more challenging than single-view decoding, as the decoder output
may contain more than one view and can consist of any combination of the views
Multiview Scenarios and Applications
• Free-viewpoint video: the viewpoint can be changed interactively
• Several candidate views exist for the viewer; one of them is selected as
the target view
• The decoder focuses on decoding the target view
→ Efficient switching between different views
• 3-D TV: more than one view is decoded and displayed simultaneously
• Stereoscopic video
• Classic stereo systems that require special-purpose glasses
• Auto-stereoscopic displays that do not require glasses
• 3-D video
• Multiple actual or rendered views of the scene are presented to the viewer, e.g.
using ‘virtual reality’ glasses or an advanced auto-stereoscopic display, so that
the view changes with head movements and the viewer has the feeling of immersion
in the 3-D scene
→ Parallel processing of different views and flexible stream adaptation
Multiview Scenarios and Applications
• Teleconference applications
• Both interactivity and virtual reality
• Rendering of 3-D TV content or view synthesis
• Depth information is needed
• 2-D TV and HDTV applications still dominate the market
→ MVC should provide a way for those 2-D decoders to generate a display
from an MVC bitstream
Standardization Requirements
• High compression efficiency
• Huge amount of data in MVC
• Enable inter-view prediction
• Efficient memory management of decoded pictures
• Significant gain compared to independent compression of each view
• Random access
• Ensure that any image can be accessed, decoded, and displayed by starting
the decoder at a random access point and decoding a relatively small
quantity of data on which that image may depend
• Insertion of intra-coded pictures
• View-switching random access
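As a rough illustration of the random access requirement above, the sketch below finds the closest random access point at or before a requested picture and counts how many pictures are decoded to reach it. The function name `decode_span`, the list of access point positions, and the spacing of 8 pictures are hypothetical. The sketch also assumes, pessimistically, that every picture between the access point and the target is decoded; a real decoder may skip pictures the target does not depend on.

```python
from bisect import bisect_right

def decode_span(access_points, target):
    """Return (start, count): the random access point to start decoding from
    and the number of pictures decoded to reach `target`, inclusive.

    access_points: sorted picture indices where decoding may start
    (hypothetical input, e.g. positions of intra-coded pictures).
    """
    i = bisect_right(access_points, target)
    if i == 0:
        raise ValueError("no random access point at or before target")
    start = access_points[i - 1]
    return start, target - start + 1

# Assumed spacing: one random access point every 8 pictures.
access_points = list(range(0, 64, 8))
start, count = decode_span(access_points, 21)
print(start, count)
```

Inserting access points more often shortens the span that must be decoded, at the cost of more intra-coded data.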
Standardization Requirements
• Typical MVC prediction structure
[Figure: typical MVC prediction structure, with IDR and anchor pictures labeled]
Standardization Requirements
• Scalability
• The ability of a decoder to access only a portion of a bitstream while still being able to
generate effective video output, e.g. at reduced temporal or spatial resolution
• View-scalability
• Adaptation to user preference, network bandwidth, decoder complexity
• Decoder resource consumption
• A number of views are to be decoded and displayed
• A decoder that is efficient in terms of memory and complexity is very important to make
real-time decoding possible
• Parallel processing
• In 3-D TV, multiple views need to be decoded simultaneously
• Reduce computation time to achieve real-time decoding
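View scalability, listed above, can be pictured as selecting only the views a decoder actually needs. The following is a minimal sketch, assuming inter-view dependencies are available as a simple mapping. The function `views_to_decode` and the five-view dependency table are hypothetical and are not taken from the MVC syntax.

```python
def views_to_decode(dependencies, targets):
    """Return the sorted view ids needed to output the `targets` views.

    dependencies: dict mapping a view id to the view ids it uses for
    inter-view prediction (hypothetical example structure).
    """
    needed = set()
    stack = list(targets)
    while stack:
        view = stack.pop()
        if view in needed:
            continue
        needed.add(view)
        # A view can only be decoded once its reference views are decoded.
        stack.extend(dependencies.get(view, ()))
    return sorted(needed)

# Assumed 5-view setup in which view 0 is the base view.
deps = {0: [], 2: [0], 1: [0, 2], 4: [2], 3: [2, 4]}
print(views_to_decode(deps, [1]))  # [0, 1, 2]
print(views_to_decode(deps, [0]))  # [0]
```

In this toy setup a decoder that only outputs the base view never touches the other views, which is also how a 2-D decoder could use the same bitstream.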
Extending H.264/MPEG-4 AVC for Multiview
• Enabling Inter-View Prediction
• Exploit both temporal and spatial redundancy
• Builds on the flexible reference picture management capabilities that had already
been designed into H.264/MPEG-4 AVC
• Decoded pictures from other views are made available in the reference picture
lists for use by the inter-picture prediction processing
• The MVC design does not allow a picture in one view at a given time to be predicted
from a picture in another view at a different time
• Inter-view prediction may be used for encoding the non-base views
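The restriction above (inter-view references must come from the same time instant) can be sketched as a filter over the decoded pictures. This is only an illustrative model: the `Picture` tuple, the function `candidate_references`, and the dependency mapping are hypothetical, and the real reference picture list construction in the standard is more involved.

```python
from collections import namedtuple

# A decoded picture identified by its view and time instant (hypothetical model).
Picture = namedtuple("Picture", ["view", "time"])

def candidate_references(current, decoded, view_deps):
    """Split decoded pictures into temporal and inter-view candidates.

    Temporal candidates: same view, different time instant.
    Inter-view candidates: a view that current.view depends on,
    restricted to the SAME time instant as the current picture.
    """
    temporal = [p for p in decoded
                if p.view == current.view and p.time != current.time]
    inter_view = [p for p in decoded
                  if p.view in view_deps.get(current.view, ())
                  and p.time == current.time]
    return temporal, inter_view

decoded = [Picture(0, 0), Picture(0, 1), Picture(1, 0)]
temporal, inter_view = candidate_references(Picture(1, 1), decoded, {1: [0]})
print(temporal)    # [Picture(view=1, time=0)]
print(inter_view)  # [Picture(view=0, time=1)]
```

Note that `Picture(0, 0)` is in neither list: it belongs to another view and another time instant, which is exactly the case the design excludes.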
Extending H.264/MPEG-4 AVC for Multiview
• Profiles and Levels
• Profiles
• Determine the subset of coding tools that must be supported by conforming decoders
• Based on the high profile of H.264/MPEG-4 AVC
• Multiview high profile
• Supports multiple views
• Stereo high profile
• Two views
• Levels
• Constraints on the bitstreams produced by MVC encoders, to establish bounds on the necessary decoder
resources and complexity
• Limit on the amount of frame memory required for the decoding of a bitstream
• The maximum throughput in terms of macroblocks per second
• Maximum picture size
• Overall bitrate
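To make the throughput and picture size constraints concrete, the sketch below estimates the macroblock rate when several views are decoded together. It relies on the fact that H.264/MPEG-4 AVC macroblocks cover 16x16 luma samples. The function names are hypothetical, and the two limit values passed to `fits_limits` are placeholders chosen for illustration, not values from the standard's level tables.

```python
MB_SIZE = 16  # an H.264/MPEG-4 AVC macroblock covers 16x16 luma samples

def macroblocks_per_frame(width, height):
    # Round each dimension up to a whole number of macroblocks.
    mbs_wide = (width + MB_SIZE - 1) // MB_SIZE
    mbs_high = (height + MB_SIZE - 1) // MB_SIZE
    return mbs_wide * mbs_high

def required_throughput(width, height, fps, num_views):
    """Macroblocks per second when all `num_views` views are decoded."""
    return macroblocks_per_frame(width, height) * fps * num_views

def fits_limits(width, height, fps, num_views, max_frame_mbs, max_mbs_per_sec):
    """Check a stream against caller-supplied (placeholder) limits."""
    return (macroblocks_per_frame(width, height) <= max_frame_mbs
            and required_throughput(width, height, fps, num_views) <= max_mbs_per_sec)

# Stereo 1920x1080 at 30 frames per second.
print(macroblocks_per_frame(1920, 1080))       # 8160
print(required_throughput(1920, 1080, 30, 2))  # 489600
# Placeholder limits, for illustration only.
print(fits_limits(1920, 1080, 30, 2, max_frame_mbs=8192, max_mbs_per_sec=500000))
```

Doubling the number of views doubles the required throughput, which is why level limits matter more for multiview decoders than for single-view ones.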
3D Technology Types
• Types of 3D Displays
• Stereoscopic: Provides a different image to the viewer's left and right eyes (the user generally has to
wear special glasses).
• Autostereoscopic: Uses optical components in the display, rather than worn by the user, to enable
each eye to see a different image.
Stereoscopic
• Stereoscope
• A stereoscope is composed of two pictures mounted next to each other, and a set of lenses to view the
pictures through. Each picture is taken from a slightly different viewpoint that corresponds closely to the
spacing of the eyes. The left picture represents what the left eye would see, and likewise for the right
picture. When observing the pictures through a special viewer, the pair of two-dimensional pictures
merge together into a single three-dimensional photograph.
Stereo Video
• Stereo video requires two views, one for each eye.
Multi-view Video Coding
New Research Area
• Quality Assessment
• Objective Quality Assessment
[Figure: example images; PSNR = 24.9 dB for all the images, while MSSIM = 0.9168, 0.7052, and 0.6949]
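For reference, PSNR is derived directly from the mean squared error (MSE), so images with the same MSE always share the same PSNR regardless of how the errors are arranged. The sketch below uses the usual definition for 8-bit images, PSNR = 10 log10(255^2 / MSE). The helper names and the tiny 2x2 test image are made up for illustration.

```python
import math

def mse(ref, test):
    """Mean squared error between two equal-size images given as lists of rows."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

ref = [[100, 100], [100, 100]]
test = [[110, 90], [100, 100]]
print(mse(ref, test))   # 50.0
print(psnr(ref, test))
```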
New Research Area
• Quality Assessment
• Objective Quality Assessment
[Figure: example images with two kinds of distortion]
• Random “analog” noise: MSE = 27.10
• Blocky “digital” noise: MSE = 21.26
• Measures such as MSE that are suitable for analog noise no longer work for digital noise
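A toy example can show why MSE alone is a weak indicator here. The sketch below builds two distortions of a flat 8x8 image, one spread thinly over every pixel and one concentrated in a single 4x4 block, and both yield the same MSE. The image size, pixel values, and error magnitudes are arbitrary choices for illustration and are unrelated to the figures on the slide.

```python
def mse(ref, test):
    """Mean squared error between two equal-size images given as lists of rows."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

SIZE = 8
ref = [[128] * SIZE for _ in range(SIZE)]

# Distortion A: a small error of +/-2 on every pixel (noise-like, spread out).
spread = [[128 + (2 if (x + y) % 2 == 0 else -2) for x in range(SIZE)]
          for y in range(SIZE)]

# Distortion B: one 4x4 block offset by 4, every other pixel untouched (block-like).
blocky = [[128 + (4 if x < 4 and y < 4 else 0) for x in range(SIZE)]
          for y in range(SIZE)]

print(mse(ref, spread), mse(ref, blocky))  # 4.0 4.0
```

Because MSE only averages squared pixel differences, it cannot tell a diffuse error from a structured one with a visible block edge, which is the kind of difference a structural measure such as MSSIM is designed to capture.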
New Research Area
• Subjective Quality Assessment