IBC2006 Paper Proposal

PERCEPTUAL QUALITY BASED RATE CONTROL METHOD
FOR AVC-INTRA
Madan mohan. M
Tata Elxsi, India
ABSTRACT
The AVC-Intra specification is aimed at high-quality, high-definition
broadcast products. It specifies compression constraints and bit stream
syntax for the High 10 Intra profile and the High 4:2:2 Intra profile. Its
Class 100 and Class 50 modes demand a very accurate rate control
algorithm, since each intra frame has to be encoded with exactly the
number of coded bytes per frame specified in SMPTE RP 2027–2007.
Applications using these profiles need both very good perceptual quality
and accurate bit-rate control, which calls for a perceptual quality based
rate control method with accurate bit-rate control. This paper discusses a
perceptual quality based rate control method for AVC-Intra, which could
easily be extended to other video coding methods as well.
INTRODUCTION
PSNR is widely used to represent the quality of video encoders, but it does not fully
represent the visual quality perceived by a human. HEVC standardization has considered
subjective quality testing as one of the main metrics for the evaluation of proposals.
Any lossy video compression introduces noise in the encoded video, depending on the bit rate
and the compression efficiency of the standard used. Even when the noise introduced is uniform
across a frame, the noise perceived by a human varies with the characteristics of the video
content.
Subjective quality based encoder algorithms improve the visual quality perceived by an
end user. Such an encoder can conceal more distortion in regions where noise is less
perceivable by a human and reduce distortion in regions where noise is more
perceivable.
Encoder algorithms can be tuned for better visual quality based on subjective quality
analysis of the encoded content. Rate control is one of the key encoder algorithms that
can be used to improve visual quality based on human vision. A lot of research has been
done on subjective quality based video coding, and different methods have been used (5, 6).
This paper focuses mainly on the all-intra encoding used in AVC-Intra Class 100,
where the bits per frame must always be constant and visual quality is of high
importance. Though the proposed model could also be used in long-GOP and low-latency video
encoders, we limit our discussion to AVC-Intra for simplicity.
The paper discusses the following items in the sections below:
1. Simplified Linear RD Model
2. Perceptual quality based bit allocation
3. MB level Qp modulation
4. Quality benchmarking
AVC-INTRA CLASS 100
AVC-Intra Class 100 provides high image quality together with high encoding efficiency.
The AVC-Intra Class 100 specification is compliant with the AVC standard using the
High 4:2:2 Intra profile. The coded frame size, excluding the metadata in SEI messages, is
constant for a particular resolution and frame rate, as specified in SMPTE RP 2027–2007.
For example, a coded frame size (excluding metadata) of 462,848 bytes per frame is used
for 1920x1080 resolution (29.97 I-frames/s for 59.94i and 29.97p). The coded frame size
must always be kept constant; no encoded frame may deviate from it by even a single
bit.
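As a quick worked check of what the fixed frame size implies, the sketch below converts the
462,848 bytes per frame quoted above into a video bit rate. It assumes that 29.97 frames/s
means exactly 30000/1001; the function name is illustrative only.

```python
from fractions import Fraction

def constant_frame_bitrate(frame_bytes: int, frame_rate: Fraction) -> float:
    """Bit rate in bits/s when every frame has exactly frame_bytes bytes."""
    return float(frame_bytes * 8 * frame_rate)

# 1920x1080 Class 100 example from the text; 29.97 fps taken as 30000/1001 (assumption).
bitrate = constant_frame_bitrate(462_848, Fraction(30000, 1001))
print(f"{bitrate / 1e6:.2f} Mbit/s")
```

The result is roughly 111 Mbit/s of coded video, excluding SEI metadata.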
The specification also restricts the number of slices per frame, the transform
size, entropy coding, frame structure, scaling matrix, intra prediction and de-blocking filter
usage. These restrictions on the encoder side increase the dependency on the rate control
algorithm for quality, since the other tools cannot be tuned for subjective quality. Since the
coded frame size must be constant for all encoded frames, the bit-rate accuracy of the rate
control model is critical. There is also no possibility of improving subjective quality by
introducing a user-defined scaling matrix. Subjective quality improvement for AVC-Intra
therefore depends heavily on the rate control algorithm in the encoder.
The Panasonic P2 card encoder provides excellent subjective quality for various AVC-Intra
configurations. We used only AVC-Intra Class 100 at 1080i resolution, 59.94 fps, for our
quality benchmarking.
Simplified Linear RD Model
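The rate control method used in the JM reference model uses a quadratic RD model, assuming
that the source statistics are Laplacian distributed.
The R-D function and its Taylor series expansion, as derived by Viterbi and Omura in (2):
$$R(D) = \ln\left(\frac{1}{\alpha D}\right)$$
$$R(D) = \left(\frac{1}{\alpha D} - 1\right) - \frac{1}{2}\left(\frac{1}{\alpha D} - 1\right)^{2} + R_3(D) = -\frac{3}{2} + \frac{2}{\alpha} D^{-1} - \frac{1}{2\alpha^{2}} D^{-2} + R_3(D)$$
The R-D model proposed by H.J. Lee et al. in (1), based on the observation in (2):
$$R_i = a_1 Q_i^{-1} + a_2 Q_i^{-2}$$
The R-D model was further enhanced by incorporating a video complexity function (1):
$$\frac{R_i - H_i}{M_i} = a_1 Q_i^{-1} + a_2 Q_i^{-2}$$
Where,
$R_i$ – Total number of encoded bits for the current ($i$-th) frame
$H_i$ - Header bits used in the current frame
$M_i$ - MAD, computed using the residue of the luma component
$Q_i$ - Quantization level used in the current frame
$a_1$ & $a_2$ are model parameters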
Instead of the quadratic model, we propose a Simplified Linear RD Model (SLR Model):
$$\left(R_i - H_i\right) = a_1 \left(\frac{M_i}{Q_i}\right) + a_2$$
Since AVC-Intra is a high bit-rate application, header bits are negligible compared
to texture bits. The above equation can therefore be approximated as
$$R_i = a_1 \left(\frac{M_i}{Q_i}\right) + a_2$$
$a_1$ & $a_2$ are model parameters obtained from frame level statistics collected from
previously encoded frames.
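A minimal sketch of how such a linear model could be fitted and then used, assuming the
form bits = a1 * (MAD / Qstep) + a2 with header bits ignored. The function names, the
least-squares update and the sample numbers are illustrative assumptions, not the
implementation used in the experiments.

```python
def fit_linear_rd(history):
    """Least-squares fit of bits = a1 * (mad / qstep) + a2.

    history: list of (bits, mad, qstep) tuples from previously encoded frames.
    """
    xs = [mad / q for _, mad, q in history]
    ys = [bits for bits, _, _ in history]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:  # identical samples: fall back to a line through the origin
        return mean_y / mean_x, 0.0
    a1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a1, mean_y - a1 * mean_x

def qstep_for_target(target_bits, mad, a1, a2):
    """Invert the assumed linear model to get the Q-step expected to hit target_bits."""
    return a1 * mad / (target_bits - a2)

# Hypothetical statistics generated from a1 = 2.0e6, a2 = 5.0e4.
samples = [(8.0, 4.0), (9.0, 6.0), (7.5, 10.0)]  # (mad, qstep)
history = [(2.0e6 * (m / q) + 5.0e4, m, q) for m, q in samples]
a1, a2 = fit_linear_rd(history)
q = qstep_for_target(3_702_784, mad=8.5, a1=a1, a2=a2)
```

With only two parameters, the model can be refreshed after every frame at negligible cost.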
Comparison between Simplified Linear RD model and JM Model
JM reference software version 18.0 was used for our analysis, with RC_MODE_1 used for
encoding all I frames in the JM encoder.
RC_MODE_1 uses the quadratic RD model for frame level rate control. The per-frame bit target
was forced to a constant value, as in AVC-Intra, for our bit-rate control accuracy
analysis.
The same JM reference software version was modified to use the Simplified Linear RD model
for our experiment.
The JM RD model and the Simplified Linear RD model were tested on various standard CIF
resolution sequences at 5 Mbps. Similar improvements in accuracy were observed at other
bit rates as well. Improved accuracy in meeting the target bits per frame directly
improves the quality of the frame.
Figure 1 – bridgeclose
Figure 2 – bus
Figure 3 – coastguard
Figure 4 – container
Figure 5 – foreman
Figure 6 – flower
Figure 7 – highway
Figure 8 – mobile
Figure 9 – stefan
Figure 10 – paris
Video          Average percentage bits      PSNR Y (in dB)
sequence       deviation per frame
               JM         SLR               JM         SLR
bridgeclose    5.41       4.91              40.28      40.32
bus            16.74      7.01              36.81      38.08
coastguard     14.66      4.37              37.96      39.15
container      4.24       4.14              42.04      42.08
flower         10.41      5.9               32.97      34.04
foreman        6.35       4.87              42.19      42.49
hall           2.68       2.68              44.39      44.39
highway        2.05       2.08              44.39      44.39
mobile         5.65       5.08              31.52      31.57
paris          11.98      2.8               36.96      38.3
stefan         13.43      9.47              37.41      38.4
Table 1 – Proposed SLR vs JM rate control.
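Table 1 shows a lower average percentage bits deviation per frame for the proposed SLR
model than for the JM RD model in 9 of the 11 sequences; hall is unchanged and highway is
marginally higher (2.08 against 2.05). PSNR with the SLR model is equal or higher for every
sequence, with gains of more than 1 dB for bus, coastguard, flower and paris.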
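The deviation figure in Table 1 can be computed as sketched below, assuming it is the mean
absolute difference between coded and target bits per frame, expressed as a percentage of
the target. That definition is an assumption based on the column heading, and the frame
sizes are made-up sample values.

```python
def avg_percent_bits_deviation(coded_bits, target_bits):
    """Mean of |coded - target| / target over all frames, in percent."""
    total_error = sum(abs(b - target_bits) for b in coded_bits)
    return 100.0 * total_error / (len(coded_bits) * target_bits)

# Hypothetical per-frame sizes against a target of 200,000 bits per frame.
frames = [190_000, 210_000, 204_000, 196_000]
deviation = avg_percent_bits_deviation(frames, 200_000)
print(f"{deviation:.2f} %")
```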
Perceptual Quality Based Bit allocation
Each frame is classified into different perceptual groups based on human eye sensitivity.
An earlier perceptual bit allocation scheme proposed by Hongtao Yu et al. (5)
classifies the frame into texture, edge and background regions. The Lagrangian multiplier used
in RDO based mode decision is varied for the different perceptual regions: it is lower for
perceptually important regions, which allows modes with lower distortion and higher bits to
be used for coding, and higher for perceptually less important regions. A Lagrangian
multiplier based scheme allows only minimal control over perceptual quality, as it can
modify only the mode decision.
Visual distortion based perceptual bit allocation was done by Chih-Wei Tang et al. (7). In
their work, a visual distortion sensitivity based Qp adjustment is applied on top of the
initial Qp assigned by rate control. This Qp adjustment does not account for the bit-rate
deviation that the adjustment itself can cause. PSNR degradation is observed in their method
along with the improvement in subjective quality.
We propose a new distortion based perceptual bit allocation method in which the rate control
algorithm itself chooses different frame Qps for different perceptual regions, instead of
adjusting Qp after the rate control process. Since the rate control algorithm assigns the
perceptual Qp for the different regions, bit-rate control with the proposed method is better.
The method also allows better control over perceptual quality: a lower Qp is chosen in regions
or MBs with high perceptual importance, and a higher Qp in regions with lower perceptual
importance.
In our method, every frame is classified into smooth, edge and texture groups, similar to
the earlier work (5).
Modified bit allocation calculation
$$C_k = W_k \cdot SAD_k$$
$$C_{total} = \sum_{k=0}^{2} C_k$$
$$B_k = T \cdot \frac{C_k}{C_{total}}$$
Where,
$SAD_k$ – Sum of SAD for MBs present in the $k$-th group in the previous frame
$C_k$ – Modified complexity function for the $k$-th group
$W_k$ – Perceptual weight for the $k$-th group
$C_{total}$ - Total modified complexity function
$B_k$ – Perceptually modified bits allocated for the $k$-th group
$T$ – Target bits per frame
(SAD - sum of absolute difference between predicted and current MB)
$k$ takes the values 0, 1 and 2, which represent the smooth, edge and texture regions respectively.
The perceptually modified bits allocated per group are calculated as above. From our
experiments we found the constants to be
$W_0$=20,
$W_1$=16 and
$W_2$=8.
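The allocation step can be sketched as follows, assuming that each group's modified
complexity is its perceptual weight times its SAD and that the frame target is split in
proportion to modified complexity. The mapping of the weights 20, 16 and 8 to smooth, edge
and texture follows the order in which the groups are listed; the SAD values and all names
are hypothetical.

```python
WEIGHTS = {"smooth": 20, "edge": 16, "texture": 8}  # constants reported in the text

def allocate_group_bits(group_sad, target_bits, weights=WEIGHTS):
    """Split target_bits across groups in proportion to weight * SAD."""
    complexity = {g: weights[g] * sad for g, sad in group_sad.items()}
    total = sum(complexity.values())
    return {g: target_bits * c / total for g, c in complexity.items()}

# Hypothetical previous-frame SAD sums per group.
group_sad = {"smooth": 1.0e5, "edge": 3.0e5, "texture": 6.0e5}
bits = allocate_group_bits(group_sad, target_bits=3_702_784)
for group, b in bits.items():
    print(f"{group}: {b:.0f} bits")
```

In this example the smooth group holds 10 % of the SAD but receives about 17 % of the bits,
which is the intended bias towards regions where noise is more visible.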
Perceptual quality based group Q-step calculation
$$Qstep_k = \frac{a_1 \cdot SAD_k}{\left(B_k - a_2\right)}$$
Where,
$Qstep_k$ – Q-step used for MBs in the $k$-th group
$a_1$ & $a_2$ are model parameters obtained from frame level statistics of SAD, bits and
average quantization parameter.
$Qstep_k$ is assigned to the MBs falling in the $k$-th group. The $Qstep_k$ obtained gives
better perceptual quality control, and the encoded bits are also closer to the target bits
per frame. Apart from this frame level Qp calculation, MB level Qp control is discussed in
the next section.
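A sketch of the per-group Q-step calculation, assuming a linear model of the form
bits = a1 * SAD / Qstep + a2 that is inverted for each group's bit budget. The parameter
values and the group statistics are hypothetical.

```python
def group_qsteps(group_sad, group_bits, a1, a2):
    """Q-step per group from an assumed model bits = a1 * sad / qstep + a2."""
    qsteps = {}
    for group, sad in group_sad.items():
        budget = group_bits[group] - a2
        if budget <= 0:
            raise ValueError(f"bit budget for {group!r} does not exceed the model offset")
        qsteps[group] = a1 * sad / budget
    return qsteps

# Hypothetical inputs: SAD sums and perceptually weighted bit budgets per group.
group_sad = {"smooth": 1.0e5, "edge": 3.0e5, "texture": 6.0e5}
group_bits = {"smooth": 6.4e5, "edge": 1.53e6, "texture": 1.53e6}
qsteps = group_qsteps(group_sad, group_bits, a1=40.0, a2=1.0e4)
```

With these inputs the smooth group gets the finest Q-step and the texture group the
coarsest, because the bit budgets were weighted before the model was inverted.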
MB Level Qp Modulation
MB level Qp modulation further updates the frame Qp obtained from the Simplified
Linear RD Model to achieve more accurate bit-rate control. MB level Qp adaptation helps
significantly in the initial frames, where the RD model parameters have not yet been updated.
The MB level Qp is incremented or decremented based on the percentage of bits consumed and
the percentage of MBs coded.
Perceptual quality analysis shows that a fixed MB level Qp equation does not give
good perceptual quality. MB level control is therefore kept relaxed in the initial rows
and becomes more stringent as the percentage of MBs encoded increases.
Our algorithm for MB level Qp control is tuned based on visual analysis after encoding
various Hi-Vision test sequences. Since the control implementation has many
constants derived from our perceptual analysis, we limit our discussion of this module.
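Since the tuned constants are not given here, the following is only an illustrative sketch
of the idea: compare the share of bits consumed with the share of MBs coded, and tighten the
allowed mismatch as the frame progresses. The tolerance values, the step of one Qp and the
function name are placeholders, not the tuned constants of the proposed encoder.

```python
def mb_qp_delta(bits_used, target_bits, mbs_coded, total_mbs,
                loose_tol=0.10, tight_tol=0.01):
    """Return -1, 0 or +1 to apply to the Qp of the next MB.

    The tolerance shrinks linearly from loose_tol at the start of the frame
    to tight_tol at the end (placeholder values).
    """
    progress = mbs_coded / total_mbs   # fraction of MBs already coded
    spent = bits_used / target_bits    # fraction of the frame budget consumed
    tolerance = loose_tol + (tight_tol - loose_tol) * progress
    if spent > progress + tolerance:
        return +1   # over budget: coarser quantization
    if spent < progress - tolerance:
        return -1   # under budget: finer quantization
    return 0

# The same 5 % overshoot is tolerated early in the frame but corrected late.
early = mb_qp_delta(bits_used=150, target_bits=1000, mbs_coded=100, total_mbs=1000)
late = mb_qp_delta(bits_used=950, target_bits=1000, mbs_coded=900, total_mbs=1000)
```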
Quality Benchmarking
PSNR comparison
To benchmark our rate control algorithm in AVC-Intra Class 100 at 59.94 fps, we used the
Panasonic P2 card encoder solution. HD Hi-Vision sequences were used as test
sequences.
The PSNR comparison between the Panasonic P2 card and our model shows an overall
improvement in PSNR, except for 2 frames at scene changes. Various ITU HD Hi-Vision test
sequences and NHK Hi-Vision test sequences (8) were used for our subjective quality
analysis and tuning. We have picked a few complex sequences as benchmark test sequences
for this paper.
Figure 11 – PSNR comparison – 16HD
Figure 12 – PSNR comparison – 19HD
Figure 13 – PSNR comparison – 22HD
Figure 14 – PSNR comparison – 23HD
Figure 15 – PSNR comparison – 46HD
Figure 16 – PSNR comparison – 47HD
In the above 6 streams (16HD, 19HD, 22HD, 23HD, 46HD and 47HD), we observed an improvement
of up to 1 dB in PSNR while maintaining subjective quality close to that of the Panasonic
P2 card. Good perceptual quality and PSNR improvement were both achieved by giving equal
importance to bit-rate control and subjective quality. Reducing the stuffing (filler) bytes
within the constant coded bits per frame improves the overall PSNR of the frame. The
reduction in stuffing bytes is achieved by a better RD model that is well integrated with
the perceptual model.
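The role of stuffing can be illustrated with a small sketch: whatever the encoder leaves
unused below the fixed frame size has to be padded, and those bytes carry no picture
information. The helper name and the sample coded sizes are hypothetical.

```python
FRAME_BYTES = 462_848  # Class 100, 1920x1080 coded frame size quoted earlier

def filler_bytes(coded_bytes: int, frame_bytes: int = FRAME_BYTES) -> int:
    """Padding needed to reach the fixed frame size; overshoot is not allowed."""
    if coded_bytes > frame_bytes:
        raise ValueError("coded frame exceeds the fixed frame size")
    return frame_bytes - coded_bytes

for coded in (440_000, 460_000):  # hypothetical encoder outputs
    pad = filler_bytes(coded)
    share = 100 * pad / FRAME_BYTES
    print(f"{coded} coded bytes -> {pad} filler bytes ({share:.2f} % of the frame)")
```

A rate control model that lands closer to the target leaves fewer filler bytes, so more of
the fixed budget is spent on texture.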
Subjective quality comparison
Figure 17 – Proposed Model – 19HD Hi vision sequence snap shot
Figure 18 – Panasonic P2 card – 19HD Hi vision sequence snap shot
The proposed model and the Panasonic P2 card provide similar subjective quality for 19HD.
Visually sensitive regions such as the green grass and the red carpet were given a
lower Qp by the model than other regions, which improves overall subjective
quality.
Figure 19 – Proposed Model – 46HD Hi vision sequence snap shot
Figure 20 – Panasonic P2 card – 46HD Hi vision sequence snap shot
The proposed model and the Panasonic P2 card provide similar subjective quality for 46HD.
The 46HD content has complex water sprinkling in the background; the sprinkling intensity
varies across frames, and hence the complexity varies across frames.
The proposed model achieves uniform quality in the dress and face regions despite the
variation in background complexity, and also maintains good bit-rate control.
Figure 21 – Proposed Model – 165HD NHK Hi vision sequence snap shot
Figure 22 – Panasonic P2 card – 165HD NHK Hi vision sequence snap shot
The proposed model and the Panasonic P2 card provide similar subjective quality for 165HD as
well. In the 165HD content, as in 46HD, the background complexity varies.
Good, uniform perceptual quality is maintained in the face and dress regions even though the
background complexity varies over time.
CONCLUSIONS
In this paper, a Simplified Linear RD Model for better bit-rate control and a perceptual
quality based bit allocation method are proposed. The method allows easier control over
subjective quality and at the same time gives better control over bit rate. The scheme is
not limited to AVC-Intra based codecs; it could easily be adapted to long-GOP and
low-latency scenarios.
REFERENCES
1. H.J. Lee, T.H. Chiang and Y.Q. Zhang, 2000. Scalable Rate Control for MPEG-4
Video. IEEE Trans. Circuits Syst. Video Technol., 10: 878-894.
2. A. Viterbi and J. Omura, 1979. “A new rate control scheme using a new rate-distortion
model,” in Principles of Digital Communication and Coding, New York: McGraw-Hill.
3. ftp://ftp.panasonic.com/pub/panasonic/drivers/PBTS/papers/AVCIntra-FAQs.pdf
4. ftp://ftp.panasonic.com/pub/Panasonic/Drivers/PBTS/papers/WP_AVC-Intra.pdf
5. Hongtao Yu, Feng Pan, Zhiping Lin and Yin Sun, 2005. A perceptual bit allocation
scheme for H.264. IEEE International Conference on Multimedia and Expo.
6. Zhicheng Li, Shiyin Qin and Laurent Itti, 2011. Visual attention guided bit allocation in
video compression. Image and Vision Computing, 29(1): 1-14.
7. C.-W. Tang, C.-H. Chen, Y.-H. Yu and C.-J. Tsai, 2006. Visual sensitivity guided bit
allocation for video coding. IEEE Trans. Multimedia, vol. 8, Feb. 2006, pp. 11–18.
8. ITE/ARIB Hi-Vision Test Sequence 2nd Edition reference manual
ACKNOWLEDGEMENTS
I would like to thank my family for their encouragement and my team for their support.