Applied wavelet techniques in image coding standards

DRAGORAD MILOVANOVIC(1), ZORAN BOJKOVIC(2)
(1) Faculty of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, 11120 Belgrade, Serbia and Montenegro
(2) Faculty of Transport and Traffic Engineering, University of Belgrade, Vojvode Stepe 305, 11000 Belgrade, Serbia and Montenegro
Abstract: - With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address these needs, the ISO/IEC JPEG2000 and MPEG-4 VTC image coding standards have been developed. Lossless and lossy compression, progressive transmission and region-of-interest coding are vital new features based on advanced multiresolution techniques: integer wavelets, spatially segmented and line-based wavelet transforms, trellis quantization, binary arithmetic entropy coding and rate-distortion optimization. The standard-based development of wavelet technology is reviewed in the paper, along with a comparative analysis of performance and functionality.
Key-words: multiresolution techniques, image coding algorithms.
1 Introduction
With the development of advanced multimedia applications (image archiving, network image transmission, document imaging, medical imaging, ...), image compression requires higher performance as well as new features. A great effort has been made to deliver a new standard that provides features absent from previous standards, and also higher efficiency for features that exist in others.
JPEG2000 and MPEG-4 VTC are new image compression standards developed by ISO/IEC [1, 2]. Their set of new features:
• lossy and lossless compression,
• progressive recovery by pixel accuracy/resolution,
• tiling,
• region of interest coding,
• random codestream access and processing,
• error resilience,
takes advantage of advanced multiresolution technologies:
• integer wavelets and lifting schemes,
• spatially segmented and line-based wavelets,
• trellis quantization,
• binary arithmetic entropy coding and
• rate-distortion optimization.
The advanced features of JPEG2000 and MPEG-4 VTC, as well as the development of wavelet technology, are presented in the paper, along with a comparative analysis of performance and functionality.
2 Technology overview
2.1 JPEG2000
The JPEG2000 standard provides a set of new
features that are of vital importance to many
multimedia applications. It addresses areas where
current standards fail to produce the best quality or
performance and provides capabilities to markets that
currently do not use compression [3].
Lossless compression should be provided naturally in the course of progressive coding. The JPEG2000 wavelet lossy coder offers performance superior to the current standards at low bit-rates (e.g. below 0.25 bpp for highly detailed grey-scale images). This significantly improved low bit-rate performance has been achieved without sacrificing performance over the rest of the rate-distortion range.
Progressive transmission, which allows images to be reconstructed with increasing pixel accuracy or spatial resolution, is essential for many applications.
Often there are parts of an image that are more important than others. Region-of-interest coding allows user-defined Regions-Of-Interest (ROI) in the image to be randomly (and progressively) accessed and/or decompressed with less distortion than the rest of the image. Also, random codestream processing could allow operations such as rotation, translation, filtering, feature extraction and scaling.
Portions of the codestream may be more important than others in determining decoded image quality. Proper design of the codestream can aid subsequent error resilience.
The JPEG2000 standard is capable of compressing and decompressing images in a single sequential pass (real-time coding).
2.2 MPEG-4 VTC
MPEG-4 Visual Texture Coding (VTC) is the algorithm used in the MPEG-4 standard to compress the texture information in photorealistic 3D models. As the texture in a 3D model is similar to a still picture, this algorithm can also be used for compression of still images. It is based on the discrete wavelet transform (DWT), scalar quantization, zero-tree coding and arithmetic coding.
MPEG-4 VTC supports SNR scalability through the use of different quantization strategies: single quantization (SQ), multiple quantization (MQ) and bilevel quantization (BQ). SQ provides no SNR scalability, MQ provides limited SNR scalability and BQ provides generic SNR scalability.
Resolution scalability is supported by the use of band-by-band (BB) scanning instead of the traditional zero-tree scanning (tree-depth, TD), which is also supported.
MPEG-4 VTC also supports coding of arbitrarily shaped objects by means of a shape-adaptive DWT, but does not support lossless coding. Several objects can be encoded separately, possibly at different qualities, and then composited at the decoder to obtain the final decoded image.
3 Wavelet techniques development
In the JPEG2000 coder, the discrete wavelet transform is first applied to the source image data. The transform coefficients are then quantized and entropy coded before forming the output bitstream [4]. The decoder performs the reverse of the encoder operations. Depending on the wavelet transform and the applied quantization, JPEG2000 coding can be either lossy or lossless.
The original (source) image is partitioned into rectangular non-overlapping blocks called tiles (Figure 1). This is the strongest form of spatial partitioning, in that all operations are performed independently on the different tiles of the image. All tiles have exactly the same dimensions, except possibly those that abut the right and lower boundaries of the image. Tiling reduces memory requirements and constitutes one of the methods for the efficient extraction of a region of the image.
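The tile partitioning described above can be sketched in a few lines. This is a minimal illustration, not the codestream procedure of the standard: the function name tile_grid is hypothetical, and the tile grid is assumed to start at the image origin (0, 0).

```python
def tile_grid(width, height, tile_w, tile_h):
    """Return tile rectangles (x0, y0, x1, y1) in raster order.

    Assumption: the tile grid is anchored at the image origin.
    x1 and y1 are exclusive bounds.
    """
    tiles = []
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            # Tiles on the right and lower boundaries are clipped to the image.
            x1 = min(x0 + tile_w, width)
            y1 = min(y0 + tile_h, height)
            tiles.append((x0, y0, x1, y1))
    return tiles
```

For a 5x3 image with 2x2 tiles this gives six tiles, of which only the interior ones have the full 2x2 size; each tile can then be transformed and coded on its own.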
To perform the forward DWT, the standard uses a 1D subband decomposition of a 1D set of samples into low-pass samples, representing a downsampled low-resolution version of the original set, and high-pass samples, representing a downsampled residual version of the original set, needed for the perfect reconstruction of the original set from the low-pass set. The DWT can be irreversible or reversible [5]. The default irreversible transformation is implemented by means of the Daubechies 9-tap/7-tap filter. The default reversible transformation is implemented by means of the 5-tap/3-tap filter with integer coefficients [6].
The standard supports two filtering modes: convolution-based and lifting-based. For both modes the signal should first be extended periodically (by symmetric reflection at its boundaries). Convolution-based filtering consists of performing a series of dot products between the two filter masks and the extended 1D signal. Lifting-based filtering consists of a sequence of very simple filtering operations in which, alternately, odd sample values of the signal are updated with a weighted sum of even sample values, and even sample values are updated with a weighted sum of odd sample values. For the reversible (lossless) case the results are rounded to integer values. The lifting-based filtering for the 5/3 analysis filter is achieved
by the following equations
$$ y(2n+1) = x_{ext}(2n+1) - \left\lfloor \frac{x_{ext}(2n) + x_{ext}(2n+2)}{2} \right\rfloor \qquad (1) $$
$$ y(2n) = x_{ext}(2n) + \left\lfloor \frac{y(2n-1) + y(2n+1) + 2}{4} \right\rfloor \qquad (2) $$
where x_ext is the extended input signal, y is the output signal, and ⌊a⌋ and ⌈a⌉ indicate the largest integer not exceeding a and the smallest integer not exceeded by a, respectively.
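The two lifting steps of Eqs. (1) and (2) can be sketched as a one-level, one-dimensional transform with its inverse. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the signal is extended by whole-sample symmetric reflection at both ends, and single-sample signals are passed through unchanged.

```python
def _mirror(i, n):
    """Map any index onto [0, n) by whole-sample symmetric reflection (n >= 2)."""
    period = 2 * (n - 1)
    i %= period
    return period - i if i >= n else i


def forward_53(x):
    """One level of the reversible 5/3 lifting transform: returns (low, high)."""
    n = len(x)
    if n == 1:
        return list(x), []  # assumption: a single sample is passed through
    y = list(x)
    # Eq. (1): odd samples become high-pass (detail) coefficients.
    for i in range(1, n, 2):
        y[i] = x[i] - (x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) // 2
    # Eq. (2): even samples become low-pass coefficients, using the new odd ones.
    for i in range(0, n, 2):
        y[i] = x[i] + (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) // 4
    return y[0::2], y[1::2]


def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse order."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2] = low
    y[1::2] = high
    if n == 1:
        return y
    x = list(y)
    for i in range(0, n, 2):  # undo Eq. (2)
        x[i] = y[i] - (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) // 4
    for i in range(1, n, 2):  # undo Eq. (1)
        x[i] = y[i] + (x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) // 2
    return x
```

Because every step adds or subtracts an integer computed from samples that the step leaves untouched, the inverse recovers the input exactly. This is what makes the 5/3 transform suitable for lossless coding. Python's // operator rounds towards minus infinity, which matches the floor in the equations for negative values as well.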
Quantization is the process by which the
coefficients are reduced in precision. This operation
is lossy, unless the quantization step is 1 and the
coefficients are integers, as produced by the
reversible integer 5/3 wavelet. Each of the transform
coefficients a_b(u,v) of the subband b is quantized to
the value q_b(u,v) according to the formula
$$ q_b(u,v) = \operatorname{sign}(a_b(u,v)) \left\lfloor \frac{|a_b(u,v)|}{\Delta_b} \right\rfloor \qquad (3) $$
where Δ_b is the quantization step. The dynamic
range depends on the number of bits used to
represent the original image tile component and on
the choice of the wavelet transform. All quantized
transform coefficients are signed values even when
the original components are unsigned. These
coefficients are expressed in a sign-magnitude
representation prior to coding. Part 1 of the
JPEG2000 standard uses only simple scalar dead-zone
quantization. Part 2 of the standard is expected to
contain a trellis coded quantizer [7].
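The dead-zone scalar quantizer of Eq. (3) can be sketched as follows. The function names are hypothetical. The reconstruction rule, with its offset parameter r, is an assumption added for illustration: the text above specifies only the forward quantizer, and a mid-point offset of 0.5 is merely one common decoder choice.

```python
import math


def quantize(a, step):
    """Eq. (3): sign(a) * floor(|a| / step)."""
    sign = -1 if a < 0 else 1
    return sign * math.floor(abs(a) / step)


def dequantize(q, step, r=0.5):
    """Illustrative reconstruction (assumption): place the value a fraction r
    into the quantization interval; zero indices reconstruct to zero."""
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) + r) * step
```

With a step of 2.0, every coefficient in the open interval (-2, 2) maps to index 0, so the zero bin is twice as wide as the others; this is the dead zone. With a step of 1 and integer coefficients the indices equal the coefficients, which is the lossless case mentioned above.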
Each subband of the wavelet decomposition is divided into rectangular blocks, called code-blocks, which are coded independently using arithmetic coding. A binary arithmetic entropy coder called the MQ-coder is used to compress the symbols output by the context model. Both its complexity and its compression efficiency are higher than those of the Huffman coder typically used in JPEG. The code-blocks are coded one bit-plane at a time, starting with the most significant bit-plane that has a non-zero element and proceeding to the least significant bit-plane. For each bit-plane in a code-block, a special code-block scan pattern is used for each of the three coding passes. Each coefficient bit in the bit-plane is coded in only one of the three passes. A rate-distortion optimization method is used to allocate a certain number of bits to each block [8, 9, 10, 11].
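The bit-plane ordering can be illustrated with a small sketch that splits the quantized coefficients of a code-block into a sign array and magnitude bit-planes, most significant first. It shows only the ordering of the data: the three coding passes, the context model and the MQ-coder are omitted. The function names are hypothetical and the code-block is assumed to be a non-empty flat list of integers.

```python
def bit_planes(coeffs):
    """Return (signs, planes); planes[0] is the most significant bit-plane
    that contains at least one non-zero bit."""
    signs = [1 if c < 0 else 0 for c in coeffs]
    mags = [abs(c) for c in coeffs]
    top = max(mags).bit_length()  # number of bit-planes worth coding
    planes = [[(m >> p) & 1 for m in mags] for p in range(top - 1, -1, -1)]
    return signs, planes


def magnitudes_from(planes, k):
    """Rebuild magnitudes from the first k bit-planes only (embedded decoding)."""
    total = len(planes)
    mags = [0] * len(planes[0]) if planes else []
    for idx, plane in enumerate(planes[:k]):
        shift = total - 1 - idx
        mags = [m | (b << shift) for m, b in zip(mags, plane)]
    return mags
```

Decoding only the first planes yields a coarser approximation of every magnitude, which is why truncating the coded data of a code-block still gives a usable reconstruction and why truncation points can be chosen per block by the rate-distortion optimization.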
The scaling-based ROI coding scales up the coefficients so that the bits associated with the ROI are placed in higher bit-planes. During the embedded coding process, those bits are placed in the bitstream before the non-ROI parts of the image. Thus, the ROI will be decoded, or refined, before the rest of the image. Regardless of the scaling, a full decoding of the bitstream results in a reconstruction of the whole image with the highest fidelity available. If the bitstream is truncated, or the encoding process is terminated before the whole image is fully encoded, the ROI will have a higher fidelity than the rest of the image. The ROI approach defined in JPEG2000 Part 1 allows ROI encoding of arbitrarily shaped regions without the need for shape information and shape decoding [12, 13].
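Why no shape information is needed can be seen from a small sketch of the scaling idea, in the spirit of the Maxshift method of Part 1. It is a simplified illustration operating on non-negative integer coefficient magnitudes; the function names are hypothetical and the real codestream signalling is not modelled.

```python
def maxshift_encode(mags, roi_mask):
    """Shift ROI magnitudes up so every non-zero ROI value exceeds all
    background values. Returns (scaled magnitudes, shift s)."""
    bg_max = max((m for m, r in zip(mags, roi_mask) if not r), default=0)
    s = bg_max.bit_length()  # smallest s with 2**s > bg_max
    scaled = [m << s if r else m for m, r in zip(mags, roi_mask)]
    return scaled, s


def maxshift_decode(scaled, s):
    """The decoder needs only s: any value of at least 2**s must belong
    to the ROI and is shifted back down."""
    return [m >> s if m >= (1 << s) else m for m in scaled]
```

After scaling, all ROI bits occupy bit-planes above every background bit, so a bit-plane coder emits them first, and the decoder separates the two classes by magnitude alone.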
There are four basic dimensions of progression/scalability in the JPEG2000 bitstream: resolution, quality, spatial location and component. Different types of progression are achieved by the ordering of packets within the bitstream (Figures 2 and 3) [14]. The JPEG2000 bitstream contains markers
which identify the progression type of the bitstream.
Other markers may be written which store the length
of every packet in the bitstream. To change a
bitstream from progressive by resolution to
progressive by SNR, a parser can read all the
markers, change the type of progression in the
markers, write the lengths of the packets out in the
new order, and write the packets themselves out in
the new order. There is no need to run the MQ-coder,
the context model, or even decode the block
inclusion information. The complexity is only
slightly higher than a pure copy operation.
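The parsing operation described above amounts to permuting packets while leaving their coded payloads untouched. The sketch below models this with a hypothetical Packet record carrying only a resolution index, a quality-layer index and opaque payload bytes; real packets are also indexed by component and spatial location, and the marker rewriting is not shown.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    resolution: int   # resolution level the packet contributes to
    layer: int        # quality (SNR) layer the packet contributes to
    payload: bytes    # entropy-coded data, never decoded here


def reorder(packets, progression):
    """Return the packets in the requested progression order."""
    if progression == "resolution":
        key = lambda p: (p.resolution, p.layer)
    elif progression == "quality":
        key = lambda p: (p.layer, p.resolution)
    else:
        raise ValueError("unsupported progression: " + progression)
    # Only the order changes; payload bytes are copied as they are.
    return sorted(packets, key=key)
```

Since the payloads are treated as opaque byte strings, the cost is that of a sort plus a copy, in line with the remark that neither the MQ-coder nor the context model has to be run.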
4 Comparative performances and functionalities
Compression efficiency is one of the top priorities in the design of image products. Lossless and lossy progressive compression efficiency has been evaluated on 7 images from the JPEG2000 test set, covering various types of imagery. JPEG2000 provides, in most cases, competitive compression ratios with the added benefit of scalability. The rate-distortion behavior of the lossy (nonreversible) JPEG2000 and the progressive JPEG is depicted in Figure 4 for a natural image. It is seen that JPEG2000 significantly outperforms the JPEG scheme [14].
In order to evaluate the error resilience features offered by the different standards, a transmission channel with random errors has been simulated in [15], together with the evaluation of the average reconstructed image quality after decompression. Table 3 shows the results for JPEG2000 with the nonreversible filter and for baseline JPEG. As can be seen, the reconstructed image quality under transmission errors is higher for JPEG2000 than for JPEG, across all encoding bitrates and error rates.
Table 1 summarizes the comparison of still image coding algorithms from a functionality point of view. A functionality matrix indicates the set of supported features in each standard.
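Table 3 reports image quality as PSNR in dB. For reference, a minimal sketch of the usual PSNR definition is given below; the function name is hypothetical, the images are assumed to be equal-length flat sequences of samples, and a peak value of 255 (8-bit data) is an assumption.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

An average PSNR over a noisy channel would then be obtained by repeating the transmission with independent error patterns and averaging the resulting values.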
5 Concluding remarks
JPEG2000 and MPEG-4 VTC are new standards offering a rich set of features based on multiresolution techniques (Table 1), in an efficient manner and within an integrated algorithmic approach.
The most important technology highlights of JPEG2000 are: wavelet/subband coding; reversible integer-to-integer and nonreversible real-to-real wavelet transforms; bit-plane coding; arithmetic coding with the MQ-coder from JBIG2 (ISO/IEC 14492); a design based heavily on the EBCOT (embedded block coding with optimized truncation) coding scheme; a codestream syntax similar to JPEG; and a file format syntax.
Overall, one can say that JPEG2000 is a successful standard that offers the richest set of features and provides superior rate-distortion performance. However, this comes at the price of additional complexity when compared to JPEG, which might currently be perceived as a disadvantage for some applications, as was the case for JPEG when it was first introduced.
References:
[1] K.R.Rao, Z.Bojkovic, D.Milovanovic, Multimedia communication systems: techniques, standards and networks (Prentice-Hall, 2002).
[2] K.R.Rao, Z.Bojkovic, Packet video communications over ATM networks (Prentice-Hall, 2000).
[3] A.N.Skodras, C.A.Christopoulos, T.Ebrahimi, JPEG2000: The upcoming still image compression standard, Proc. RECPA, pp.359-366, 2000.
[4] ISO/IEC JTC1/SC29/WG11 N750, Performance evaluation of different reversible decorrelating transforms in JPEG2000 baseline system, 1998.
[5] M.D.Adams, F.Kossentini, Reversible integer-to-integer wavelet transforms for image compression: Performance evaluation and analysis, IEEE Trans. IP, vol.9, no.6, pp.1010-1024, 2000.
[6] ISO/IEC JTC1/SC29/WG11 N868, Performance evaluation of spatially segmented wavelet transform in the JPEG2000 baseline system, 1998.
[7] P.Sriram, M.W.Marcellin, Image coding using wavelet transforms and entropy-constrained trellis quantization, IEEE Trans. IP, vol.4, pp.725-733, 1995.
[8] J.M.Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. SP, vol.41, no.12, pp.3445-3462, 1993.
[9] A.Said, W.Pearlman, A new, fast and efficient image codec based on set partitioning in hierarchical trees, IEEE Trans. CSVT, vol.6, no.3, pp.243-250, 1996.
[10] D.Taubman, A.Zakhor, Multirate 3D subband coding of video, IEEE Trans. IP, vol.3, no.5, pp.572-588, 1994.
[11] D.Taubman, High performance scalable image compression with EBCOT, IEEE Trans. IP, vol.9, no.7, 2000.
[12] D.S.Cruz, T.Ebrahimi, M.Larsson, J.Askelöf, C.Christopoulos, Region of Interest Coding in JPEG2000 for interactive client/server applications, Proc. IEEE Workshop on MSP, Denmark, pp.389-394, 1999.
[13] ISO/IEC JTC1/SC29/WG11 N892, Region of interest coding, 1998.
[14] ISO/IEC JTC1/SC29/WG11 N1716, Report on CoreExperiments V1 (Evaluation of the distortion-adaptive progressive CSF weighting technique), 2000.
[15] ISO/IEC JTC1/SC29/WG11 N1606, Error resilience Ad-hoc sub-group report, 2000.
Figure 1. Tiling, DC level shifting and DWT on each image tile component. [Diagram: image component, tiling, DC level shifting, DWT on each tile.]
Figure 2. Twelve code-blocks of one packet partition location at resolution level 2 of a 3-level dyadic wavelet transform. The packet partition location is presented by heavy lines. [Diagram: numbered code-blocks.]
Figure 3. The composition of one packet partition location with 12 code-blocks. [Diagram: packet header, followed by n0 sub-bitplanes from code-block 0, n1 sub-bitplanes from code-block 1, ..., n11 sub-bitplanes from code-block 11.]
Table 1. Functionality matrix. A + indicates that a functionality is supported; the more + signs, the more efficiently or better it is supported. A - indicates that it is not supported.
Functionality | JPEG2000 | JPEG-LS | JPEG | MPEG-4 VTC | PNG
Lossless compression performance | +++ | ++++ | + (1) | - | +++
Lossy compression performance | +++++ | + | +++ | ++++ | -
Progressive bitstreams | +++++ | - | ++ (2) | +++ | +
Region Of Interest (ROI) coding | +++ | - | - | + (3) | -
Arbitrary shaped objects | - | - | - | ++ | -
Random access | ++ | - | - | - | -
Low complexity | ++ | +++++ | +++++ | + | +++
Error resilience | +++ | ++ | ++ | +++ | +
Non-iterative rate control | +++ | - | - | + | -
Genericity (4) | +++ | +++ | ++ | ++ | +++
(1) Only using the lossless mode of JPEG.
(2) Only in the progressive mode of JPEG.
(3) Tile-based only.
(4) Ability to efficiently compress different types of imagery across a wide range of bitrates.
Table 2. Wavelet technologies in JPEG2000 Part 1 and Part 2.
Technology | Part 1 | Part 2
Bitstream | Fixed and variable length markers. | New markers can be skipped by a Part 1 decoder.
File format | Optional. Provide intellectual property (e.g. copyright) information, color or tone-space for image, general method of including metadata. | Allow metadata to be interleaved with coded data. Define types of metadata.
Arithmetic coder | MQ-coder. | Same?
Coefficient modeling | Independent coding of fixed size blocks within subbands. Division of coefficients into 3 sub-bitplanes. Grouping of sub-bitplanes into layers. | Special models for binary or graphic data?
Quantization | Scalar quantizer with dead-zone, truncation of code-blocks. | Trellis coded quantization.
Transformation | Low complexity (5,3) and high performance Daubechies (9,7). Mallat decomposition. | Many more filters, perhaps user-defined filters. Packet and other decompositions.
Component decorrelation | Reversible component transform (RCT), YCrCb transform. | Arbitrary point transform or reversible wavelet transform across components.
Error resilience | Resynchronization markers. | Fixed length entropy coder, repeated headers.
Bit-stream ordering | Progressive by tile-part, then SNR, or resolution, or component. | Out of order tile-parts.
Figure 4. Rate distortion results for the progressive JPEG2000 vs. the progressive JPEG for a natural image. [Plot: PSNR (dB) versus bitrate (bpp); curves: JPEG2000 NR and P-JPEG.]
Table 3. Average PSNR [dB] of the decoded Café image transmitted over noisy channel with various bit error rates (BER) and compression bitrates, for JPEG baseline and JPEG2000 (J2K).
bpp | Coder | BER 0 | BER 1E-06 | BER 1E-05
0.25 | J2K | 23.06 | 23.00 | 21.62
0.25 | JPEG | 21.94 | 21.79 | 20.77
0.5 | J2K | 26.71 | 26.42 | 23.96
0.5 | JPEG | 25.40 | 25.12 | 22.95
1.0 | J2K | 31.90 | 30.75 | 27.08
1.0 | JPEG | 30.34 | 29.24 | 23.65
2.0 | J2K | 38.91 | 36.38 | 27.23
2.0 | JPEG | 37.22 | 30.68 | 20.78