Applied Wavelet Techniques in Image Coding Standards

DRAGORAD MILOVANOVIC(1), ZORAN BOJKOVIC(2)
(1) Faculty of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, 11120 Belgrade, Serbia and Montenegro
(2) Faculty of Transport and Traffic Engineering, University of Belgrade, Vojvode Stepe 305, 11000 Belgrade, Serbia and Montenegro

Abstract: - With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address these needs, the ISO/IEC JPEG2000 and MPEG-4 VTC image coding standards have been developed. Lossless and lossy compression, progressive transmission and region-of-interest coding are vital new features based on advanced multiresolution techniques: integer wavelets, spatially segmented and line-based wavelet transforms, trellis quantization, binary arithmetic entropy coding and rate-distortion optimization. The standard-based development of wavelet technology is reviewed in the paper, along with a comparative analysis of performance and functionality.

Key-words: multiresolution techniques, image coding algorithms.

1 Introduction
With the development of advanced multimedia applications (image archiving, network image transmission, document imaging, medical imaging,…), image compression requires higher performance as well as new features. A great effort has been made to deliver new standards that provide features absent from previous standards, and also higher efficiency for features that already exist in others. JPEG2000 and MPEG-4 VTC are new image compression standards developed by ISO/IEC [1, 2].
The set of new features (lossy and lossless compression, progressive recovery by pixel accuracy or resolution, tiling, region-of-interest coding, random codestream access and processing, and error resilience) takes advantage of advanced multiresolution technologies: integer wavelets and lifting schemes, spatially segmented and line-based wavelets, trellis quantization, binary arithmetic entropy coding and rate-distortion optimization. The advanced features of JPEG2000 and MPEG-4 VTC, as well as the development of wavelet technology, are presented in the paper, along with a comparative analysis of performance and functionality.

2 Technology overview

2.1 JPEG2000
The JPEG2000 standard provides a set of new features that are of vital importance to many multimedia applications. It addresses areas where current standards fail to produce the best quality or performance and provides capabilities to markets that currently do not use compression [3]. It is desirable to provide lossless compression naturally in the course of progressive coding. The JPEG2000 wavelet lossy coder offers performance superior to the current standards at low bit-rates (e.g. below 0.25 bpp for highly detailed grey-scale images). This significantly improved low bit-rate performance has been achieved without sacrificing performance over the rest of the rate-distortion range.

Progressive transmission, which allows images to be reconstructed with increasing pixel accuracy or spatial resolution, is essential for many applications. Often there are parts of an image that are more important than others. Region-of-interest coding allows user-defined Regions-Of-Interest (ROI) in the image to be randomly (and progressively) accessed and/or decompressed with less distortion than the rest of the image. Also, random codestream processing could allow operations such as rotation, translation, filtering, feature extraction and scaling. Portions of the codestream may be more important than others in determining decoded image quality.
Proper design of the codestream can aid subsequent error resilience. The JPEG2000 standard is capable of compressing and decompressing images in a single sequential pass (real-time coding).

2.2 MPEG-4 VTC
MPEG-4 Visual Texture Coding (VTC) is the algorithm used in the MPEG-4 standard to compress the texture information in photo-realistic 3D models. As the texture in a 3D model is similar to a still picture, this algorithm can also be used for compression of still images. It is based on the discrete wavelet transform (DWT), scalar quantization, zero-tree coding and arithmetic coding.

MPEG-4 VTC supports SNR scalability through the use of different quantization strategies: single quantization (SQ), multiple quantization (MQ) and bilevel quantization (BQ). SQ provides no SNR scalability, MQ provides limited SNR scalability and BQ provides generic SNR scalability. Resolution scalability is supported by the use of band-by-band (BB) scanning instead of traditional zero-tree scanning (tree-depth, TD), which is also supported. MPEG-4 VTC also supports coding of arbitrarily shaped objects by means of a shape-adaptive DWT, but does not support lossless coding. Several objects can be encoded separately, possibly at different qualities, and then composited at the decoder to obtain the final decoded image.

3 Wavelet techniques development
In the JPEG2000 coder, the discrete wavelet transform is first applied to the source image data. The transform coefficients are then quantized and entropy coded before forming the output bitstream [4]. The decoder is the reverse of the encoder. Depending on the wavelet transform and the applied quantization, JPEG2000 can be either lossy or lossless.

The original (source) image is partitioned into rectangular non-overlapping blocks called tiles (Figure 1). This is the strongest form of spatial partitioning, in that all operations are performed independently on the different tiles of the image.
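The tile partition can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name tile_grid and its parameters are hypothetical, and the tile grid is assumed to start at the image origin, ignoring the grid offsets that the standard also allows.

```python
def tile_grid(width, height, tile_w, tile_h):
    """Return (x0, y0, x1, y1) bounds of the non-overlapping tiles
    covering a width x height image, in raster order.

    Hypothetical helper: assumes the tile grid starts at (0, 0).
    """
    tiles = []
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            # Tiles on the right and lower boundary are clipped to the image.
            x1 = min(x0 + tile_w, width)
            y1 = min(y0 + tile_h, height)
            tiles.append((x0, y0, x1, y1))
    return tiles


# A 10 x 7 image with nominal 4 x 4 tiles gives a 3 x 2 grid of tiles.
tiles = tile_grid(width=10, height=7, tile_w=4, tile_h=4)
for index, bounds in enumerate(tiles):
    print(index, bounds)
```

Because each tile is described only by its bounds, every later stage (transform, quantization, entropy coding) can be run on one tile without touching the others, which is what keeps memory requirements low.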
All tiles have exactly the same dimensions, except possibly those that abut the right and lower boundaries of the image. Tiling reduces memory requirements and constitutes one of the methods for the efficient extraction of a region of the image.

To perform the forward DWT, the standard uses a 1D subband decomposition of a 1D set of samples into low-pass samples and high-pass samples, the latter representing a downsampled residual version of the original set, needed for the perfect reconstruction of the original set from the low-pass set. The DWT can be irreversible or reversible [5]. The default irreversible transformation is implemented by means of the Daubechies 9-tap/7-tap filter. The default reversible transformation is implemented by means of the 5-tap/3-tap filter with integer coefficients [6].

The standard supports two filtering modes: convolution-based and lifting-based. For both modes the signal should first be extended periodically. Convolution-based filtering consists of performing a series of dot products between the two filter masks and the extended 1D signal. Lifting-based filtering consists of a sequence of very simple filtering operations in which, alternately, odd sample values of the signal are updated with a weighted sum of even sample values, and even sample values are updated with a weighted sum of odd sample values. For the reversible (lossless) case the results are rounded to integer values. The lifting-based filtering for the 5/3 analysis filter is achieved by the following equations

$$ y(2n+1) = x_{ext}(2n+1) - \left\lfloor \frac{x_{ext}(2n) + x_{ext}(2n+2)}{2} \right\rfloor \qquad (1) $$

$$ y(2n) = x_{ext}(2n) + \left\lfloor \frac{y(2n-1) + y(2n+1) + 2}{4} \right\rfloor \qquad (2) $$

where $x_{ext}$ is the extended input signal, $y$ is the output signal, while $\lfloor a \rfloor$ and $\lceil a \rceil$ indicate the largest integer not exceeding $a$ and the smallest integer not exceeded by $a$, respectively.

Quantization is the process by which the coefficients are reduced in precision.
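As a minimal sketch of the reversible 5/3 lifting steps in equations (1) and (2), the following Python code applies the forward transform to a 1D integer signal and then inverts it. The function names forward_53, inverse_53 and _reflect are illustrative only, and the boundary handling assumes whole-sample symmetric reflection as the form of signal extension; neither is taken from the paper.

```python
def _reflect(i, n):
    """Map any index i into [0, n) by whole-sample symmetric reflection
    (assumed extension rule; parity of the index is preserved)."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def forward_53(x):
    """Forward reversible 5/3 lifting; returns (low_pass, high_pass)."""
    n = len(x)
    if n < 2:
        return list(x), []
    y = list(x)
    # Equation (1): predict odd samples from their even neighbours.
    for i in range(1, n, 2):
        left, right = x[_reflect(i - 1, n)], x[_reflect(i + 1, n)]
        y[i] = x[i] - (left + right) // 2
    # Equation (2): update even samples from the new odd (high-pass) values.
    for i in range(0, n, 2):
        left, right = y[_reflect(i - 1, n)], y[_reflect(i + 1, n)]
        y[i] = x[i] + (left + right + 2) // 4
    return y[0::2], y[1::2]


def inverse_53(low, high):
    """Undo forward_53 by running the lifting steps in reverse order."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2], y[1::2] = low, high
    if n < 2:
        return y
    x = list(y)
    for i in range(0, n, 2):  # undo equation (2)
        left, right = y[_reflect(i - 1, n)], y[_reflect(i + 1, n)]
        x[i] = y[i] - (left + right + 2) // 4
    for i in range(1, n, 2):  # undo equation (1)
        left, right = x[_reflect(i - 1, n)], x[_reflect(i + 1, n)]
        x[i] = y[i] + (left + right) // 2
    return x


signal = [10, 12, 15, 11, 9, 8, 20, 25, 3]
low, high = forward_53(signal)
restored = inverse_53(low, high)
print(low, high, restored == signal)
```

Python's `//` operator is floor division, so it matches the floor brackets in the equations for negative values as well. Since every step only adds or subtracts an integer computed from samples that the inverse also has available, the round trip is exact, which is what makes this filter usable for lossless coding.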
Quantization is lossy unless the quantization step is 1 and the coefficients are integers, as produced by the reversible integer 5/3 wavelet. Each of the transform coefficients $a_b(u,v)$ of the subband $b$ is quantized to the value $q_b(u,v)$ according to the formula

$$ q_b(u,v) = \mathrm{sign}\bigl(a_b(u,v)\bigr) \left\lfloor \frac{|a_b(u,v)|}{\Delta_b} \right\rfloor \qquad (3) $$

where $\Delta_b$ is the quantization step. The dynamic range depends on the number of bits used to represent the original image tile component and on the choice of the wavelet transform. All quantized transform coefficients are signed values, even when the original components are unsigned. These coefficients are expressed in a sign-magnitude representation prior to coding. Part 1 of the JPEG2000 standard uses only simple scalar dead-zone quantization. Part 2 of the standard will probably contain a trellis coded quantizer [7].

Each subband of the wavelet decomposition is divided into rectangular blocks, called code-blocks, which are coded independently using arithmetic coding. A binary arithmetic entropy coder called the MQ-coder is used to compress the symbols output by the context model. Both its complexity and its compression efficiency are much higher than those of the Huffman coder typically used in JPEG. The code-blocks are coded one bit-plane at a time, starting with the most significant bit-plane containing a non-zero element and ending with the least significant bit-plane. For each bit-plane in a code-block, a special code-block scan pattern is used for each of three passes. Each coefficient bit in a bit-plane is coded in only one of the three passes. A rate-distortion optimization method is used to allocate a certain number of bits to each block [8, 9, 10, 11].

The scaling-based ROI coding scales up the coefficients so that the bits associated with the ROI are placed in higher bit-planes. During the embedded coding process, those bits are placed in the bitstream before the non-ROI parts of the image. Thus, the ROI will be decoded, or refined, before the rest of the image.
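The dead-zone quantizer of equation (3) and the idea of lifting ROI coefficients into higher bit-planes can be illustrated together. The sketch below is hedged: the function names are hypothetical, the coefficient values and ROI mask are made up for the example, and the scaling value is chosen by the Maxshift rule (the smallest shift s with 2^s larger than every background magnitude), which is an assumption about how the scaling is selected rather than something stated in the text above.

```python
def deadzone_quantize(a, step):
    """Equation (3): q = sign(a) * floor(|a| / step)."""
    sign = -1 if a < 0 else 1
    return sign * int(abs(a) // step)


def roi_scale_up(q, roi_mask):
    """Shift ROI coefficients up by s bit-planes (assumed Maxshift rule)."""
    background_max = max(
        (abs(v) for v, in_roi in zip(q, roi_mask) if not in_roi), default=0
    )
    s = background_max.bit_length()  # smallest s with 2**s > background_max
    scaled = [v * (1 << s) if in_roi else v for v, in_roi in zip(q, roi_mask)]
    return scaled, s


def roi_scale_down(scaled, s):
    """Decoder side: any magnitude >= 2**s must belong to the ROI."""
    restored = []
    for v in scaled:
        if abs(v) >= (1 << s):
            sign = -1 if v < 0 else 1
            v = sign * (abs(v) >> s)
        restored.append(v)
    return restored


# Made-up subband coefficients and ROI membership, for illustration only.
coeffs = [37.4, -12.9, 3.2, -80.5, 6.0, -0.7]
roi_mask = [False, True, True, False, False, True]

q = [deadzone_quantize(a, step=2.0) for a in coeffs]
scaled, s = roi_scale_up(q, roi_mask)
restored = roi_scale_down(scaled, s)
print(q, scaled, s, restored)
```

After scaling, every non-zero ROI coefficient has a larger magnitude than any background coefficient, so a bit-plane coder that proceeds from the most significant plane downward reaches the ROI bits first, and the decoder can tell the two groups apart from the magnitudes alone.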
Regardless of the scaling, a full decoding of the bitstream results in a reconstruction of the whole image with the highest fidelity available. If the bitstream is truncated, or the encoding process is terminated before the whole image is fully encoded, the ROI will have a higher fidelity than the rest of the image. The ROI approach defined in JPEG2000 Part 1 allows ROI encoding of arbitrarily shaped regions without the need for shape information and shape decoding [12, 13].

There are four basic dimensions of progression/scalability in a JPEG2000 bitstream: resolution, quality, spatial location and component. Different types of progression are achieved by the ordering of packets within the bitstream (Figures 2 and 3) [14]. The JPEG2000 bitstream contains markers which identify the progression type of the bitstream. Other markers may be written which store the length of every packet in the bitstream. To change a bitstream from progressive by resolution to progressive by SNR, a parser can read all the markers, change the type of progression in the markers, write the lengths of the packets out in the new order, and write the packets themselves out in the new order. There is no need to run the MQ-coder or the context model, or even to decode the block inclusion information. The complexity is only slightly higher than that of a pure copy operation.

4 Comparative performances and functionalities
Compression efficiency is one of the top priorities in the design of image products. Lossless and lossy progressive compression efficiency have been evaluated with 7 images from the JPEG2000 test set, covering various types of imagery. JPEG2000 provides, in most cases, competitive compression ratios with the added benefit of scalability. The rate-distortion behavior of the lossy (nonreversible) JPEG2000 and the progressive JPEG is depicted in Figure 4 for a natural image. It can be seen that JPEG2000 significantly outperforms the JPEG scheme [14].
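The distortion axis of Figure 4 and the entries of Table 3 are given as PSNR in dB. As a reminder of how such a figure of merit is usually computed, the short Python sketch below evaluates PSNR from the mean squared error between an original and a decoded sample sequence. The function name psnr is illustrative, and the peak value of 255 assumes 8-bit samples, which the paper does not state explicitly.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    peak=255 assumes 8-bit samples (an assumption, not from the paper).
    """
    if len(original) != len(decoded):
        raise ValueError("sequences must have the same length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals, e.g. lossless coding
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: every sample is off by one grey level, so the MSE is 1.
reference = [100, 100, 120, 120]
reconstructed = [101, 99, 121, 119]
print(round(psnr(reference, reconstructed), 2))
```

Lossless reconstruction gives an infinite PSNR under this definition, which is why rate-distortion curves such as Figure 4 are drawn only for the lossy operating range.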
In order to evaluate the error resilience features offered by the different standards, a transmission channel with random errors has been simulated in [15], together with an evaluation of the average reconstructed image quality after decompression. Table 3 shows the results for JPEG2000 with the nonreversible filter and for JPEG baseline. As can be seen, the reconstructed image quality under transmission errors is higher for JPEG2000 than for JPEG, across all encoding bitrates and error rates.

Table 1 summarizes the comparison of still image coding algorithms from a functionality point of view. A functionality matrix indicates the set of supported features in each standard.

5 Concluding remarks
JPEG2000 and MPEG-4 VTC are the new standards offering a rich set of features based on multiresolution techniques (Table 1), in an efficient manner and within an integrated algorithmic approach. The most important technology highlights of JPEG2000 are: wavelet/subband coding, reversible integer-to-integer and nonreversible real-to-real wavelet transforms, bit-plane coding, arithmetic coding with the MQ-coder from JBIG2 (ISO/IEC 14492), a design based heavily on the EBCOT (embedded block coding with optimized truncation) coding scheme, a code stream syntax similar to JPEG, and a file format syntax.

Overall, one can say that JPEG2000 is a successful standard that offers the richest set of features and provides superior rate-distortion performance. However, this comes at the price of additional complexity when compared to JPEG, which might currently be perceived as a disadvantage for some applications, as was the case for JPEG when it was first introduced.

References:
[1] K.R.Rao, Z.Bojkovic, D.Milovanovic, Multimedia communication systems: techniques, standards and networks (Prentice-Hall, 2002).
[2] K.R.Rao, Z.Bojkovic, Packet video communications over ATM networks (Prentice-Hall, 2000).
[3] A.N.Skodras, C.A.Christopoulos, T.Ebrahimi, JPEG2000: The upcoming still image compression standard, Proc.
RECPA, pp.359-366, 2000.
[4] ISO/IEC JTC1/SC29/WG11 N750, Performance evaluation of different reversible decorrelating transforms in JPEG2000 baseline system, 1998.
[5] M.D.Adams, F.Kossentini, Reversible integer-to-integer wavelet transforms for image compression: Performance evaluation and analysis, IEEE Trans. IP, vol.9, no.6, pp.1010-1024, 2000.
[6] ISO/IEC JTC1/SC29/WG11 N868, Performance evaluation of spatially segmented wavelet transform in the JPEG2000 baseline system, 1998.
[7] P.Sriram, M.W.Marcellin, Image coding using wavelet transforms and entropy-constrained trellis quantization, IEEE Trans. IP, vol.4, pp.725-733, 1995.
[8] J.M.Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. SP, vol.41, no.12, pp.3445-3462, 1993.
[9] A.Said, W.Pearlman, A new, fast and efficient image codec based on set partitioning in hierarchical trees, IEEE Trans. CSVT, vol.6, no.3, pp.214-223, 1993.
[10] D.Taubman, A.Zakhor, Multirate 3D subband coding of video, IEEE Trans. IP, vol.3, no.5, pp.572-588, 1994.
[11] D.Taubman, High performance scalable image compression with EBCOT, IEEE Trans. IP, vol.9, no.7, 2000.
[12] D.S.Cruz, T.Ebrahimi, M.Larsson, J.Askelöf, C.Cristopoulos, Region of Interest Coding in JPEG2000 for interactive client/server applications, Proc. IEEE Workshop on MSP, Denmark, pp.389-394, 1999.
[13] ISO/IEC JTC1/SC29/WG11 N892, Region of interest coding, 1998.
[14] ISO/IEC JTC1/SC29/WG11 N1716, Report on CoreExperiments V1 (Evaluation of the distortion-adaptive progressive CSF weighting technique), 2000.
[15] ISO/IEC JTC1/SC29/WG11 N1606, Error resilience Ad-hoc sub-group report, 2000.

[Figure 1 diagram: image component, tiling into numbered tiles, DC level shifting, DWT on each tile]
Figure 1. Tiling, DC level shifting and DWT on each image tile component.

Figure 2. Twelve code-blocks of one packet partition location at resolution level 2 of a 3-level dyadic wavelet transform. The packet partition location is presented by heavy lines.

[Figure 3 diagram: packet header, followed by n0 sub-bitplanes from code-block 0, n1 sub-bitplanes from code-block 1, ..., n11 sub-bitplanes from code-block 11]
Figure 3. The composition of one packet partition location with 12 code-blocks.

[Figure 4 plot: PSNR (dB) versus bitrate (bpp); curves: JPEG2000 NR, P-JPEG]
Figure 4. Rate distortion results for the progressive JPEG2000 vs. the progressive JPEG for a natural image.

Table 1. Functionality matrix. A + indicates that it is supported; the more +, the more efficiently or better it is supported. A - indicates that it is not supported.

Feature                            JPEG2000  JPEG-LS  JPEG    MPEG-4 VTC  PNG
Lossless compression performance   +++       ++++     + (1)   -           +++
Lossy compression performance      +++++     +        +++     ++++        -
Progressive bitstreams             +++++     -        ++ (2)  +++         +
Region Of Interest (ROI) coding    +++       -        -       +           -
Arbitrary shaped objects           -         -        -       ++          -
Random access                      ++        -        -       -           -
Low complexity                     ++        +++++    +++++   +           +++
Error resilience                   +++       ++       ++      +++         +
Non-iterative rate control         +++       -        -       +           -
Genericity (4)                     +++       +++      ++      ++          +++

(1) Only using the lossless mode of JPEG.
(2) Only in the progressive mode of JPEG.
(3) Tile-based only.
(4) Ability to efficiently compress different types of imagery across a wide range of bitrates.

Table 2. Wavelet technologies in JPEG2000 Part 1 and Part 2.

Bitstream
  Part 1: Fixed and variable length markers.
  Part 2: New markers can be skipped by a Part 1 decoder.
File format
  Part 1: Optional. Provide intellectual property (e.g. copyright) information, color or tone-space for image, general method of including metadata.
  Part 2: Allow metadata to be interleaved with coded data. Define types of metadata.
Arithmetic coder
  Part 1: MQ-coder.
  Part 2: Same?
Coefficient modeling
  Part 1: Independent coding of fixed size blocks within subbands. Division of coefficients into 3 sub-bitplanes. Grouping of sub-bitplanes into layers.
  Part 2: Special models for binary or graphic data?
Quantization
  Part 1: Scalar quantizer with dead-zone, truncation of code-blocks.
  Part 2: Trellis coded quantization.
Transformation
  Part 1: Low complexity (5,3) and high performance Daubechies (9,7). Mallat decomposition.
  Part 2: Many more filters, perhaps user-defined filters. Packet and other decompositions.
Component decorrelation
  Part 1: Reversible component transform (RCT), YCrCb transform.
  Part 2: Arbitrary point transform or reversible wavelet transform across components.
Error resilience
  Part 1: Resynchronization markers.
  Part 2: Fixed length entropy coder, repeated headers.
Bit-stream ordering
  Part 1: Progressive by tile-part, then SNR, or resolution, or component.
  Part 2: Out of order tile-parts.

Table 3. Average PSNR [dB] of the decoded Café image transmitted over noisy channel with various bit error rates (BER) and compression bitrates, for JPEG baseline and JPEG2000 (J2K).

Bitrate (bpp)  Codec  BER 0   BER 1E-06  BER 1E-05
0.25           J2K    23.06   23.00      21.62
0.25           JPEG   21.94   21.79      20.77
0.5            J2K    26.71   26.42      23.96
0.5            JPEG   25.40   25.12      22.95
1.0            J2K    31.90   30.75      27.08
1.0            JPEG   30.34   29.24      23.65
2.0            J2K    38.91   36.38      27.23
2.0            JPEG   37.22   30.68      20.78