Course Presentation
Multimedia Systems: Image III (Image Compression, JPEG)
Mahdi Amiri, April 2011
Sharif University of Technology

Image Compression: Basics
- Digital images contain a large amount of data.
- File size for a 14-megapixel color image:
  - 42 MB in uncompressed RGB format at 24 bits/pixel (~24 images on a 1 GB memory card)
  - ~1.5 MB in JPEG format at 90% quality (~667 images on a 1 GB memory card)
- Compression is crucial.
- A number of different techniques are available: RLE, LZ, ADPCM, DCT.
- The choice depends on:
  - Type of image (B/W, grayscale, color, content)
  - Application (entertainment, medical, real-time)

Page 1 Multimedia Systems, Spring 2011, Mahdi Amiri, Image III

Image Compression: JPEG
- Most commonly used still-image compression method (image files, cameras, and the WWW)
- Lossy compression (the standard also includes a lossless coding mode)
- Adjustable degree of compression: a tradeoff between storage size and image quality
- Typical compression ratio: 10:1, with little perceptible loss in image quality
- Supports a maximum image size of 65535×65535 pixels
[Figure: the same image at four settings. Original: 178 KB; Q = 50: 37 KB; Q = 5: 16 KB; Q = 1: 13 KB]

Image Compression: JPEG
- Acronym for the “Joint Photographic Experts Group”
- A subgroup of ISO/IEC (http://www.jpeg.org/)
- The group was organized in 1986.
- First public release: JPEG Part 1 standard, 1992

Image Compression: JPEG
- Pro: Works well on photographs and paintings of realistic scenes with smooth variations of tone and color.
- Con: Not suited to line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts.
- Con: Lossy compression, as typically used, is not suitable for certain applications such as medical imaging.
[Figures: House test image; Grass test image]

JPEG Encoder Steps
- Color space transformation (RGB to YCbCr): The representation of the colors in the image is converted from RGB to Y′CbCr, consisting of one luma component (Y′), representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.
- Chroma subsampling: The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.
- Block splitting and DCT: The image is split into blocks of 8×8 pixels. For each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform (DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
- Quantization: The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with lower accuracy than those of the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) determines how far the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.
- Entropy coding: The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.
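The first two encoder steps can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the function names `rgb_to_ycbcr` and `subsample_2x2` are hypothetical, the conversion uses the full-range constants commonly associated with JFIF files, and the subsampling assumes a chroma plane with even width and height that is halved in both directions by averaging.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style constants, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    cb = 128.0 + (b - y) / 1.772            # scaled and offset blue difference (B - Y)
    cr = 128.0 + (r - y) / 1.402            # scaled and offset red difference (R - Y)
    return y, cb, cr


def subsample_2x2(plane):
    """Halve a chroma plane horizontally and vertically by averaging 2x2 neighbourhoods.

    Assumes `plane` is a list of equal-length rows with even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


if __name__ == "__main__":
    # A mid-gray pixel has no color content, so both chroma values sit at the 128 offset.
    print(rgb_to_ycbcr(128, 128, 128))
    # A 2x4 chroma plane shrinks to 1x2.
    print(subsample_2x2([[0, 4, 8, 8], [0, 4, 8, 8]]))
```

The luma plane is left at full resolution; only the Cb and Cr planes would be passed through `subsample_2x2`, which is where the saving from the eye's lower color acuity comes from.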
JPEG Codec Diagram, Scheme 1
[Figure: encoder and decoder block diagrams]

JPEG Encoder Diagram, Scheme 2
[Figure: JPEG encoder diagram for a single block of 8 by 8 pixels]

JPEG Encoder Diagram, Scheme 3
[Figure: baseline JPEG encoder block diagram]

JPEG: Color Space Transformation
- RGB to YCbCr conversion concept: The human eye is less sensitive to fine color (chrominance) details than to fine brightness (luminance) details.
- Cb = B − Y, Cr = R − Y (conceptually; the actual definitions scale and offset these differences)
[Figure labels: Analog TV; Digital TV]

JPEG: Chroma Subsampling
[Figure: subsampling in YCbCr]

JPEG: Block Splitting and DCT
- Block splitting: The image is split into blocks of 8×8 pixels. Later we discuss why this is done.
- Discrete cosine transform (DCT): Each 8×8 block of each component (Y, Cb, Cr) is converted to a frequency-domain representation, using a normalized, two-dimensional type-II discrete cosine transform (DCT).

JPEG, DCT: Center Around Zero
- Before the DCT is applied, 128 is subtracted from each 8-bit sample so that the values are centered around zero.
[Figure: the 8×8 sub-image shown in 8-bit grayscale]

JPEG, DCT: Fourier Coefficients
[Figure: square wave synthesized using Fourier cosine coefficients and sine coefficients]

DCT Basis Functions
- The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns.
- The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients.
- The horizontal index is u and the vertical index is v.
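The normalized two-dimensional type-II DCT named above can be written directly from its definition. The sketch below is a deliberately slow reference version for a single 8×8 block, assuming 8-bit input samples that are first shifted by 128; the names `level_shift` and `dct2_8x8` are hypothetical, and real encoders use fast factorized algorithms instead of this four-level loop.

```python
import math

N = 8  # JPEG block size


def level_shift(block):
    """Center 8-bit samples around zero by subtracting 128 from each value."""
    return [[sample - 128 for sample in row] for row in block]


def dct2_8x8(block):
    """Normalized 2D type-II DCT of an 8x8 block.

    Returns coeffs[v][u], where u is the horizontal and v the vertical frequency index.
    """
    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            coeffs[v][u] = alpha(u) * alpha(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[138] * N for _ in range(N)]       # a uniform gray block
    coeffs = dct2_8x8(level_shift(flat))
    print(round(coeffs[0][0], 6))              # all energy lands in the DC coefficient
```

A uniform block is the simplest check: every shifted sample is 10, so the DC coefficient is 8 × 10 = 80 and all 63 AC coefficients vanish up to rounding error, which matches the idea that the top-left basis pattern is the constant one.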
[Figure: the 8×8 sub-image]

JPEG, DCT: DCT Coefficients
- DC coefficient: the top-left corner, which has a large magnitude
- AC coefficients: the other 63 coefficients
- The DCT aggregates most of the signal in one corner: the larger values are in the top-left corner.
[Figure: DCT coefficients for our sample block, rounded to two decimal places]

JPEG: DCT Coefficients, Example
[Figure: the result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10.]

JPEG, DCT: Histograms of DCT Coefficients
[Figure: histograms of the DCT coefficients of the image ‘lena’ using blocks of 8×8 pixels]

JPEG, Quantization: Concept
- The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation.
- Small quantization step for low-frequency components (top-left corner of the DCT coefficient matrix)
- Large quantization step for high-frequency components (bottom-right corner of the DCT coefficient matrix)
[Figure: quantization step versus DCT coefficient, with sample images]

JPEG, Quantization: Quantization Matrix
[Figure: a typical quantization matrix, as specified in the original JPEG standard]
- G is the matrix of unquantized DCT coefficients.
- Q is the quantization matrix.
- B is the matrix of quantized DCT coefficients: $B_{j,k} = \operatorname{round}(G_{j,k} / Q_{j,k})$ for $j, k = 0, \dots, 7$.

JPEG, Quantization: Sample Output
[Figure: quantized DCT coefficients for our sample block]
- Many of the higher-frequency components are rounded to zero.

JPEG: Entropy Coding
- Zigzag ordering of the coefficients
- DC coefficient: DPCM
- AC coefficients: run-length encoding (RLE)
- Huffman coding is then applied to the whole sequence of numbers.

JPEG Encoder Example
[Figure: worked encoder example]

JPEG Decoder Example
[Figure: worked decoder example]

JPEG Compression Ratio
[Figure: original image; JPEG-compressed image at a quality setting of 50; difference image (darker means a larger difference)]

JPEG Blocking Artifact
[Figure: original image; JPEG-compressed image at a quality setting of 5]

JPEG, Block Splitting: Blocks of 8 by 8 Pixels
- Padding: If the data for a channel does not represent an integer number of blocks, the encoder must fill the remaining area of the incomplete blocks with some form of dummy data.
- Why blocking?
  - Neighboring pixels are more correlated.
  - Lower computational complexity: The computational complexity of the 2D DCT of an N×N image is $O(N^2 \log_2 N)$, while the complexity of the 2D DCT of all $(N/8) \times (N/8)$ blocks of the image is $\frac{N^2}{8^2}\, O(8^2 \log_2 8) = O(N^2)$.
- What about blocks of 16×16 pixels?

JPEG, Block Splitting: Larger Blocks
- Pro: Fewer blocking artifacts
- Con: Less correlated data inside the block
- Con: Higher computational complexity
[Figure: efficiency as a function of block size N×N, measured for 8-bit quantization in the original domain and equivalent quantization in the transform domain]
- A block size of 8×8 is a good compromise between coding efficiency and complexity.

JPEG, Quantization Matrix: Quality Factor
- The quality setting of the encoder (for example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) determines how far the resolution of each frequency component is reduced.
- For a quality of 100%, the quantization tables should be set up so that all entries are one.
- For a quality factor of 50%, the tables recommended by ITU/ISO are normally used, but any other choice is also valid.
- For a quality between 50% and 100%, one may interpolate between the table entries given for 50% and those for 100% (i.e. 1.0).

Thank You
Next session: Video I
Find out more at:
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/
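The interpolation described on the Quality Factor slide can be sketched as follows. This is one possible realization, modeled on the scaling rule used by the Independent JPEG Group's library: the table at quality 50 is the recommended luminance table, and at quality 100 every entry clamps to 1. The names `BASE_LUMA`, `quality_to_scale`, `scaled_table`, and `quantize` are hypothetical, and the behavior below quality 50 goes beyond what the slide states.

```python
# Luminance quantization table recommended in the JPEG standard (used as the quality-50 table).
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quality_to_scale(quality):
    """Map a 1..100 quality setting to a percentage scale factor (IJG-style rule, assumed)."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality      # coarser than the base table
    return 200 - 2 * quality        # linear from 100% at quality 50 to 0% at quality 100


def scaled_table(base, quality):
    """Scale every table entry and clamp it to the range 1..255."""
    scale = quality_to_scale(quality)
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row] for row in base]


def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round to the nearest integer.

    Python's round() sends exact .5 ties to the even neighbour.
    """
    return [[round(g / q) for g, q in zip(g_row, q_row)] for g_row, q_row in zip(coeffs, table)]


if __name__ == "__main__":
    print(scaled_table(BASE_LUMA, 75)[0])   # first row of a finer table than the base one
    print(scaled_table(BASE_LUMA, 100)[0])  # all ones: almost no quantization loss
```

With this rule, quality 50 reproduces the base table unchanged and quality 100 yields a table of ones, so the two endpoints on the slide are met exactly; the values in between are a straight-line blend followed by integer rounding.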