Lec09, Image III (Compression, JPEG), v1.04.pdf

Course Presentation
Multimedia Systems
Image III
(Image Compression, JPEG)
Mahdi Amiri
April 2011
Sharif University of Technology
Image Compression
Basics
Large amount of data in digital images
File size for a 14 Megapixel color image
42 MB in uncompressed RGB 24bit/pixel format
~ 24 images in a 1GB memory card
~1.5 MB in JPEG (90% quality) format
~ 667 images in a 1GB memory card
Compression crucial
A number of different techniques are available
RLE, LZ, ADPCM, DCT
Choice depends on
Type of image (B/W, Grayscale, Color, Content)
Application (Entertainment, Medical, Real-time)
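The memory-card figures on this slide follow from simple arithmetic. The sketch below reproduces them, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 1000 MB); the constant names are illustrative only.

```python
# Assumption: decimal units, i.e. 1 MB = 10**6 bytes and 1 GB = 1000 MB.
MEGAPIXELS = 14        # image size in millions of pixels
BYTES_PER_PIXEL = 3    # 24-bit RGB = 3 bytes per pixel
CARD_MB = 1000         # capacity of a 1 GB memory card
JPEG_MB = 1.5          # approximate JPEG size quoted on the slide

raw_mb = MEGAPIXELS * BYTES_PER_PIXEL    # uncompressed size in MB
images_raw = round(CARD_MB / raw_mb)     # uncompressed images per card
images_jpeg = round(CARD_MB / JPEG_MB)   # JPEG images per card

print(raw_mb, images_raw, images_jpeg)
```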
Multimedia Systems, Spring 2011, Mahdi Amiri, Image III
Image Compression
JPEG
Most commonly used still image compression
method
Image files, cameras, and WWW
Original
178 KB
Q: 50
37 KB
Lossy Compression
(the standard also includes a lossless coding mode)
Adjustable degree of compression
Q: 5
16 KB
Tradeoff between storage size and image quality
Typ. Compression ratio: 10:1
(with little perceptible loss in image quality)
Supports a max. image size of 65535x65535
Q: 1
13 KB
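As a rough illustration, the compression ratios implied by the sample file sizes on this slide can be computed directly. The ratios are taken relative to the file labelled "Original"; the function and variable names are hypothetical.

```python
def compression_ratio(original_kb, compressed_kb):
    """Ratio of original size to compressed size (larger = more compression)."""
    return original_kb / compressed_kb

ORIGINAL_KB = 178                    # size of the image labelled "Original"
sizes_kb = {50: 37, 5: 16, 1: 13}    # quality setting -> file size in KB

ratios = {q: compression_ratio(ORIGINAL_KB, kb) for q, kb in sizes_kb.items()}
for q, r in ratios.items():
    print(f"Q={q}: about {r:.1f}:1")
```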
Image Compression
JPEG
Acronym for the
“Joint Photographic Experts Group”
A subgroup of ISO/IEC
http://www.jpeg.org/
The group was organized in 1986
First public release date
JPEG part 1 standard, 1992
Image Compression
JPEG
Pro:
Works well on photographs and paintings of
realistic scenes with smooth variations of tone
and color.
Con:
Lossy compression, as typically used, is not
suitable for certain applications such as medical
imaging.
Not suited to line drawings and other textual
or iconic graphics, where the sharp contrasts
between adjacent pixels can cause noticeable
artifacts.
House Test Image
Grass Test Image
Image Compression
JPEG Encoder Steps
Color space transformation: RGB to YCbCr
The representation of the colors in the image is converted from RGB to Y′CbCr, consisting of one luma component (Y′),
representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.
Chroma subsampling
The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color
details than to fine brightness details.
Block splitting and DCT
The image is split into blocks of 8×8 pixels. For each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform
(DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
Quantization
The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or
brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for
example 50 or 95 on a scale of 0–100 in the Independent JPEG Group's library) affects to what extent the resolution of each
frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded
altogether.
Entropy Coding
The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.
JPEG
Codec Diagram, Scheme 1
Encoder
Decoder
JPEG
Encoder Diagram, Scheme 2
JPEG encoder diagram for a single block of 8 by 8 pixels
JPEG
Encoder Diagram, Scheme 3
Baseline JPEG
Encoder
block diagram
JPEG
Color Space Transformation
RGB to YCbCr conversion concept:
The human eye is less sensitive to fine color (chrominance)
details than to fine brightness (luminance) details.
Analog TV
Digital TV
Cb = B - Y
Cr = R - Y
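A minimal sketch of the conversion, written in the "color difference" form shown above. The luma weights (0.299, 0.587, 0.114), the scale factors and the +128 offset are the values used by the JFIF file format for 8-bit samples; they are not given on the slide, so treat them as an assumption of this example.

```python
# Assumption: JFIF / ITU-R BT.601 constants for 8-bit full-range samples.
KR, KB = 0.299, 0.114    # luma weights for red and blue
KG = 1.0 - KR - KB       # luma weight for green (0.587)

def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0..255) to (Y, Cb, Cr) as floats."""
    y = KR * r + KG * g + KB * b
    # Chroma = scaled color difference, shifted so that "no color" is 128.
    cb = 128 + 0.5 * (b - y) / (1 - KB)
    cr = 128 + 0.5 * (r - y) / (1 - KR)
    return y, cb, cr

# A neutral gray pixel has no color difference, so both chroma values are 128.
print(rgb_to_ycbcr(100, 100, 100))
```

A real encoder would also round and clamp the results to the 0..255 range.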
JPEG, Chroma Subsampling
Subsampling in YCbCr
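One possible way to subsample a chroma plane by a factor of 2 in both directions (often called 4:2:0) is to average each 2×2 group of samples. The averaging filter is an assumption of this sketch; the slide does not say which filter is used, and the function name is hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 groups.

    `plane` is a list of rows; width and height are assumed to be even.
    """
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

cb = [[1, 3, 10, 10],
      [5, 7, 10, 10]]
print(subsample_420(cb))   # one output row with two samples
```

The luma plane (Y) is left at full resolution; only Cb and Cr are reduced.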
JPEG
Block Splitting and DCT
Block splitting
The image is split into blocks of 8×8 pixels.
Later we discuss why this is done.
Discrete Cosine Transform (DCT)
Each 8×8 block of each component (Y, Cb, Cr) is
converted to a frequency-domain representation, using
a normalized, two-dimensional type-II discrete cosine
transform (DCT).
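The normalized 8×8 type-II DCT can be sketched directly from its definition. This is a deliberately slow, unoptimized illustration (practical encoders use fast factorizations); the function names are hypothetical.

```python
import math

N = 8  # block size

def alpha(k):
    """Normalization factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(block):
    """2D type-II DCT of an 8x8 block; block[y][x] -> coeffs[v][u]."""
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency index
        for u in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * alpha(u) * alpha(v) * s
    return coeffs

# A perfectly flat block has only a DC coefficient (8 times the pixel value).
flat = [[10] * N for _ in range(N)]
print(round(dct2(flat)[0][0], 6))
```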
JPEG, DCT
Center Around Zero
The 8×8 sub-image shown
in 8-bit grayscale
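Centering around zero means shifting the sample range before the DCT. For 8-bit samples the range 0..255 is shifted to -128..127 by subtracting 128, which is the level shift used in baseline JPEG. A minimal sketch, with a hypothetical function name:

```python
def level_shift(block, bits=8):
    """Subtract half the sample range so values are centered on zero."""
    offset = 1 << (bits - 1)   # 128 for 8-bit samples
    return [[p - offset for p in row] for row in block]

print(level_shift([[0, 128, 255]]))
```

The shift reduces the magnitude of the DC coefficient and leaves the AC coefficients unchanged.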
JPEG, DCT
Fourier Coefficients
Square wave synthesized using Fourier cosine and sine coefficients
DCT
Basis Functions
The DCT transforms an 8×8 block of
input values to a linear combination
of these 64 patterns. The patterns are
referred to as the two-dimensional
DCT basis functions, and the output
values are referred to as transform
coefficients. The horizontal index is u
and the vertical index is v.
The 8×8
sub-image
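The 64 basis patterns can be generated from the same cosine formula as the DCT. The sketch below builds one pattern for a given (u, v) and checks numerically that the patterns are orthonormal, which is why a block can be written as a linear combination of them. Function names are hypothetical.

```python
import math

N = 8

def alpha(k):
    """Normalization factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(u, v):
    """8x8 DCT basis pattern for horizontal index u and vertical index v."""
    return [[0.25 * alpha(u) * alpha(v)
             * math.cos((2 * x + 1) * u * math.pi / 16)
             * math.cos((2 * y + 1) * v * math.pi / 16)
             for x in range(N)]
            for y in range(N)]

def inner(a, b):
    """Inner product of two 8x8 patterns."""
    return sum(a[y][x] * b[y][x] for y in range(N) for x in range(N))

# Each pattern has unit energy; different patterns are orthogonal.
print(round(inner(basis(1, 2), basis(1, 2)), 6),
      round(abs(inner(basis(1, 2), basis(2, 1))), 6))
```

With this normalization, each transform coefficient is the inner product of the block with the corresponding basis pattern.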
JPEG, DCT
DCT Coefficients
DC coefficient (top-left corner, has large magnitude)
AC coefficients (the other 63 coefficients)
DCT aggregates most of the signal in one corner
Larger values in the top-left corner
DCT coefficients for our sample block (rounded to two decimal places)
JPEG
DCT Coefficients, Example
The result of taking the DCT. The numbers in red are the
coefficients that fall below the specified threshold of 10.
JPEG, DCT
Histograms of DCT Coefficients
Histograms of DCT
Coefficients of image
‘lena’ using blocks of
8×8 pixels
JPEG, Quantization
Concept
The human eye is good at seeing small
differences in brightness over a relatively large
area, but not so good at distinguishing the exact
strength of a high frequency brightness variation.
Small quantization step for low-frequency
components (top-left corner of the DCT
coefficient matrix)
Large quantization step for high-frequency
components (bottom-right corner of the DCT
coefficient matrix)
DCT coefficient
Sample Images
JPEG, Quantization
Quantization Matrix
A typical quantization matrix, as specified in the original
JPEG Standard
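$B_{j,k} = \mathrm{round}\left( G_{j,k} / Q_{j,k} \right)$ for $j, k = 0, \dots, 7$, where
G is the matrix of unquantized DCT coefficients
Q is the quantization matrix
B is the matrix of quantized DCT coefficients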
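Quantization divides each DCT coefficient in G by the matching entry of Q and rounds to the nearest integer to obtain B. The sketch below uses the example luminance table from Annex K of the JPEG standard; the coefficient values in the usage example are made up for illustration and are not the sample block from the slides.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def round_half_away(x):
    """Round to nearest integer, ties away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)

def quantize(G, Q=Q_LUMA):
    """B[j][k] = round(G[j][k] / Q[j][k])."""
    return [[round_half_away(G[j][k] / Q[j][k]) for k in range(8)]
            for j in range(8)]

def dequantize(B, Q=Q_LUMA):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[B[j][k] * Q[j][k] for k in range(8)] for j in range(8)]

# Illustrative coefficients (hypothetical values).
G = [[0.0] * 8 for _ in range(8)]
G[0][0], G[0][1], G[7][7] = -415.4, -30.2, 40.0
B = quantize(G)
print(B[0][0], B[0][1], B[7][7])
```

Note how a high-frequency coefficient of 40 is rounded to zero while a low-frequency coefficient of similar size would survive.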
JPEG, Quantization
Sample Output
Quantized DCT coefficients for our sample block
Many of the higher-frequency components are rounded
to zero
JPEG, Quantization
JPEG, Entropy Coding
Zigzag Ordering
DC Coefficient: DPCM
AC Coefficients
Run-length encoding (RLE)
Huffman coding is then applied
to the whole sequence of numbers
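The three steps before Huffman coding can be sketched as follows. This is a simplified illustration: real JPEG limits zero runs to 15 (using a special ZRL symbol) and groups values into size categories before Huffman coding, both of which are omitted here. All function names are hypothetical.

```python
def zigzag_order(n=8):
    """(row, col) pairs of an n x n block in zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; the direction alternates with each diagonal.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    """Flatten a square block into a list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

def dpcm(dc_values):
    """Code each block's DC value as the difference from the previous block."""
    prev, out = 0, []
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

def rle_ac(ac):
    """Encode AC values as (zeros_before, value) pairs, ending with 'EOB'."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:                      # trailing zeros collapse to end-of-block
        pairs.append("EOB")
    return pairs

print(zigzag_order()[:6])
print(dpcm([50, 52, 49]))
print(rle_ac([5, 0, 0, -3] + [0] * 59))
```

Zigzag ordering places the low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the sequence where a single end-of-block marker replaces them.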
JPEG
Encoder Example
JPEG
Decoder Example
JPEG
Compression Ratio
Original
JPEG Compressed
Quality setting of 50
Difference
(Darker means a larger
difference)
JPEG
Blocking Artifact
Original
JPEG Compressed
Quality setting of 5
JPEG, Block Splitting
Blocks of 8 by 8 Pixels
Why Blocking?
Neighboring pixels are more correlated
Lower computational complexity
The computational complexity for the 2D DCT of an
N by N image is $O(N^2 \log_2 N)$,
while the complexity of the 2D DCT of all N/8 by
N/8 blocks of the image is
$\frac{N^2}{8^2} \, O(8^2 \log_2 8) = O(N^2)$
Padding
If the data for a channel does not represent
an integer number of blocks then the encoder
must fill the remaining area of the
incomplete blocks with some form of
dummy data.
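The standard leaves the choice of dummy data for padding to the encoder. One common choice, assumed in this sketch, is to repeat the last column and last row so that no artificial edge is introduced; the function name is hypothetical.

```python
def pad_to_blocks(plane, n=8):
    """Pad a plane (list of rows) so width and height are multiples of n.

    Assumption: padding repeats the edge pixels; other fill strategies
    are equally valid.
    """
    h, w = len(plane), len(plane[0])
    new_w = -(-w // n) * n    # round up to the next multiple of n
    new_h = -(-h // n) * n
    rows = [list(row) + [row[-1]] * (new_w - w) for row in plane]
    rows += [list(rows[-1]) for _ in range(new_h - h)]
    return rows

plane = [[1, 2, 3, 4, 5],
         [6, 7, 8, 9, 10],
         [11, 12, 13, 14, 15]]
padded = pad_to_blocks(plane)
print(len(padded), len(padded[0]))
```

The true image dimensions are stored in the file header, so the decoder can crop the padding away after decoding.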
What about blocks of 16×16 pixels?
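To get a feel for the answer, the order-of-magnitude costs can be compared for different block sizes. The sketch below drops all constant factors and only evaluates the expressions inside the O(·) terms, so the numbers are relative, not actual operation counts. Function names are hypothetical.

```python
import math

def full_dct_cost(n):
    """Relative cost of one 2D DCT over the whole n x n image."""
    return n * n * math.log2(n)

def blocked_dct_cost(n, b):
    """Relative cost of transforming all (n/b)^2 blocks of size b x b."""
    blocks = (n // b) ** 2
    return blocks * (b * b * math.log2(b))

N = 512
for b in (8, 16):
    print(b, blocked_dct_cost(N, b) / full_dct_cost(N))
```

Under this model the blocked cost is N² log₂ b, so moving from 8×8 to 16×16 blocks raises the cost by the factor log₂ 16 / log₂ 8 = 4/3.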

JPEG, Block Splitting
Larger Blocks
Pro: Fewer blocking artifacts
Con:
Less correlated data inside the block
Higher computational complexity
Efficiency as a function of block size
N×N, measured for 8 bit quantization
in the original domain and equivalent
quantization in the transform domain.
Block size 8×8 is a good
compromise between coding
efficiency and complexity
JPEG, Quantization Matrix
Quality Factor
The quality setting of the encoder (for example 50 or 95
on a scale of 0–100 in the Independent JPEG Group's
library) affects to what extent the resolution of each
frequency component is reduced.
For a quality of 100%, the quantization tables should be
set up such that all entries are one. For a quality factor of
50%, the tables recommended by ITU/ISO are commonly used,
but any other choice is also valid. For a quality between
50% and 100%, one may interpolate between the tables
given for 50% and those for 100% (i.e. all entries 1.0).
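One concrete realization of this idea is the scaling rule used by the Independent JPEG Group's library: the quality setting is mapped to a percentage scale factor that is applied to the base table, with results clamped to the range 1..255. The sketch below follows that rule; the base table is the example luminance table from Annex K of the JPEG standard, and the function name is hypothetical.

```python
# Example luminance table from Annex K of the JPEG standard (quality 50).
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scale_table(quality, base=BASE_LUMA):
    """Scale a base quantization table for a quality setting of 1..100."""
    quality = max(1, min(100, quality))
    # IJG convention: percentage scale factor, 100 at quality 50.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[min(255, max(1, (q * scale + 50) // 100)) for q in row]
            for row in base]

print(scale_table(100)[0][:4], scale_table(50)[0][:4], scale_table(25)[0][:4])
```

At quality 50 the base table is returned unchanged, and at quality 100 every entry is clamped to one, which matches the two end points described above.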
Multimedia Systems
Image III (Compression, JPEG)
Thank You
Next Session: Video I
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/