Lecture 05, "Entropy Coding"

Course Presentation
Multimedia Systems
Entropy Coding
Mahdi Amiri
February 2011
Sharif University of Technology
Data Compression
Motivation
Data storage and transmission cost money
Use the fewest bits possible to represent the information source
Pros:
Less memory, less transmission time
Cons:
Extra processing required
Distortion (if using lossy compression)
Data has to be decompressed before it can be used, which may cause delay
Data Compression
Lossless and Lossy
Lossless
Exact reconstruction is possible
Applied to general data
Lower compression rates
Examples: Run-length, Huffman, Lempel-Ziv
Lossy
Higher compression rates
Applied to audio, image and video
Examples: CELP, JPEG, MPEG-2
Lossless Compression
Run-length encoding
BBBBHHDDXXXXKKKKWWZZZZ → 4B2H2D4X4K2W4Z
Image of a rectangle, each row encoded as (value, run-length) pairs:
0,40
0,40
0,10 1,20 0,10
0,10 1,1 0,18 1,1 0,10
0,10 1,1 0,18 1,1 0,10
0,10 1,1 0,18 1,1 0,10
0,10 1,20 0,10
0,40
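A minimal run-length encoder/decoder sketch in Python (the function names are illustrative, not from the slides); it reproduces the character-string example above:

```python
def rle_encode(data):
    """Encode a sequence as (value, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

if __name__ == "__main__":
    text = "BBBBHHDDXXXXKKKKWWZZZZ"
    runs = rle_encode(text)
    print(runs)                               # [('B', 4), ('H', 2), ('D', 2), ('X', 4), ...]
    print("".join(rle_decode(runs)) == text)  # True
```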
Lossless Compression
Fixed Length Coding (FLC)
A simple example
The message to code:
►♣♣♠☻►♣☼►☻
Message length: 10 symbols
5 different symbols → at least 3 bits
Codeword table
Total bits required to code: 10*3 = 30 bits
Lossless Compression
Variable Length Coding (VLC)
Intuition: symbols that occur more frequently should get shorter codewords; since codeword lengths differ, there must be a way of distinguishing where each codeword ends
The message to code:
►♣♣♠☻►♣☼►☻
Codeword table
To identify the end of a codeword as soon as it arrives, no codeword can
be a prefix of another codeword
How to find the optimal codeword table?
Total bits required to code: 3×2 + 3×2 + 2×2 + 3 + 3 = 22 bits (see the sketch below)
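The slide's codeword table is not reproduced in this text, so the sketch below assumes one possible prefix-free table with the same code lengths (2, 2, 2, 3, 3 bits) and ASCII stand-ins for the symbols; it shows how the prefix property lets the decoder recognize each codeword as soon as its last bit arrives:

```python
# Hypothetical codeword table (same lengths as the slide's example; the actual table may differ).
# R=►, C=♣, F=☻, S=♠, O=☼ (ASCII stand-ins for the slide's symbols)
CODE = {"R": "00", "C": "01", "F": "10", "S": "110", "O": "111"}

def encode(message):
    return "".join(CODE[s] for s in message)

def decode(bits):
    inverse = {code: sym for sym, code in CODE.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:        # prefix property: a full codeword is recognized immediately
            out.append(inverse[current])
            current = ""
    return "".join(out)

if __name__ == "__main__":
    msg = "RCCSFRCORF"                # stands for ►♣♣♠☻►♣☼►☻
    bits = encode(msg)
    print(len(bits))                  # 22 bits (vs. 30 bits with 3-bit FLC)
    print(decode(bits) == msg)        # True
```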
Lossless Compression
VLC, Example Application
Morse code
Not a prefix code
Needs a separator symbol (a pause) for unique decodability
Lossless Compression
Huffman Coding Algorithm
Step 1: Take the two least probable symbols in the alphabet
(longest codewords, equal length, differing in last digit)
Step 2: Combine these two symbols into a single symbol, and repeat.
P(n): Probability of
symbol number n
Here there are 9 symbols,
e.g. the symbols can be the
letters ‘a’, ‘b’, ‘c’, ‘d’, ‘e’,
‘f’, ‘g’, ‘h’, ‘i’
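A compact Python sketch of the two-step algorithm above, using the standard heapq module; the 9-symbol probabilities are placeholders, not the values from the slide's figure:

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a prefix-free code from {symbol: probability} by repeatedly
    merging the two least probable entries (Steps 1-2 above)."""
    tiebreak = count()  # keeps heap comparisons away from the dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # least probable group
        p2, _, codes2 = heapq.heappop(heap)   # second least probable group
        # One distinguishing bit is prepended; for the two symbols merged first,
        # it ends up as the last digit of their (longest, equal-length) codewords.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

if __name__ == "__main__":
    # Placeholder probabilities for 9 symbols 'a'..'i' (sum to 1; not the slide's values).
    probs = {"a": 0.25, "b": 0.20, "c": 0.15, "d": 0.12, "e": 0.10,
             "f": 0.08, "g": 0.05, "h": 0.03, "i": 0.02}
    for sym, code in sorted(huffman_code(probs).items()):
        print(sym, code)
```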
Lossless Compression
Huffman Coding Algorithm
Paper: "A Method for the Construction of
Minimum-Redundancy Codes“, 1952
Results in "prefix-free codes“
Most efficient
No other mapping will produce a smaller average output size
If the actual symbol frequencies agree with those used to create the code
Cons:
David A. Huffman
1925-1999
Have to run through the entire data in advance to find frequencies
„Minimum-Redundancy‟ is not favorable for error correction techniques (bits
are not predictable if e.g. one is missing)
Does not support block of symbols: Huffman is designed to code single
characters only. Therefore at least one bit is required per character, e.g. a word of
8 characters requires at least an 8 bit code
Entropy Coding
Entropy, Definition
The entropy, H, of a discrete random variable X is a measure of the
amount of uncertainty associated with the value of X.
Information theory point of view:
X → information source
P(x) → probability that symbol x in X will occur
$H(X) = \sum_{x \in X} P(x)\,\log_2 \frac{1}{P(x)}$
Measure of information content (in bits)
A quantitative measure of the disorder of a system
It is impossible to compress the data such that the average
number of bits per symbol is less than the Shannon entropy
of the source (in a noiseless channel)
The Intuition Behind the Formula
P(x) ↓ → amount of uncertainty ↑ → H ↑
1/P(x) → I(x), the information content of x
log_2(1/P(x)) → brings it to the world of bits
Weighted average of the bits required for each possible value (multiply by P(x) and sum) → H
(Claude E. Shannon, 1916-2001)
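As a quick numerical check of the formula, a short Python sketch computing H for the earlier VLC example message (symbol probabilities 0.3, 0.3, 0.2, 0.1, 0.1):

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy H(X) = sum of P(x) * log2(1/P(x)), in bits per symbol."""
    return sum(p * log2(1 / p) for p in probabilities if p > 0)

if __name__ == "__main__":
    # Probabilities of the 5 symbols in the VLC example message (3, 3, 2, 1, 1 out of 10).
    probs = [0.3, 0.3, 0.2, 0.1, 0.1]
    h = entropy(probs)
    print(round(h, 3))        # ~2.171 bits/symbol
    print(round(10 * h, 1))   # ~21.7 bits: lower bound for the 10-symbol message
```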
Lossless Compression
Lempel-Ziv (LZ77)
Algorithm for compression of character sequences
Assumption: Sequences of characters are repeated
Idea: Replace a character sequence by a reference to an earlier occurrence
1. Define:
search buffer = (a portion of) the recently encoded data
look-ahead buffer = the data not yet encoded
2. Find the longest match between
the first characters of the look-ahead buffer
and a character sequence starting anywhere in the search buffer
3. Produce the output <offset, length, next_character>
offset + length = reference to the earlier occurrence
next_character = the first character following the match in the look-ahead buffer
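A minimal Python sketch of the encoder loop above; the buffer sizes, the greedy matching, and the test string are illustrative choices, not values from the slides:

```python
def lz77_encode(data, search_size=255, lookahead_size=15):
    """Greedy LZ77 sketch: emit (offset, length, next_char) triples,
    where offset is the distance back into the search buffer (0 = no match)."""
    i, out = 0, []
    while i < len(data):
        best_offset, best_length = 0, 0
        start = max(0, i - search_size)
        for j in range(start, i):                      # every start position in the search buffer
            length = 0
            while (length < lookahead_size
                   and i + length < len(data) - 1      # keep one character for next_char
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = i - j, length
        next_char = data[i + best_length]
        out.append((best_offset, best_length, next_char))
        i += best_length + 1
    return out

if __name__ == "__main__":
    print(lz77_encode("abracadabra"))
    # [(0, 0, 'a'), (0, 0, 'b'), (0, 0, 'r'), (3, 1, 'c'), (5, 1, 'd'), (7, 3, 'a')]
```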
Lossless Compression
Lempel-Ziv-Welch (LZW)
Drops the search buffer and keeps an explicit dictionary
Produces only output <index>
Used by Unix "compress", GIF, V.42bis, TIFF
Example: wabba␣wabba␣wabba␣wabba␣woo␣woo␣woo (␣ denotes a space)
Snapshot of progress at the 12th dictionary entry (table on the slide)
Encoder output sequence so far: 5 2 3 3 2 1
Lossless Compression
Lempel-Ziv-Welch (LZW)
Example: wabba␣wabba␣wabba␣wabba␣woo␣woo␣woo (␣ denotes a space)
Snapshot of progress at the end of the example above
Encoder output sequence: 5 2 3 3 2 1 6 8 10 12 9 11 7 16 5 4 4 11 21 23 4
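A compact LZW encoder sketch in Python; with the initial dictionary assumed below (space, 'a', 'b', 'o', 'w' indexed 1-5, ␣ written as a real space) it reproduces the output sequence above:

```python
def lzw_encode(data, initial_dictionary):
    """LZW encoder sketch: output a sequence of dictionary indices only."""
    dictionary = dict(initial_dictionary)      # pattern -> index
    next_index = max(dictionary.values()) + 1
    output, p = [], ""
    for c in data:
        if p + c in dictionary:
            p = p + c                          # keep growing the current pattern
        else:
            output.append(dictionary[p])       # emit index of the longest known pattern
            dictionary[p + c] = next_index     # add the new pattern to the dictionary
            next_index += 1
            p = c
    if p:
        output.append(dictionary[p])           # flush the last pattern
    return output

if __name__ == "__main__":
    # Assumed initial dictionary: space, then a, b, o, w, indexed 1-5.
    init = {" ": 1, "a": 2, "b": 3, "o": 4, "w": 5}
    msg = "wabba wabba wabba wabba woo woo woo"
    print(lzw_encode(msg, init))
    # [5, 2, 3, 3, 2, 1, 6, 8, 10, 12, 9, 11, 7, 16, 5, 4, 4, 11, 21, 23, 4]
```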
Lossless Compression
Arithmetic Coding
Encodes a block of symbols into a single number, a
fraction n where 0.0 ≤ n < 1.0.
Step 1: Divide the interval [0, 1) into subintervals based on the probabilities of
the symbols in the current context → the dividing model.
Step 2: Divide the interval corresponding to the current symbol into subintervals according to the dividing model of Step 1.
Step 3: Repeat Step 2 for all symbols in the block of symbols.
Step 4: Encode the block of symbols with a single number in the
final resulting range. Use the corresponding binary number in this
range with the smallest number of bits.
See the encoding and decoding examples in the following slides
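A minimal Python sketch of the interval-narrowing loop in Steps 1-3, using the probabilities of the decoding example two slides below; the function name and the fixed sub-interval order are assumptions for illustration:

```python
def arithmetic_interval(message, model):
    """Narrow [low, high) for each symbol according to the dividing model.
    model: list of (symbol, probability) pairs fixing the sub-interval order."""
    low, high = 0.0, 1.0
    for symbol in message:
        width = high - low
        cumulative = 0.0
        for sym, prob in model:
            if sym == symbol:
                high = low + width * (cumulative + prob)   # shrink to this symbol's
                low = low + width * cumulative             # sub-interval
                break
            cumulative += prob
    return low, high

if __name__ == "__main__":
    # Probabilities from the decoding example: A 60%, B 20%, C 10%, <space> 10%.
    model = [("A", 0.6), ("B", 0.2), ("C", 0.1), (" ", 0.1)]
    low, high = arithmetic_interval("AC ", model)
    print(low, high)   # ~[0.534, 0.54): any value in this range (e.g. 0.538) encodes "AC "
```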
Lossless Compression
Arithmetic Coding, Encoding
Example: SQUEEZE
Using FLC: 3 bits per symbol → 7×3 = 21 bits
P(‘E’) = 3/7
Prob. ‘S’ ‘Q’ ‘U’ ‘Z’: 1/7
Dividing Model
We can encode the word SQUEEZE with a
single number in the range [0.64769, 0.64772).
The binary number in this range with the
smallest number of bits is 0.101001011101,
which corresponds to 0.647705 decimal. The '0.'
prefix does not have to be transmitted because
every arithmetic coded message starts with this
prefix. So we only need to transmit the sequence
101001011101, which is only 12 bits.
Lossless Compression
Arithmetic Coding, Decoding
Input probabilities: P('A') = 60%, P('B') = 20%, P('C') = 10%, P('<space>') = 10%
Decoding the input value of 0.538
Dividing model from the input probabilities → 'A': [0, 0.6), 'B': [0.6, 0.8), 'C': [0.8, 0.9), '<space>': [0.9, 1.0)
The fraction 0.538 (the circular point in the slide figure) falls into the
sub-interval [0, 0.6) → the first decoded symbol is 'A'
The subregion containing the point is successively
subdivided in the same way as the dividing model.
Since 0.538 is within the interval [0.48, 0.54), the
second symbol of the message must have been 'C'.
Since 0.538 falls within the interval [0.534, 0.54), the
third symbol of the message must have been '<space>'.
The protocol in this example uses <space> as the
termination symbol, so this marks the end of the decoding process.
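A short Python sketch of this decoding procedure; rescaling the value into the chosen sub-interval is equivalent to subdividing the containing interval as on the slide (the function name and termination handling are illustrative):

```python
def arithmetic_decode(value, model, stop_symbol=" "):
    """Decode a fraction in [0, 1) by repeatedly finding which sub-interval
    of the dividing model contains it, then rescaling into that sub-interval."""
    decoded = []
    while True:
        cumulative = 0.0
        for symbol, prob in model:
            if cumulative <= value < cumulative + prob:
                decoded.append(symbol)
                # Rescale relative to the chosen sub-interval (same as subdividing it).
                value = (value - cumulative) / prob
                break
            cumulative += prob
        if decoded[-1] == stop_symbol:      # <space> terminates the message here
            return "".join(decoded)

if __name__ == "__main__":
    model = [("A", 0.6), ("B", 0.2), ("C", 0.1), (" ", 0.1)]
    print(repr(arithmetic_decode(0.538, model)))   # 'AC ' -> decodes to A, C, <space>
```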
Lossless Compression
Arithmetic Coding
Pros
Typically has a better compression ratio than
Huffman coding.
Cons
High computational complexity.
The patent situation had a crucial influence on decisions
about implementing arithmetic coding
(many patents have now expired).
Multimedia Systems
Entropy Coding
Thank You
Next Session: Color Space
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://www.dml.ir/