ELEC 515 Information Theory
Distortionless Source Coding

Source Coding
[Figure: source encoder with output alphabet Y = {y1, …, yJ} and codeword lengths l1, …, lK]

Typical Sequences
• Consider the following discrete memoryless binary source: p(1) = 1/4, p(0) = 3/4
• Sequences of 20 symbols
1. 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
2. 1,0,1,0,1,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1
3. 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

Tchebycheff Inequality

Weak Law of Large Numbers
• Sequence of N i.i.d. RVs
• Define a new RV, the sample average
• The average approaches the statistical mean

Asymptotic Equipartition Property
• N i.i.d. random variables X1, …, XN
  p(X1, X2, …, XN) = p(X1) p(X2) … p(XN)
• −(1/N) log p(X1, X2, …, XN) = −(1/N) Σ_{n=1}^{N} log p(Xn) → −E[log p(X)] = H(X) as N → ∞

Typical Sequences
• RV X where p(xk) = pk
• Consider a sequence x of length N where xk appears Npk times:
  p(x) = p1^{Np1} p2^{Np2} … pK^{NpK}
       = ∏_{k=1}^{K} pk^{Npk}
       = ∏_{k=1}^{K} (2^{log2 pk})^{Npk}
       = ∏_{k=1}^{K} 2^{Npk log2 pk}
       = 2^{N Σ_{k=1}^{K} pk log2 pk}
       = 2^{−NH(X)}

Summary
• The Tchebycheff inequality was used to prove the weak law of large numbers (WLLN) – the sample average approaches the statistical mean as N → ∞
• The WLLN was used to prove the AEP:
  −(1/N) Σ_{n=1}^{N} log p(Xn) → H(X) as N → ∞
• A typical sequence has probability p(x1, x2, …, xN) ≈ 2^{−NH(X)}
• There are about 2^{NH(X)} typical sequences of length N

Typical Sequences
[Figure: the set of all sequences of length N, with the set of typical sequences as a subset; a sequence outside this subset is nontypical or atypical]

Interpretation
• Although there are very many results that may be produced by a random process, the one actually produced is most probably from a set of outcomes that all have approximately the same chance of being the one actually realized.
• Although there are individual outcomes which may have a higher probability than outcomes in this set, the vast number of outcomes in the set almost guarantees that the outcome will come from the set.
• "Almost all events are almost equally surprising" – Cover and Thomas

Typical Sequences
• From the definition, the probability of occurrence of a typical sequence p(x) satisfies
  2^{−N[H(X)+δ]} ≤ p(x) ≤ 2^{−N[H(X)−δ]}

Example
• p(x1) = 1/4, p(x2) = 3/4
• H(X) = 0.811 bits
• N = 3
• p(x1,x1,x1) = 1/64
• p(x1,x1,x2) = p(x1,x2,x1) = p(x2,x1,x1) = 3/64
• p(x1,x2,x2) = p(x2,x2,x1) = p(x2,x1,x2) = 9/64
• p(x2,x2,x2) = 27/64

Example
• If δ = 0.2 the typical sequences are
  – (x1,x2,x2), (x2,x1,x2), (x2,x2,x1), i.e. (1,0,0), (0,1,0), (0,0,1), with total probability 0.422
• If δ = 0.4 the typical sequences are
  – (x1,x2,x2), (x2,x1,x2), (x2,x2,x1), (x2,x2,x2), i.e. (1,0,0), (0,1,0), (0,0,1), (0,0,0), with total probability 0.844

Typical Sequences
• Random variable X
• Alphabet size K
• Entropy H(X)
• Arbitrary number δ > 0
• For sequences x of blocklength N ≥ N0 and probability p(x), the typical set T_X(δ) and its complement together contain all sequences:
  |T_X(δ)| + |T_X^c(δ)| = K^N

Shannon-McMillan Theorem
• The essence of source coding or data compression is that as N → ∞, atypical sequences almost never appear as the output of the source.
• Therefore, one can focus on representing typical sequences with codewords and ignore atypical sequences.
• Since there are only 2^{NH(X)} typical sequences of length N, and they are approximately equiprobable, it takes about NH(X) bits to represent them.
• On average it takes H(X) bits per source output symbol.
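The typical-set bookkeeping above is easy to verify numerically. The following Python sketch (not from the slides; the function name and structure are illustrative) enumerates all K^N sequences of the N = 3 example and keeps those whose probability lies within the typicality bounds; it reproduces the δ = 0.2 result.

import itertools
import math

def typical_set(p, N, delta):
    # List the delta-typical length-N sequences of a memoryless source.
    # A sequence x is delta-typical when
    #   2^(-N(H+delta)) <= p(x) <= 2^(-N(H-delta)),
    # where H is the source entropy in bits.
    # p: dict mapping symbol -> probability.
    H = -sum(pk * math.log2(pk) for pk in p.values())
    lo, hi = 2 ** (-N * (H + delta)), 2 ** (-N * (H - delta))
    typical = []
    for seq in itertools.product(p, repeat=N):
        prob = math.prod(p[s] for s in seq)
        if lo <= prob <= hi:
            typical.append((seq, prob))
    return H, typical

# Slide example: p(x1) = 1/4, p(x2) = 3/4, N = 3, delta = 0.2
H, T = typical_set({'x1': 0.25, 'x2': 0.75}, N=3, delta=0.2)
print(f"H(X) = {H:.3f} bits")               # 0.811
print(len(T), sum(prob for _, prob in T))   # 3 sequences, total probability 0.422

Rerunning with delta=0.4 yields the four sequences and total probability 0.844 given on the slides.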
Variable Length Codes
[Figure: source encoder with output alphabet Y = {y1, …, yJ} and codeword lengths l1, …, lK]

Variable Length Codes
• Prefix code (also prefix-free or instantaneous): C1 = {0, 10, 110, 111}
• Uniquely decodable code (which is not prefix): C2 = {0, 01, 011, 0111}
• Non-singular code (which is not uniquely decodable): C3 = {0, 1, 00, 11}
• Singular code: C4 = {0, 10, 11, 10}

Instantaneous Codes
• Definition: A uniquely decodable code is said to be instantaneous if it is possible to decode each codeword in a sequence without reference to succeeding codewords.
• A necessary and sufficient condition for a code to be instantaneous is that no codeword is a prefix of any other codeword.

Average Codeword Length
L(C) = Σ_{k=1}^{K} p(xk) lk

Two Binary Prefix Codes
• Five source symbols: x1, x2, x3, x4, x5
• K = 5, J = 2
• c1 = 0, c2 = 10, c3 = 110, c4 = 1110, c5 = 1111 – codeword lengths 1, 2, 3, 4, 4
• c1 = 00, c2 = 01, c3 = 10, c4 = 110, c5 = 111 – codeword lengths 2, 2, 2, 3, 3

Kraft Inequality for Prefix Codes
A prefix code with codeword lengths l1, …, lK over a J-ary alphabet exists if and only if Σ_{k=1}^{K} J^{−lk} ≤ 1

Code Tree
[Figure: code tree]

Five Binary Codes
[Table: source symbols x1, …, x4 and their codewords under Codes A to E; the codeword entries were lost in extraction]

Ternary Code Example
• Ten source symbols: x1, x2, …, x9, x10
• K = 10, J = 3
• lk = 1,2,2,2,2,2,3,3,3,3
• lk = 1,2,2,2,2,2,3,3,3
• lk = 1,2,2,2,2,2,3,3,4,4

Variable Length Codes
[Figure: source encoder with output alphabet Y = {y1, …, yJ} and codeword lengths l1, …, lK]

Average Codeword Length Bound
L(C) ≥ H(X) / log_b J

Four Symbol Source
• p(x1) = 1/2, p(x3) = 1/4, p(x2) = p(x4) = 1/8
• H(X) = 1.75 bits
• Code 1: x1 → 0, x2 → 110, x3 → 10, x4 → 111, giving L(C) = 1.75 bits
• Code 2: x1 → 00, x2 → 01, x3 → 10, x4 → 11, giving L(C) = 2 bits

Code Efficiency
ζ = H(X) / (L(C) log_b J)
• First code: ζ = 1.75/1.75 = 100%
• Second code: ζ = 1.75/2.0 = 87.5%

Compact Codes
• A code C is called compact for a source X if its average codeword length L(C) is less than or equal to the average length of every other uniquely decodable code for the same source and code alphabet size J.

Upper and Lower Bounds
H(X)/log_b J ≤ L(C) < H(X)/log_b J + 1

The Shannon Algorithm
• Order the symbols from largest to smallest probability
• Choose the codeword lengths according to lk = ⌈−log_J p(xk)⌉
• Construct the codewords from the J-ary expansion of the cumulative probability
  Pk = Σ_{i=1}^{k−1} p(xi)

Example
• K = 10, J = 2
• p(x1) = p(x2) = 1/4
• p(x3) = p(x4) = 1/8
• p(x5) = p(x6) = 1/16
• p(x7) = p(x8) = p(x9) = p(x10) = 1/32
[Table: cumulative probabilities Pk and the resulting codewords; the entries were lost in extraction]

Shannon Algorithm
• p(x1) = .4, p(x2) = .3, p(x3) = .2, p(x4) = .1
• H(X) = 1.85 bits
• Shannon code: x1 → 00, x2 → 01, x3 → 101, x4 → 1110, giving L(C) = 2.4 bits and ζ = 77.1%
• Alternate code: x1 → 0, x2 → 10, x3 → 110, x4 → 111, giving L(C) = 1.9 bits and ζ = 97.4%

Shannon's Noiseless Coding Theorem

Robert M. Fano (1917-2016)

The Fano Algorithm
• Arrange the symbols in order of decreasing probability
• Divide the symbols into J approximately equally probable groups
• Each group receives one of the J code symbols as the first symbol
• This division process is repeated within the groups as many times as possible

David Huffman (1925-1999)
• "It was the most singular moment in my life. There was the absolute lightning of sudden realization." – David Huffman
• "Is that all there is to it!" – Robert Fano

The Binary Huffman Algorithm
1. Arrange the symbols in order of decreasing probability.
2. Assign a 1 to the last digit of the Kth codeword cK and a 0 to the last digit of the (K−1)th codeword cK−1. Note that this assignment is arbitrary.
3. Form a new source X′ with x′k = xk for k = 1, …, K−2, and x′K−1 = xK−1 ∪ xK with p(x′K−1) = p(xK−1) + p(xK).
4. Rearrange the new set of probabilities, and set K = K−1.
5. Repeat Steps 2 to 4 until all symbols have been combined. To obtain the codewords, trace back to the original symbols.
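As a concrete companion to the algorithm, here is a minimal Python sketch of binary Huffman coding (not from the slides; it merges groups with a heap rather than re-sorting a list, and the 0/1 branch assignment is the arbitrary choice noted in Step 2).

import heapq

def huffman_code(probs):
    # Binary Huffman algorithm: repeatedly merge the two least probable
    # groups; one merged group gets bit 0 prepended, the other bit 1.
    # probs: dict symbol -> probability; returns dict symbol -> codeword.
    heap = [(p, i, {sym: ''}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)  # tiebreak counter so the dicts are never compared
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)  # least probable group
        p1, _, group1 = heapq.heappop(heap)  # next least probable group
        merged = {s: '0' + c for s, c in group0.items()}
        merged.update({s: '1' + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

probs = {'x1': .4, 'x2': .3, 'x3': .2, 'x4': .1}
code = huffman_code(probs)
L = sum(p * len(code[s]) for s, p in probs.items())
print(code, f"L(C) = {L:.1f} bits")  # codeword lengths 1, 2, 3, 3 -> 1.9 bits

For the four-symbol source above this reproduces the lengths of the "alternate code" from the Shannon algorithm slide, with L(C) = 1.9 bits.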
Five Symbol Source
• p(x1) = .35, p(x2) = .22, p(x3) = .18, p(x4) = .15, p(x5) = .10
• H(X) = 2.2 bits
[Figure: Huffman code construction for this source]

Huffman Code for the English Alphabet
[Figure: Huffman code for the letters of the English alphabet]

Six Symbol Source
• p(x1) = .4, p(x2) = .3, p(x3) = .1, p(x4) = .1, p(x5) = .06, p(x6) = .04
• H(X) = 2.1435 bits

Second Five Symbol Source
• p(x1) = .4, p(x2) = .2, p(x3) = .2, p(x4) = .1, p(x5) = .1
• H(X) = 2.1219 bits

Two Huffman Codes
Symbol  C1    C2
x1      0     11
x2      10    01
x3      111   00
x4      1101  101
x5      1100  100

Second Five Symbol Source
• p(x1) = .4, p(x2) = .2, p(x3) = .2, p(x4) = .1, p(x5) = .1
• H(X) = 2.1219 bits, L(C) = 2.2 bits for both codes
• Variance of code C1: 0.4(1−2.2)² + 0.2(2−2.2)² + 0.2(3−2.2)² + 0.2(4−2.2)² = 1.36
• Variance of code C2: 0.8(2−2.2)² + 0.2(3−2.2)² = 0.16
• Which code is preferable? (A short numerical check follows.)
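A small Python check of the two codes (illustrative, not from the slides). Both codes have the same average length, but the lower-variance code delivers bits at a steadier rate, which eases output buffering.

def length_stats(lengths, probs):
    # Average codeword length and its variance for a prefix code.
    L = sum(p * l for p, l in zip(probs, lengths))
    var = sum(p * (l - L) ** 2 for p, l in zip(probs, lengths))
    return L, var

probs = [.4, .2, .2, .1, .1]
print(length_stats([1, 2, 3, 4, 4], probs))  # C1: L = 2.2, variance = 1.36
print(length_stats([2, 2, 2, 3, 3], probs))  # C2: L = 2.2, variance = 0.16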
Nonbinary Codes
• The Huffman algorithm for nonbinary codes (J > 2) follows the same procedure as for binary codes except that J symbols are combined at each stage.
• This requires that the number of symbols in the source X be K′ = J + c(J−1) for some nonnegative integer c, with K′ ≥ K.

Nonbinary Example
• J = 3, K = 6
• Require K′ = J + c(J−1) → c = 2, so K′ = 7
• Add an extra symbol x7 with p(x7) = 0
• p(x1) = 1/3, p(x2) = 1/6, p(x3) = 1/6, p(x4) = 1/9, p(x5) = 1/9, p(x6) = 1/9, p(x7) = 0
• H(X) = 1.54 trits

Codes for Different Output Alphabets
• K = 13
• p(x1) = 1/4, p(x2) = 1/4, p(x3) = p(x4) = p(x5) = p(x6) = p(x7) = p(x8) = p(x9) = 1/16, p(x10) = p(x11) = p(x12) = p(x13) = 1/64
• J = 2 to 13
[Table of the codewords and average lengths L(C) for each J, and a plot of code efficiency ζ versus J; the entries were lost in extraction]

The Binary and Quaternary Codes
Symbol  J=2     J=4
x1      00      0
x2      01      1
x3      1000    20
x4      1001    21
x5      1010    22
x6      1011    23
x7      1100    30
x8      1101    31
x9      1110    32
x10     111100  330
x11     111101  331
x12     111110  332
x13     111111  333

Huffman Codes
• Symbol probabilities must be known a priori
• The redundancy of the code, L(C) − H(X), is typically nonzero
• Error propagation can occur
• Codewords have variable length

Fixed Length Source Compaction Codes
• All codewords have the same length: l1 = l2 = … = lK = L

Fixed Length Source Compaction Codes
• If J^L < K^N we cannot uniquely encode all sourcewords with length L codewords
• Two questions:
1. How small can J^L be such that performance is acceptable?
2. Which sourcewords should be encoded uniquely with length L codewords?

The number of typical sequences satisfies |T_X(δ)| < b^{N[H(X)+δ]}, so encoding all typical sequences with length L codewords requires that
J^L ≥ b^{N[H(X)+δ]}

• Although the set of atypical sequences may be large, the Shannon-McMillan Theorem ensures that its total probability can be made arbitrarily small.
• Thus it is possible to encode sourcewords with an arbitrarily small probability of error Pe provided that
  – L log_b J > NH(X)
  – N is sufficiently large

Example
• K = J = 2
• p(x1) = 0.1, p(x2) = 0.9, H(X) = 0.469 bits
• Choose N = 4, L = 3, so that R = L/N = 3/4 > H(X)
• Partition the 16 sourcewords into 7 typical sequences and 9 atypical sequences

The Code
Typical Sequence  Codeword
x2x2x2x2          000
x1x2x2x2          100
x2x1x2x2          010
x2x2x1x2          001
x2x2x2x1          110
x1x1x2x2          101
x1x2x1x2          011

The Code
Atypical Sequence  Codeword
x1x2x2x1           111 0000
x2x1x1x2           111 1000
x2x1x2x1           111 0100
x2x2x1x1           111 0010
x1x1x1x2           111 0001
x1x1x2x1           111 1100
x1x2x1x1           111 1010
x2x1x1x1           111 1001
x1x1x1x1           111 0110

Code Rate
• The typical sequences have total probability .9639 and receive 3-bit codewords; the atypical sequences have total probability .0361 and receive 7-bit codewords, so the actual code rate is
  R = (.9639 × 3 + .0361 × 7)/4 = 3/4 + .0361 = .7861

Fixed Length Source Compaction Codes
• If R > H(X), then Pe → 0 as N → ∞
• If R < H(X), then Pe → 1 as N → ∞

Variable to Fixed Length Codes
• A variable to fixed length encoder maps M sourcewords (M ≤ J^L) of lengths m1, m2, …, mM to codewords of fixed length L

Variable to Fixed Length Codes
• Two questions:
1. What is the best mapping from sourcewords to codewords?
2. How to ensure unique encodability?

Average Bit Rate
ABR = (average codeword length)/(average sourceword length) = L / L(S)
L(S) = E[sourceword length] = Σ_{i=1}^{M} p(si) mi
M – number of sourcewords
si – sourceword i
mi – length of sourceword i

Variable to Fixed Length Codes
• Design criterion: minimize the Average Bit Rate ABR = L / L(S)
• ABR ≥ H(X) (vs. L(C) ≥ H(X) for fixed to variable length codes)
• L(S) should be as large as possible so that the ABR is close to H(X)

Variable to Fixed Length Codes
• Fixed to variable length codes: ζ = H(X) / L(C)
• Variable to fixed length codes: ζ = H(X) / ABR

Binary Tunstall Code
• K = 3, L = 3
• Let x1 = a, x2 = b and x3 = c
• Unused codeword is 111

Binary Tunstall Code Construction
• Source X with K symbols
• Choose a codeword length L where 2^L > K
1. Form a tree with a root and K branches labelled with the symbols
2. If the number of leaves is greater than 2^L − (K−1), go to Step 4
3. Find the leaf with the highest probability and extend it to have K branches, go to Step 2
4. Assign codewords to the leaves

K = 3, L = 3, p(a) = .7, p(b) = .2, p(c) = .1
[Figure: Tunstall parse tree for this source]
ζ = H(X)/ABR = 84.7%

Tunstall Code for a Binary Source
• N = 3, K = 2, J = 2, p(x1) = 0.7, p(x2) = 0.3
• J^N = 8

Seven sourcewords  Eight sourcewords  Codewords
x1x1x1x1x1         x1x1x1x1x1         000
x1x1x1x1x2         x1x1x1x1x2         001
x1x1x1x2           x1x1x1x2           010
x1x1x2             x1x1x2             011
x1x2               x1x2x1             100
x2x1               x1x2x2             101
x2x2               x2x1               110
                   x2x2               111

Huffman Code for a Binary Source
• N = 3, K = 2, p(x1) = 0.7, p(x2) = 0.3
• Eight sourcewords:
  A = x1x1x1, p(A) = .343, codeword 00
  B = x1x1x2, p(B) = .147, codeword 11
  C = x1x2x1, p(C) = .147, codeword 010
  D = x2x1x1, p(D) = .147, codeword 011
  E = x2x2x1, p(E) = .063, codeword 1000
  F = x2x1x2, p(F) = .063, codeword 1001
  G = x1x2x2, p(G) = .063, codeword 1010
  H = x2x2x2, p(H) = .027, codeword 1011

Code Comparison
• H(X) = .8813
• Tunstall Code (7 codewords): ABR = .9762, ζ = 90.3%
• Tunstall Code (8 codewords): ABR = .9138, ζ = 96.4%
• Huffman Code: L(C) = .9087, ζ = 97.0%
• (A sketch of the Tunstall construction appears after the error propagation example below.)

Error Propagation
• Received Huffman codeword sequence 00 11 00 11 00 11 … decodes to A B A B A B
• The same sequence with one bit error, 011 1001 1001 1 …, decodes to D F F …
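The Tunstall comparison above can be reproduced with a short construction. This Python sketch is illustrative (names are mine, and ties between equally probable leaves may be broken differently than on the slides); it grows the parse tree by always expanding the most probable leaf, following the construction steps given earlier.

import heapq

def tunstall(probs, L):
    # Grow the parse tree by expanding the most probable leaf while a full
    # K-way expansion still fits within 2**L codewords (each expansion adds
    # K - 1 leaves). probs: dict symbol -> probability.
    K = len(probs)
    leaves = [(-p, sym) for sym, p in probs.items()]  # max-heap via negation
    heapq.heapify(leaves)
    while len(leaves) + (K - 1) <= 2 ** L:
        p, word = heapq.heappop(leaves)  # most probable leaf
        for sym, ps in probs.items():
            heapq.heappush(leaves, (p * ps, word + sym))
    return [(word, -p) for p, word in sorted(leaves)]

words = tunstall({'1': 0.7, '0': 0.3}, L=3)  # the slides' p(x1)=.7, p(x2)=.3
avg_len = sum(p * len(w) for w, p in words)
print(len(words), f"ABR = {3 / avg_len:.4f}")  # 8 sourcewords, ABR = 0.9138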
Huffman Coding
• The length of Huffman codewords has to be an integer number of symbols, while the self-information of the source symbols is almost always a non-integer.
• Thus the theoretical minimum message compression cannot always be achieved.
• For a binary source with p(x1) = .1 and p(x2) = .9
  – An x1 symbol should be encoded to 3.32 bits and an x2 symbol to .152 bits
  – H(X) = .469, so the optimal average codeword length is .469 bits

Improving Huffman Coding
• One way to overcome the redundancy limitation is to encode blocks of several symbols. In this way the per-symbol inefficiency is spread over an entire block.
  – N = 1: ζ = 46.9%; N = 2: ζ = 72.7%
• However, using blocks is difficult to implement, as there is a block for every possible combination of symbols, so the number of blocks increases exponentially with their length.

Peter Elias (1923-2001)

Arithmetic Coding
• Arithmetic coding bypasses the idea of replacing an input symbol (or group of symbols) with a specific codeword.
• Instead, a stream of input symbols is replaced with a single floating-point number in [0,1).
• Useful when dealing with sources with small alphabets, such as binary sources, and alphabets with highly skewed probabilities.

Arithmetic Coding Applications
• JPEG, MPEG-1, MPEG-2 – Huffman and arithmetic coding
• JPEG2000, MPEG-4 – arithmetic coding only
• ZIP – prediction by partial matching (PPMd) algorithm
• H.263, H.264

Arithmetic Coding
• The output of an arithmetic encoder is a stream of bits
• However, we can think of it as having a prefix 0, so that the stream represents a fractional binary number between 0 and 1:
  01101010 → 0.01101010
• To explain the algorithm, numbers will be shown as decimal, but they are always binary in practice

Example 1
• Encode the string bccb from the alphabet {a,b,c}
• p(a) = p(b) = p(c) = 1/3
• The arithmetic coder maintains two numbers, low and high, which represent a subinterval [low, high) of the range [0,1)
• Initially low = 0 and high = 1
• The range between low and high is divided between the symbols of the source alphabet according to their probabilities, so initially a: [0, 0.3333), b: [0.3333, 0.6667), c: [0.6667, 1)
• Encode b: low = 0.3333, high = 0.6667
• Encode c: the current range is subdivided in the same proportions, giving low = 0.5556, high = 0.6667
• Encode c: low = 0.6296, high = 0.6667
• Encode b: low = 0.6420, high = 0.6543

Arithmetic Coding Algorithm
Set low to 0.0
Set high to 1.0
While there are still input symbols Do
  get next input symbol
  range = high − low
  high = low + range × symbol_high_range
  low = low + range × symbol_low_range
End While
output number between high and low

Arithmetic Coding Example 2
• p(x1) = 0.5, p(x2) = 0.3, p(x3) = 0.2
• Symbol ranges: 0 ≤ x1 < .5, .5 ≤ x2 < .8, .8 ≤ x3 < 1
• low = 0.0, high = 1.0
• Symbol sequence x1x2x3x2…
• Iteration 1, x1: range = 1.0 − 0.0 = 1.0
  high = 0.0 + 1.0 × 0.5 = 0.5
  low = 0.0 + 1.0 × 0.0 = 0.0
• Iteration 2, x2: range = 0.5 − 0.0 = 0.5
  high = 0.0 + 0.5 × 0.8 = 0.40
  low = 0.0 + 0.5 × 0.5 = 0.25
• Iteration 3, x3: range = 0.4 − 0.25 = 0.15
  high = 0.25 + 0.15 × 1.0 = 0.40
  low = 0.25 + 0.15 × 0.8 = 0.37
• Iteration 4, x2: range = 0.4 − 0.37 = 0.03
  high = 0.37 + 0.03 × 0.8 = 0.394
  low = 0.37 + 0.03 × 0.5 = 0.385
• Thus 0.385 ≤ x1x2x3x2 < 0.394
  0.385 = 0.0110001…
  0.394 = 0.0110010…
• The first 5 bits of the codeword are 01100
• If there are no additional symbols to be encoded, the codeword is 011001
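The encoder loop above translates directly into code. This Python sketch (illustrative, not the slides' implementation) uses exact fractions instead of floats so that the Example 2 interval comes out exactly; a practical coder would instead use the finite-precision renormalization discussed later.

from fractions import Fraction

def arith_encode(symbols, ranges):
    # The slides' encoder loop: shrink [low, high) by each symbol's
    # cumulative-probability subinterval.
    # ranges: dict symbol -> (cumulative low, cumulative high).
    low, high = Fraction(0), Fraction(1)
    for s in symbols:
        rng = high - low
        sym_low, sym_high = ranges[s]
        high = low + rng * sym_high
        low = low + rng * sym_low
    return low, high

# Example 2: p(x1) = 0.5, p(x2) = 0.3, p(x3) = 0.2
F = Fraction
ranges = {'x1': (F(0), F(1, 2)), 'x2': (F(1, 2), F(4, 5)), 'x3': (F(4, 5), F(1))}
low, high = arith_encode(['x1', 'x2', 'x3', 'x2'], ranges)
print(float(low), float(high))  # 0.385 0.394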
Arithmetic Coding Example 3
• Suppose that we want to encode the message BILL GATES

Character  Probability  Range
SPACE      1/10         0.00 ≤ r < 0.10
A          1/10         0.10 ≤ r < 0.20
B          1/10         0.20 ≤ r < 0.30
E          1/10         0.30 ≤ r < 0.40
G          1/10         0.40 ≤ r < 0.50
I          1/10         0.50 ≤ r < 0.60
L          2/10         0.60 ≤ r < 0.80
S          1/10         0.80 ≤ r < 0.90
T          1/10         0.90 ≤ r < 1.00

[Figure: the interval [0,1) repeatedly subdivided as B, I, L, L, SPACE, G, A, T, E, S are encoded, narrowing to [0.2572167752, 0.2572167756)]

Arithmetic Coding Example 3
New Symbol  Low Value     High Value
(start)     0.0           1.0
B           0.2           0.3
I           0.25          0.26
L           0.256         0.258
L           0.2572        0.2576
SPACE       0.25720       0.25724
G           0.257216      0.257220
A           0.2572164     0.2572168
T           0.25721676    0.2572168
E           0.257216772   0.257216776
S           0.2572167752  0.2572167756

Binary Codeword
• 0.2572167752 in binary is 0.01000001110110001111010101100101
• 0.2572167756 in binary is 0.01000001110110001111010101100111
• The transmitted binary codeword is then 0100000111011000111101010110011
• 31 bits long

Decoding Algorithm
get encoded number
Do
  find symbol whose range contains the encoded number
  output the symbol
  subtract symbol_low_range from the encoded number
  divide by the probability of the output symbol
Until no more symbols

Decoding BILL GATES
Encoded Number  Output Symbol  Low  High  Probability
0.2572167752    B              0.2  0.3   0.1
0.572167752     I              0.5  0.6   0.1
0.72167752      L              0.6  0.8   0.2
0.6083876       L              0.6  0.8   0.2
0.041938        SPACE          0.0  0.1   0.1
0.41938         G              0.4  0.5   0.1
0.1938          A              0.1  0.2   0.1
0.938           T              0.9  1.0   0.1
0.38            E              0.3  0.4   0.1
0.8             S              0.8  0.9   0.1
0.0
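The decoding loop can be sketched the same way. This illustrative Python version (names and structure are mine) recovers BILL GATES from the value in the table above; since hi − lo equals the symbol probability, the rescaling step matches "divide by the probability of the output symbol".

from fractions import Fraction

def arith_decode(value, ranges, n):
    # The slides' decoding loop: locate the symbol whose range contains the
    # value, subtract its low bound, and divide by its probability (hi - lo).
    out = []
    for _ in range(n):
        for sym, (lo, hi) in ranges.items():
            if lo <= value < hi:
                out.append(sym)
                value = (value - lo) / (hi - lo)
                break
    return out

F = Fraction
ranges = {' ': (F(0), F(1, 10)), 'A': (F(1, 10), F(2, 10)),
          'B': (F(2, 10), F(3, 10)), 'E': (F(3, 10), F(4, 10)),
          'G': (F(4, 10), F(5, 10)), 'I': (F(5, 10), F(6, 10)),
          'L': (F(6, 10), F(8, 10)), 'S': (F(8, 10), F(9, 10)),
          'T': (F(9, 10), F(1))}
print(''.join(arith_decode(F('0.2572167752'), ranges, 10)))  # BILL GATES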
Finite Precision
Symbol  Probability (fraction)  Interval (8-bit precision), fraction  Interval (8-bit precision), binary  Range in binary
a       1/3                     [0, 85/256)                           [0.00000000, 0.01010101)            00000000 to 01010100
b       1/3                     [85/256, 171/256)                     [0.01010101, 0.10101011)            01010101 to 10101010
c       1/3                     [171/256, 1)                          [0.10101011, 1.00000000)            10101011 to 11111111

Renormalization
Symbol  Probability (fraction)  Range                 Digits that can be output  Range after renormalization
a       1/3                     00000000 to 01010100  0                          00000000 to 10101001
b       1/3                     01010101 to 10101010  none                       01010101 to 10101010
c       1/3                     10101011 to 11111111  1                          01010110 to 11111111

Terminating
Symbol  Probability (fraction)  Interval (8-bit precision), fraction  Interval (8-bit precision), binary  Range in binary
a       1/3                     [0, 85/256)                           [0.00000000, 0.01010101)            00000000 to 01010100
b       1/3                     [85/256, 170/256)                     [0.01010101, 0.10101010)            01010101 to 10101001
c       1/3                     [170/256, 255/256)                    [0.10101010, 0.11111111)            10101010 to 11111110
term    1/256                   [255/256, 1)                          [0.11111111, 1.00000000)            11111111

Huffman vs Arithmetic Codes
• X = {a,b,c,d}
• p(a) = .5, p(b) = .25, p(c) = .125, p(d) = .125
• Huffman code: a → 0, b → 10, c → 110, d → 111
• Arithmetic code ranges: a [0, .5), b [.5, .75), c [.75, .875), d [.875, 1)

Huffman vs Arithmetic Codes
• X = {a,b,c,d}
• p(a) = .7, p(b) = .12, p(c) = .10, p(d) = .08
• Huffman code: a → 0, b → 10, c → 110, d → 111
• Arithmetic code ranges: a [0, .7), b [.7, .82), c [.82, .92), d [.92, 1)

Robustness of Huffman Coding
pi = p(xi) (actual), p̂i = pi + εi (estimated)
Σ_{i=1}^{K} pi = 1 and Σ_{i=1}^{K} p̂i = 1, therefore Σ_{i=1}^{K} εi = 0

Robustness of Huffman Coding
L(C) = Σ_{i=1}^{K} pi li,  L(Ĉ) = Σ_{i=1}^{K} p̂i l̂i
ΔL = L(Ĉ) − L(C) = Σ_{i=1}^{K} p̂i l̂i − Σ_{i=1}^{K} pi li = Σ_{i=1}^{K} pi (l̂i − li) + Σ_{i=1}^{K} εi l̂i

Robustness of Huffman Coding
ΔL = L(Ĉ) − L(C) ≈ Σ_{i=1}^{K} εi l̂i
• Let the variance of ε be σ²; then in the worst case
  (ΔL)² ≈ [K Σ_{i=1}^{K} l̂i² − (Σ_{i=1}^{K} l̂i)²] σ²

Gadsby by Ernest Vincent Wright
If youth, throughout all history, had had a champion to stand up for it; to show a doubting world that a child can think; and, possibly, do it practically; you wouldn't constantly run across folks today who claim that "a child don't know anything." A child's brain starts functioning at birth; and has, amongst its many infant convolutions, thousands of dormant atoms, into which God has put a mystic possibility for noticing an adult's act, and figuring out its purport. Up to about its primary school days a child thinks, naturally, only of play. But many a form of play contains disciplinary factors. "You can't do this," or "that puts you out," shows a child that it must think, practically or fail. Now, if, throughout childhood, a brain has no opposition, it is plain that it will attain a position of "status quo," as with our ordinary animals. Man knows not why a cow, dog or lion was not born with a brain on a par with ours; why such animals cannot add, subtract, or obtain from books and schooling, that paramount position which Man holds today.
(This novel is a lipogram, written entirely without the letter "e", so its symbol statistics differ sharply from ordinary English.)

Lossless Compression Techniques
1. Model and code: the source is modelled as a random process; the probabilities (or statistics) are given or acquired.
2. Dictionary-based: there is no explicit model and no explicit statistics gathering; instead, a codebook (or dictionary) is used to map sourcewords into codewords.

Model and Code
• Huffman code
• Tunstall code
• Fano code
• Shannon code
• Arithmetic code

Dictionary-based Techniques
• Lempel-Ziv
  – LZ77 – sliding window
  – LZ78 – explicit dictionary
• Adaptive Huffman coding
• Due to patents, LZ77 and LZ78 led to many variants:
  LZ77 variants: LZR, LZSS, LZH, DEFLATE
  LZ78 variants: LZW, LZC, LZT, LZMW, LZJ, LZFG
• Zip methods use LZH and LZR among other techniques
• UNIX compress uses LZC (a variant of LZW)

Lempel-Ziv Compression
• Source symbol sequences are replaced by codewords that are dynamically determined.
• The code table is encoded into the compressed data so it can be reconstructed during decoding.

Lempel-Ziv Example
[Figures: step-by-step dictionary construction and the resulting Lempel-Ziv codeword; details lost in extraction]

Compression Comparison
Compression as a percentage of the original file size:
File Type    UNIX Compact (Adaptive Huffman)  UNIX Compress (Lempel-Ziv-Welch)
ASCII File   66%                              44%
Speech File  65%                              64%
Image File   94%                              88%
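To make the dictionary-based idea concrete, here is a minimal LZ78-style sketch in Python (illustrative, not from the slides): the encoder builds an explicit dictionary of phrases on the fly and emits (index, symbol) pairs, and since the decoder can rebuild the same dictionary from those pairs, no separate table needs to be transmitted.

def lz78_encode(data):
    # Emit (dictionary index, next symbol) pairs; index 0 means "no prefix".
    dictionary = {}        # phrase -> index
    output = []
    phrase = ''
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch   # keep extending the longest known phrase
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ''
    if phrase:             # flush a trailing match
        output.append((dictionary[phrase], ''))
    return output

print(lz78_encode('abababcbababa'))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'c'), (2, 'a'), (5, 'b'), (1, '')]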