12: Transform Coding
CSCI 6990 Data Compression
UNIVERSITY of NEW ORLEANS
DEPARTMENT OF COMPUTER SCIENCE
CSCI 6990.002: Data Compression
12: Transform Coding
Vassil Roussev
<vassil @ cs.uno.edu>
Motivating Example
[Figure: sample data plotted against the line y = 2.5x]
Motivating Example: Rotation
Consider the (reversible) rotation:
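$$\theta = Ax$$
$$\theta = \begin{bmatrix}\theta_0\\ \theta_1\end{bmatrix},\quad x = \begin{bmatrix}x_0\\ x_1\end{bmatrix},\quad A = \begin{bmatrix}\cos\varphi & \sin\varphi\\ -\sin\varphi & \cos\varphi\end{bmatrix},\quad \varphi = \arctan 2.5$$
$$A = \begin{bmatrix}0.37139068 & 0.92847669\\ -0.92847669 & 0.37139068\end{bmatrix}$$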
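As a rough illustration of the rotation, the sketch below builds A from φ = arctan 2.5 and applies it to a few pairs. The sample pairs and the helper name `rotate` are assumptions made for this example; they are not taken from the slides.

```python
import math

# Rotation angle from the slide: phi = arctan(2.5)
phi = math.atan(2.5)
A = [[math.cos(phi), math.sin(phi)],
     [-math.sin(phi), math.cos(phi)]]

def rotate(pair):
    """Apply theta = A x to one (x0, x1) pair."""
    x0, x1 = pair
    return (A[0][0] * x0 + A[0][1] * x1,
            A[1][0] * x0 + A[1][1] * x1)

# Hypothetical sample pairs lying close to the line y = 2.5x
samples = [(65, 170), (75, 188), (60, 150), (70, 170), (56, 130)]
transformed = [rotate(p) for p in samples]

for (x0, x1), (t0, t1) in zip(samples, transformed):
    print(f"({x0}, {x1}) -> ({t0:.2f}, {t1:.2f})")
```

For points close to the line, the second transformed coordinate stays small compared with the first, which is what makes the later compression step workable.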
Motivating Example: Transformed Sequence
[Figure: transformed sequence]
Motivating Example: Compression Step
Throw away the second coordinate of each transformed pair …
With fixed-length coding, that is a 50% reduction!
Motivating Example: Reconstructed Sequence
[Figure: original vs. reconstructed sequence]
Motivating Example: Error Analysis
$\{\hat{x}_n\}$ ≡ reconstructed sequence
$$\hat{\theta}_i = \begin{cases}\theta_i & i = 0, 2, 4, \ldots\\ 0 & \text{otherwise}\end{cases}$$
$$\sum_{i=0}^{N-1} (x_i - \hat{x}_i)^2 = \sum_{i=0}^{N-1} (\theta_i - \hat{\theta}_i)^2$$
Error depends on the magnitude of the members of {θn} that are set to zero
- If the magnitude is small, so is the error
- I.e., most of the information is in the first element of each pair
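The error identity can be checked numerically. The following is a minimal sketch, assuming the rotation with φ = arctan 2.5 from the motivating example and a handful of made-up input pairs; `forward` and `inverse` are illustrative helper names.

```python
import math

phi = math.atan(2.5)
c, s = math.cos(phi), math.sin(phi)

def forward(x0, x1):
    # theta = A x, with A = [[c, s], [-s, c]]
    return c * x0 + s * x1, -s * x0 + c * x1

def inverse(t0, t1):
    # x = A^T theta (A is orthonormal, so its inverse is its transpose)
    return c * t0 - s * t1, s * t0 + c * t1

pairs = [(65, 170), (75, 188), (60, 150), (70, 170), (56, 130)]  # hypothetical

err_x = 0.0      # sum of (x_i - xhat_i)^2
err_theta = 0.0  # sum of (theta_i - thetahat_i)^2
for x0, x1 in pairs:
    t0, t1 = forward(x0, x1)
    xh0, xh1 = inverse(t0, 0.0)   # compression step: second coordinate dropped
    err_x += (x0 - xh0) ** 2 + (x1 - xh1) ** 2
    err_theta += t1 ** 2          # theta_0 is kept exactly, theta_1 becomes 0

print(err_x, err_theta)
```

The two sums agree up to floating-point rounding, so the reconstruction error is exactly the energy of the discarded coefficients.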
Motivating Example: Statistical View
Result:
- Maximum compaction is achieved when the transform decorrelates the sequence
- I.e., the sample-to-sample correlation is zero
Principal Component Method
Transform Coding
Transformation
- Divide the original sequence {xn} into blocks of size N
- Map each block into a transform sequence {θn}, using a reversible mapping
Quantization, based on
- Desired average bit rate
- Statistical properties of the transformed sequence
- Distortion
- Different techniques may be used for different subsequences
Entropy coding
- Fixed-rate, Huffman, AC, RLE+AC, …
The Transform
Linear forward transform:
$$\theta_n = \sum_{i=0}^{N-1} x_i\, a_{n,i}$$
The characteristics of each element of {θn} depend on its position
- E.g., odd vs. even elements in the motivating example
- This may not be true of {xn}
Design
- The variance of the transform sequence determines the coding scheme
- N is domain-specific and is based on practical considerations
Reconstruction
$$x_n = \sum_{i=0}^{N-1} \theta_i\, b_{n,i}$$
The Transform (2)
$$[A]_{i,j} = a_{i,j}, \qquad [B]_{i,j} = b_{i,j}$$
$$\theta = Ax, \qquad x = B\theta, \qquad AB = BA = I$$
2D transform
$$\Theta_{k,l} = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} X_{i,j}\, a_{i,j,k,l}$$
Separable 2D transforms
$$\Theta_{k,l} = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} a_{k,i}\, X_{i,j}\, a_{l,j}$$
The Transform (3)
$$\Theta = AXA^T$$
$$X = B\Theta B^T$$
Orthonormal transforms
$$B = A^{-1} = A^T$$
$$X = A^T\Theta A$$
- Energy preservation property:
$$\sum_{i=0}^{N-1}\theta_i^2 = \theta^T\theta = (Ax)^T Ax = x^T A^T A x = x^T x = \sum_{i=0}^{N-1} x_i^2$$
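The separable 2D transform and the energy preservation property can be tried on a small block. This is a sketch under assumptions: the 2x2 orthonormal matrix and the input block are chosen only for illustration, and `matmul` and `transpose` are hand-written helpers rather than library calls.

```python
import math

def matmul(P, Q):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

r = 1 / math.sqrt(2)
A = [[r, r], [r, -r]]          # orthonormal: A^T A = I
X = [[3.0, 1.0], [2.0, 4.0]]   # hypothetical 2x2 block

Theta = matmul(matmul(A, X), transpose(A))       # forward: Theta = A X A^T
X_rec = matmul(matmul(transpose(A), Theta), A)   # inverse: X = A^T Theta A

energy_x = sum(v * v for row in X for v in row)
energy_theta = sum(v * v for row in Theta for v in row)
print(Theta, energy_x, energy_theta)
```

Both energies come out equal, and `X_rec` matches `X` up to rounding, as expected for an orthonormal A.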
Energy Compaction
Transform coding gain
$$G_{TC} = \frac{\frac{1}{N}\sum_{i=0}^{N-1}\sigma_i^2}{\left(\prod_{i=0}^{N-1}\sigma_i^2\right)^{1/N}}$$
$\sigma_i^2$: variance of the $i$-th coefficient $\theta_i$
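The coding gain is the ratio of the arithmetic mean to the geometric mean of the coefficient variances. A minimal sketch follows; the function name `coding_gain` and the variance values are assumptions for illustration, and all variances are assumed to be strictly positive.

```python
import math

def coding_gain(variances):
    """G_TC = arithmetic mean / geometric mean of coefficient variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    # Geometric mean computed in the log domain to avoid overflow
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric

gain_flat = coding_gain([1.0, 1.0, 1.0, 1.0])   # equal variances: no gain
gain_skewed = coding_gain([16.0, 1.0])          # energy packed into one coefficient
print(gain_flat, gain_skewed)
```

The more unevenly the transform distributes the variance, the larger the ratio becomes; with equal variances it is 1.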
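Decomposition View of Transforms
$$\begin{bmatrix}x_0\\ x_1\end{bmatrix} = \begin{bmatrix}a_{00} & a_{10}\\ a_{01} & a_{11}\end{bmatrix}\begin{bmatrix}\theta_0\\ \theta_1\end{bmatrix} = \theta_0\begin{bmatrix}a_{00}\\ a_{01}\end{bmatrix} + \theta_1\begin{bmatrix}a_{10}\\ a_{11}\end{bmatrix}$$
Transform rows = basis vectors
Example
$$A = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}$$
- First row: ‘low-pass’ signal
- Second row: ‘high-pass’ signal
$$\begin{bmatrix}\theta_0\\ \theta_1\end{bmatrix} = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\begin{bmatrix}\alpha\\ \alpha\end{bmatrix} = \begin{bmatrix}\sqrt{2}\,\alpha\\ 0\end{bmatrix}$$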
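Filter Example
Consider two sequences:
- ‘low pass’: (3, 1)
- ‘high pass’: (3, -1)
$$\frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\begin{bmatrix}3\\ 1\end{bmatrix} = \begin{bmatrix}2\sqrt{2}\\ \sqrt{2}\end{bmatrix}$$
$$\frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\begin{bmatrix}3\\ -1\end{bmatrix} = \begin{bmatrix}\sqrt{2}\\ 2\sqrt{2}\end{bmatrix}$$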
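The two products above can be reproduced in a few lines. This is only a sketch; `transform` is an illustrative helper name.

```python
import math

r = 1 / math.sqrt(2)
A = [[r, r],    # first row: 'low-pass' basis vector
     [r, -r]]   # second row: 'high-pass' basis vector

def transform(x):
    """theta = A x for a length-2 input."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

low = transform([3, 1])     # most of the energy lands in theta_0
high = transform([3, -1])   # most of the energy lands in theta_1
print(low, high)
```

The slowly varying input is dominated by the first coefficient and the sign-alternating input by the second, which is the sense in which the rows act as low-pass and high-pass filters.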
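Matrix View
$$A = \begin{bmatrix} a_{00} & a_{01} & \cdots & a_{0,N-1}\\ a_{10} & a_{11} & \cdots & a_{1,N-1}\\ \vdots & \vdots & \ddots & \vdots\\ a_{N-1,0} & a_{N-1,1} & \cdots & a_{N-1,N-1}\end{bmatrix}$$
$$\alpha_{i,j} = \begin{bmatrix}a_{i0}\\ a_{i1}\\ \vdots\\ a_{i,N-1}\end{bmatrix}\begin{bmatrix}a_{j0} & a_{j1} & \cdots & a_{j,N-1}\end{bmatrix} = \begin{bmatrix} a_{i0}a_{j0} & a_{i0}a_{j1} & \cdots & a_{i0}a_{j,N-1}\\ a_{i1}a_{j0} & a_{i1}a_{j1} & \cdots & a_{i1}a_{j,N-1}\\ \vdots & \vdots & \ddots & \vdots\\ a_{i,N-1}a_{j0} & a_{i,N-1}a_{j1} & \cdots & a_{i,N-1}a_{j,N-1}\end{bmatrix}$$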
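Matrix View (2)
$$\alpha_{0,0} = \frac{1}{2}\begin{bmatrix}1 & 1\\ 1 & 1\end{bmatrix}, \qquad \alpha_{0,1} = \frac{1}{2}\begin{bmatrix}1 & -1\\ 1 & -1\end{bmatrix}$$
$$\alpha_{1,0} = \frac{1}{2}\begin{bmatrix}1 & 1\\ -1 & -1\end{bmatrix}, \qquad \alpha_{1,1} = \frac{1}{2}\begin{bmatrix}1 & -1\\ -1 & 1\end{bmatrix}$$
$$\begin{bmatrix}x_{00} & x_{01}\\ x_{10} & x_{11}\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\begin{bmatrix}\theta_{00} & \theta_{01}\\ \theta_{10} & \theta_{11}\end{bmatrix}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix} = \theta_{00}\alpha_{0,0} + \theta_{01}\alpha_{0,1} + \theta_{10}\alpha_{1,0} + \theta_{11}\alpha_{1,1}$$
- DC coefficient: θ00
- AC coefficients: θ01, θ10, θ11
- Basis matrices: α0,0, α0,1, α1,0, α1,1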
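The basis-matrix view can be sketched as follows: each basis matrix is the outer product of two rows of A, and the block is a weighted sum of basis matrices. The coefficient values and the helper names `basis_matrix` and `synthesize` are assumptions for illustration.

```python
import math

r = 1 / math.sqrt(2)
A = [[r, r], [r, -r]]
N = len(A)

def basis_matrix(i, j):
    """alpha_{i,j}: outer product of row i and row j of A."""
    return [[A[i][m] * A[j][n] for n in range(N)] for m in range(N)]

def synthesize(Theta):
    """X = sum over (i, j) of Theta[i][j] * alpha_{i,j}."""
    X = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            alpha = basis_matrix(i, j)
            for m in range(N):
                for n in range(N):
                    X[m][n] += Theta[i][j] * alpha[m][n]
    return X

Theta = [[5.0, 0.0], [-1.0, 2.0]]   # hypothetical coefficients
X = synthesize(Theta)
print(basis_matrix(0, 1), X)
```

The result is the same block that the matrix form X = AᵀΘA would give; the sum-of-basis-matrices form just makes the contribution of each coefficient explicit.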
Karhunen-Loève Transform (KLM)
A.k.a. Hotelling Transform
Consists of the eigenvectors of the autocorrelation matrix:
$$[R]_{i,j} = E\left[X_n X_{n+|i-j|}\right]$$
Minimizes the geometric mean of the variances of the transform coefficients
⇒ KLM provides maximal $G_{TC}$
Q: Why do anything else?
- Computing KLM is relatively expensive
- For (relatively) stationary input, KLM could work
- For most input, however, KLM would have to be recomputed/communicated frequently
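KLM Example
$$R = \begin{bmatrix}R_{xx}(0) & R_{xx}(1)\\ R_{xx}(1) & R_{xx}(0)\end{bmatrix}$$
Eigenvalues:
$$|\lambda I - R| = 0, \qquad \lambda_1 = R_{xx}(0) + R_{xx}(1), \qquad \lambda_2 = R_{xx}(0) - R_{xx}(1)$$
Eigenvectors:
$$V_1 = \begin{bmatrix}\alpha\\ \alpha\end{bmatrix}, \qquad V_2 = \begin{bmatrix}\beta\\ -\beta\end{bmatrix}$$
Normalization:
$$\alpha = \beta = \frac{1}{\sqrt{2}}$$
KLM matrix:
$$K = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}$$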
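The 2x2 case can be verified directly. The sketch below uses the closed-form eigenvalues and eigenvectors from the slide and checks that each row v of the matrix satisfies R v = λ v; the autocorrelation values and the names `klt_2x2` and `apply` are assumptions for illustration.

```python
import math

def klt_2x2(r0, r1):
    """Closed-form eigen-decomposition of R = [[r0, r1], [r1, r0]]."""
    eigenvalues = (r0 + r1, r0 - r1)
    a = 1 / math.sqrt(2)           # normalization: alpha = beta = 1/sqrt(2)
    K = [[a, a], [a, -a]]          # rows are the unit-length eigenvectors
    return eigenvalues, K

def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

r0, r1 = 1.0, 0.9                  # hypothetical autocorrelation values
R = [[r0, r1], [r1, r0]]
(lam1, lam2), K = klt_2x2(r0, r1)

residual1 = [a - lam1 * b for a, b in zip(apply(R, K[0]), K[0])]
residual2 = [a - lam2 * b for a, b in zip(apply(R, K[1]), K[1])]
print(lam1, lam2, residual1, residual2)
```

Note that for N = 2 the resulting matrix does not depend on the actual values of R_xx(0) and R_xx(1); only the eigenvalues do.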
Discrete Cosine Transform
Derived from the DFT
- Better suited for compression
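The slide's own DCT formula did not survive extraction, so the sketch below assumes the commonly used orthonormal DCT-II definition, with rows scaled by sqrt(1/N) for the first row and sqrt(2/N) for the rest. The names `dct_matrix` and `dct` are illustrative.

```python
import math

def dct_matrix(N):
    """N x N orthonormal DCT-II matrix (rows are the basis vectors)."""
    C = []
    for i in range(N):
        scale = math.sqrt(1 / N) if i == 0 else math.sqrt(2 / N)
        C.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * N))
                  for j in range(N)])
    return C

def dct(x):
    """Forward transform theta = C x of one block."""
    C = dct_matrix(len(x))
    return [sum(c * v for c, v in zip(row, x)) for row in C]

# A constant block: all energy ends up in the first (DC) coefficient
theta = dct([10.0, 10.0, 10.0, 10.0])
print(theta)
```

Because the matrix is orthonormal, the inverse transform is just multiplication by its transpose.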
DCT Basis Vectors
[Figure: DCT basis vectors]
DCT Basis Matrices
[Figure: DCT basis matrices]
DFT vs. DCT
[Figure: two panels, labeled DFT and DCT]
DCT Properties
For Markov processes:
$$\rho = \frac{E[x_n x_{n+1}]}{E[x_n^2]}$$
As ρ gets large (approaches 1), DCT approaches KLM compaction
In practice, many sources can be modeled as Markov processes
⇒ DCT is a popular choice:
- JPEG
- MPEG
- H.261
- …
Discrete Sine Transform (DST)
Complementary properties to DCT:
- As ρ gets small, DST approaches KLM compaction
Discrete Walsh-Hadamard Transform (DWHT)
Hadamard matrix H of order N:
$$HH^T = N I$$
Construction rules for $N = 2^k$:
$$H_1 = [1], \qquad H_{2N} = \begin{bmatrix}H_N & H_N\\ H_N & -H_N\end{bmatrix}$$
$$H_2 = \begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}, \qquad H_4 = \begin{bmatrix}1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1\end{bmatrix}$$
…
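The recursive construction rule translates almost directly into code. A minimal sketch, with `hadamard` as an illustrative function name:

```python
def hadamard(n):
    """Hadamard matrix of order n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("order must be a power of two")
    H = [[1]]
    while len(H) < n:
        # H_2N = [[H_N, H_N], [H_N, -H_N]]
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

H4 = hadamard(4)
# Gram matrix H H^T, which should equal N * I
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in H4] for r1 in H4]
print(H4)
print(gram)
```

Every entry is +1 or -1, so applying the transform needs only additions and subtractions.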
DWH Transform
Sequency of a row:
- Half the number of sign changes
Deriving transform matrix H from Hadamard matrix HN:
- Normalize HN: ⇒ multiply by $1/\sqrt{N}$
- Place rows in sequency order
- E.g.: [Figure: example transform matrix]
Performance
- Very easy to implement on constrained hardware
- Overall, substantially less compaction than DCT
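The two derivation steps can be sketched for the order-4 Hadamard matrix. Sorting by the number of sign changes gives the same row order as sorting by sequency; `sign_changes` and `dwht_matrix` are illustrative names, not taken from the slides.

```python
import math

def sign_changes(row):
    """Number of sign changes along a row of +1/-1 entries."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def dwht_matrix(H):
    """Sort Hadamard rows by sign changes, then scale by 1/sqrt(N)."""
    scale = 1 / math.sqrt(len(H))
    ordered = sorted(H, key=sign_changes)
    return [[scale * v for v in row] for row in ordered]

H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]

changes = [sign_changes(row) for row in H4]   # per-row sign-change counts
W4 = dwht_matrix(H4)
for row in W4:
    print(row)
```

After normalization the rows have unit length, so the resulting matrix is orthonormal and its transpose serves as the inverse.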
Coding of Transform Coefficients
Basic observation
- Different coefficients carry different amounts of information
- E.g., recall the motivating example
⇒ We should use different quantization/coding schemes to take advantage
Two approaches to code assignment
- Optimization
- Recursive
Optimization Approach
Average bit rate: R
$$R = \frac{1}{M}\sum_{k=1}^{M} R_k$$
Quantizer input: θk
kth reconstruction error: rk
$$\sigma_{r_k}^2 = \alpha_k\, 2^{-2R_k}\, \sigma_{\theta_k}^2$$
$$\sigma_r^2 = \sum_{k=1}^{M} \alpha_k\, 2^{-2R_k}\, \sigma_{\theta_k}^2$$
Optimization Approach (2)
Objective
- Find Rk such that $\sigma_r^2$ is minimized
Phrase as a Lagrange multiplier problem:
- Assume αk = α for all k
$$J = \alpha\sum_{k=1}^{M} 2^{-2R_k}\sigma_{\theta_k}^2 - \lambda\left(R - \frac{1}{M}\sum_{k=1}^{M} R_k\right)$$
$$R_k = \frac{1}{2}\log_2\left(2\alpha\ln 2\;\sigma_{\theta_k}^2\right) - \frac{1}{2}\log_2\lambda$$
$$\lambda = \prod_{k=1}^{M}\left(2\alpha\ln 2\;\sigma_{\theta_k}^2\right)^{1/M}\, 2^{-2R}$$
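Optimization Approach (3)
$$R_k = R + \frac{1}{2}\log_2\frac{\sigma_{\theta_k}^2}{\prod_{j=1}^{M}\left(\sigma_{\theta_j}^2\right)^{1/M}}$$
Comments
- Rk will minimize $\sigma_r^2$: ✓
- Not guaranteed integer: ✗
- Not guaranteed positive: ✗
Workarounds:
- Ignore negatives (set negative Rk to zero)
- Uniformly reduce the remaining Rk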
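The allocation formula and the two workarounds can be sketched as below. The variances are made up, and `clamp_and_rebalance` is only one plausible reading of "ignore negatives, then uniformly reduce"; rounding to integer bit counts is not handled here.

```python
import math

def allocate_bits(variances, avg_rate):
    """R_k = R + 0.5 * log2(var_k / geometric mean of all variances)."""
    M = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / M
    return [avg_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]

def clamp_and_rebalance(rates, avg_rate):
    """Zero out negative rates, then uniformly lower the remaining positive
    ones until the average rate is met again (assumed workaround)."""
    rates = list(rates)
    target = avg_rate * len(rates)
    while True:
        rates = [max(r, 0.0) for r in rates]
        positive = [k for k, r in enumerate(rates) if r > 0]
        excess = sum(rates) - target
        if excess <= 1e-12 or not positive:
            return rates
        for k in positive:
            rates[k] -= excess / len(positive)

variances = [64.0, 16.0, 4.0, 0.25]   # hypothetical coefficient variances
raw = allocate_bits(variances, avg_rate=1.0)
final = clamp_and_rebalance(raw, avg_rate=1.0)
print(raw)
print(final)
```

High-variance coefficients receive more bits than the average and low-variance ones fewer; the last coefficient here gets a negative raw rate, which is why a workaround is needed at all.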
Recursive Algorithm (Zonal Sampling)
[Figure: recursive bit allocation algorithm]
Example: Allocation for 8x8 Transform
[Figure: bit allocation for an 8x8 transform]
Threshold Coding
Zonal sampling observation
- Bit allocations are based on average values
⇒ Local variation may not be reconstructed properly
- E.g., edge pixel representation
Threshold coding
- All coefficients above a given threshold are quantized & coded
Typical approach
- Always code the first (DC) coefficient
- Threshold code the rest: <coeff, preceding zeroes count>
- Terminate the block with an end-of-block (EOB) symbol
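The "typical approach" can be sketched on an already scanned block of coefficients. This is an illustration under assumptions: quantization is left out, the symbol layout `(coeff, preceding zero count)` follows the slide, and the names `threshold_code`, `threshold_decode` and the `EOB` marker are hypothetical.

```python
EOB = "EOB"

def threshold_code(coeffs, threshold):
    """Encode a scanned block: DC first, then (coeff, preceding zero count)."""
    symbols = [coeffs[0]]            # the DC coefficient is always coded
    zero_run = 0
    for c in coeffs[1:]:
        if abs(c) > threshold:
            symbols.append((c, zero_run))
            zero_run = 0
        else:
            zero_run += 1            # at or below threshold: treated as zero
    symbols.append(EOB)              # trailing zeros are implied by EOB
    return symbols

def threshold_decode(symbols, length):
    """Rebuild the block; dropped coefficients come back as zeros."""
    out = [symbols[0]]
    for sym in symbols[1:]:
        if sym == EOB:
            break
        coeff, zeros = sym
        out.extend([0] * zeros + [coeff])
    return out + [0] * (length - len(out))

block = [2, 1, 0, -1, 0, 0, 0, 0]    # hypothetical scanned coefficients
symbols = threshold_code(block, threshold=0)
restored = threshold_decode(symbols, len(block))
print(symbols)
```

Long runs of zeros at the end of the block cost nothing beyond the single EOB symbol, which is where most of the savings come from.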
Zigzag 2D Block Traversal
[Figure: zigzag traversal order of a 2D block]
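Since the traversal figure is not preserved, the sketch below assumes the ordering used by JPEG: start at the top-left corner and walk the anti-diagonals in alternating directions. The names `zigzag_order` and `zigzag_scan` are illustrative.

```python
def zigzag_order(n):
    """(row, col) visiting order for an n x n block, JPEG-style zigzag."""
    order = []
    for d in range(2 * n - 1):                 # d = row + col, one anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        # Even diagonals run bottom-left to top-right, odd ones the other way
        for r in (reversed(rows) if d % 2 == 0 else rows):
            order.append((r, d - r))
    return order

def zigzag_scan(block):
    """Flatten a square block into a 1D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
scanned = zigzag_scan(block)
print(scanned)
```

The scan visits low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the sequence.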
JPEG: Initial Processing
RGB → YUV mapping (later)
4:2:2 sub-sampling (later)
Assume p-bit encoded image
- (For color images there are three planes: Y, U, V)
- Level shifting: $X_{i,j} = X_{i,j} - 2^{p-1}$
- E.g., 0..255 → -128..127
Split pixels into 8x8 blocks
- Last row/column replicated to achieve a multiple of 8
- Added data is discarded during decoding
Apply forward DCT
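The level shifting, padding and block splitting steps can be sketched for a single image plane stored as a list of rows. The tiny 5 x 10 plane and all function names are assumptions for illustration; the forward DCT itself is omitted.

```python
def level_shift(plane, p=8):
    """Subtract 2^(p-1) from every sample of a p-bit image plane."""
    offset = 1 << (p - 1)
    return [[v - offset for v in row] for row in plane]

def pad_to_multiple(plane, size=8):
    """Replicate the last column/row until both dimensions divide by size."""
    extra_cols = -len(plane[0]) % size
    rows = [row + [row[-1]] * extra_cols for row in plane]
    extra_rows = -len(rows) % size
    return rows + [list(rows[-1]) for _ in range(extra_rows)]

def split_blocks(plane, size=8):
    """Yield size x size blocks in raster order."""
    for r in range(0, len(plane), size):
        for c in range(0, len(plane[0]), size):
            yield [row[c:c + size] for row in plane[r:r + size]]

plane = [[0, 128, 255] * 3 + [200] for _ in range(5)]   # hypothetical 5 x 10 plane
padded = pad_to_multiple(level_shift(plane))
blocks = list(split_blocks(padded))
print(len(padded), len(padded[0]), len(blocks))
```

A decoder would need the original width and height to drop the replicated rows and columns again.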
Example
[Figure: original image block; block after level shifting and DCT]
JPEG: Quantization
Midtread quantization
Quantized values referred to as labels
$$l_{ij} = \left\lfloor \frac{\theta_{ij}}{Q_{ij}} + 0.5 \right\rfloor$$
Table representation
- E.g.: [Figure: sample quantization table]
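JPEG: Quantization Example
[Figure: coefficient, quantization, and label tables with entries θ00, Q00, and l00 marked]
$$l_{00} = \left\lfloor \frac{\theta_{00}}{Q_{00}} + 0.5 \right\rfloor = \left\lfloor \frac{39.88}{16} + 0.5 \right\rfloor = \lfloor 2.9925 \rfloor = 2$$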
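The label computation is a one-liner. The sketch below reproduces the slide's numbers; the `dequantize` step (label times table entry) is the usual reconstruction rule and is included here as an assumption, since the slide shows only the forward direction.

```python
import math

def quantize(theta, q):
    """Midtread quantizer label: floor(theta / q + 0.5)."""
    return math.floor(theta / q + 0.5)

def dequantize(label, q):
    """Reconstruction value for a label (assumed: label * q)."""
    return label * q

label = quantize(39.88, 16)        # slide example: theta_00 = 39.88, Q_00 = 16
theta_hat = dequantize(label, 16)
print(label, theta_hat)
```

Small coefficients divided by large table entries fall into the zero bin, which is why the quantization table behaves much like a threshold.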
JPEG Quantization Tables: Nikon D40
Source: http://www.impulseadventure.com/
JPEG: Quantization (2)
Observations
- Usually, only a few non-zero elements
- Quantization table effectively works as a threshold operation
- By varying the quantization table we can vary bit rates
  - Lower values ⇒ fewer zeroes, less QE ⇒ higher rates/quality
  - Higher values ⇒ more zeroes, higher QE ⇒ lower rates/quality
- Quality factor
  - Various scales: 0-4, 0-100
  - Implemented as quantization table multiplier
Coding
DC/AC coefficients are coded differently
- DCs are difference-coded from each other
  - Using Huffman
- ACs are encoded as a sequence
  - RLE + Huffman/AC
I.e.
- $DC_0, AC_{0,0}, AC_{1,0}, \ldots, AC_{63,0}, DC_1 - DC_0, AC_{0,1}, AC_{1,1}, \ldots, AC_{63,1}$
DC coding
- Unary code for row/category | binary code for column
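The DC path can be sketched as difference coding followed by a category/column code. Several details are assumptions, because the coding table itself is not preserved: category k is taken to hold the differences whose magnitude needs k bits, the unary prefix is written as k ones followed by a zero, and negative values are indexed from the start of their row. All function names are illustrative.

```python
def dc_differences(dcs):
    """DC_0, DC_1 - DC_0, DC_2 - DC_1, ..."""
    return [dcs[0]] + [b - a for a, b in zip(dcs, dcs[1:])]

def category(diff):
    """Row of the coding table: number of bits needed for |diff|."""
    return abs(diff).bit_length()

def column_bits(diff):
    """Binary index of diff within its category row, as a bit string."""
    k = category(diff)
    if k == 0:
        return ""
    # Row k is ordered -(2^k - 1) .. -2^(k-1), 2^(k-1) .. 2^k - 1
    index = diff if diff > 0 else diff + (1 << k) - 1
    return format(index, f"0{k}b")

def dc_code(diff):
    """Unary category prefix (assumed: k ones, then a zero) plus column bits."""
    return "1" * category(diff) + "0" + column_bits(diff)

diffs = dc_differences([2, 3, 1, 1])   # hypothetical DC labels of four blocks
codes = [dc_code(d) for d in diffs]
print(diffs, codes)
```

Neighboring blocks usually have similar DC values, so the differences cluster in the low categories and get short codes.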
DC Coding Table
[Table: DC coding table]
AC Coding Table
[Table: AC coding table]
Reconstruction
Follows the reverse process
- Decode Huffman data
- Decode DC differences
- Reconstruct quantized coefficients
- Apply the inverse DCT
- Drop padding rows/columns (if applicable)
- Reverse level shifting
- Interpolate missing UV components (reverse sub-sampling)
- YUV → RGB
Reconstruction Example: Coefficients
[Figure: original DCT coefficients vs. reconstructed DCT coefficients]
Reconstruction Example: Block Data
[Figure: original pixel values vs. reconstructed pixel values]
Image Examples
[Figure: images coded at 0.56 bits/pixel and 0.14 bits/pixel]
Image Examples (2)
[Figure: images coded at 0.07 bits/pixel and 0.035 bits/pixel]
The Modified DCT (MDCT)
Observation
- Block-based transforms introduce distortion at block boundaries
Idea
- Use overlapping regions to overcome this effect:
[Figure: overlapping blocks]
MDCT (2)
Applications
- Audio: mp3, AAC, Ogg Vorbis
Problem
- Twice as many coefficients as samples
- With some math, this problem can be avoided
Problems & Extra
Preparation problems (Sayood 3rd, pp.421-422)
- Minimum: 2, 3
- Recommended: 5
Extra
- MDCT derivation /13.7/