
Achieving the Shannon Limit: A Progress Report
Robert J. McEliece
California Institute of Technology
Pasadena, California, USA
Thirty-Eighth Allerton Conference
October 5, 2000
(A Tale of Two Claudes)
[Photos: Claude Shannon and Claude Berrou]
How Hard is it to Achieve Channel Capacity?
• Channel capacity = C.
• Desired code rate R = C(1 − ε). For example, ε = .01 means R = .99C.
• Desired decoder error probability = π. For example, π = 10⁻⁷.
• χE(ε, π) = the encoding complexity, in operations per information bit.
• χD(ε, π) = the decoding complexity, in operations per information bit.
We are interested in the behavior of χE(ε, π), and especially χD(ε, π), for fixed π, as ε → 0.
The Classical Results.
Theorem A. On a discrete memoryless channel of capacity C, for any fixed π > 0, for the Shannon–Gallager ensemble of rate R = C(1 − ε), as ε → 0,
χE(ε, π) = O(1/ε²)
χD(ε, π) = 2^{O(1/ε²)}.
Proof: Use linear codes with (per-bit) encoding complexity O(n), and ML decoding with complexity 2^{O(n)}. To estimate n, use the random coding exponent:
π ≤ e^{−n Er(R)}
[Figure: Er(R) decreasing in R, reaching 0 at R = C]
and the fact that
Er(C(1 − ε)) ≈ Kε²
as ε → 0.
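Putting the two displayed facts together makes the scalings in Theorem A explicit; this step is implicit in the slide:

```latex
\pi \le e^{-nE_r(R)}
\;\Longrightarrow\;
n \approx \frac{\ln(1/\pi)}{E_r(C(1-\epsilon))}
\approx \frac{\ln(1/\pi)}{K\epsilon^{2}}
= O(1/\epsilon^{2}),
```

so for fixed π the per-bit encoding cost O(n) becomes O(1/ε²), and the ML decoding cost 2^{O(n)} becomes 2^{O(1/ε²)}.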
An Improvement for the Binary Erasure Channel.
[Figure: the binary erasure channel — each input (0 or 1) is received correctly with probability 1 − p and erased (?) with probability p.]
Theorem B. For the binary erasure channel, χD can be improved to
χD(ε, π) = O(1/ε⁴),
for fixed π, as ε → 0.
Proof: Decode with (per-bit) complexity O(n²) by solving linear equations for the erased positions.
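The linear-algebra decoder in the proof can be sketched directly: treat the erased positions as unknowns, restrict the parity-check equations to them, and solve over GF(2). A minimal sketch — the (7,4) Hamming parity-check matrix used below is just a stand-in example, not a code from the talk:

```python
def solve_erasures(H, received):
    """Recover erased bits (marked None) by solving the parity-check
    equations H x = 0 over GF(2) for the erased positions."""
    erased = [j for j, b in enumerate(received) if b is None]
    col_of = {j: c for c, j in enumerate(erased)}
    # Restrict each check to the erased columns; known bits move to the RHS.
    rows = []
    for check in H:
        a, s = [0] * len(erased), 0
        for j, h in enumerate(check):
            if h and received[j] is None:
                a[col_of[j]] ^= 1
            elif h:
                s ^= received[j]
        if any(a):
            rows.append((a, s))
    # Gauss-Jordan elimination over GF(2) (XOR is addition).
    pivots, r = [], 0
    for c in range(len(erased)):
        piv = next((i for i in range(r, len(rows)) if rows[i][0][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][0][c]:
                ai, si = rows[i]
                ar, sr = rows[r]
                rows[i] = ([u ^ v for u, v in zip(ai, ar)], si ^ sr)
        pivots.append(c)
        r += 1
    # Assuming a unique solution, each pivot row now reads x_c = s.
    out = list(received)
    for i, c in enumerate(pivots):
        out[erased[c]] = rows[i][1]
    return out

# Example: (7,4) Hamming code, codeword [1,1,1,0,0,0,0], positions 0 and 2 erased.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

Gaussian elimination on an n × n system costs O(n³) in total, i.e. O(n²) per decoded bit, which is the complexity claimed in the proof.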
Pre-1993 State of the Art on the AWGN Channel
[Diagram: concatenated coding system — RS encoder → interleaver Π → convolutional encoder.]
[Figure: BER (10⁻¹ down to 10⁻⁷) vs. Eb/N0 (−0.5 to 4.0 dB) for uncoded transmission and for Voyager = (7,1/2)+(255,223), Galileo (not S-band) = (15,1/4)+(255,223), and Pathfinder/Cassini = (15,1/6)+(255,223).]
1993: And Then Came...
The Turbo-Era State of the Art on the AWGN Channel
[Figure: same BER vs. Eb/N0 axes, now including turbo codes (10 decoding iterations; k = information block size, k = 1784 and 8920; rates 1/3 and 1/6) alongside the Voyager, Galileo (not S-band), and Pathfinder/Cassini concatenated codes.]
1997: Another Landmark Paper (For the BEC)
1963: The Grandaddy of Them All:
The Secret is Iterative Message-Passing (Turbo) Decoding:
A Low-Complexity Approximation to "Exact" Decoding.
Codes that can be Decoded in the "Turbo-Style"
• Classical turbo codes: [two IIR convolutional encoders in parallel, their inputs separated by an interleaver Π.]
• "Serial" turbo codes: [an outer encoder, an interleaver Π, and an inner IIR encoder in series.]
• Gallager codes (Low-Density Parity-Check), regular and irregular: [Tanner graph — codeword symbol nodes connected through an interleaver Π to parity-check (+) nodes.]
• And ...
(Dramatic Pause)
Repeat-Accumulate (RA) Codes (nonsystematic)
[Diagram: a rate-1/q repetition code, an interleaver Π, and a rate-1 accumulator 1/(1 + D), in series.]
Tanner Graph Representation (k = 2, q = 3)
[Figure: the Tanner graph of a small RA code — k = 2 information nodes, each repeated q = 3 times, connected through the interleaver to six check (+) nodes and the parity nodes.]
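As a concrete sketch, the nonsystematic RA encoder simply chains the three boxes above; the permutation is a free parameter, and the identity permutation is used below only to keep the example checkable by hand:

```python
def ra_encode(u, q, perm):
    """Nonsystematic RA encoder: repeat each information bit q times,
    interleave with `perm`, then accumulate mod 2 (the 1/(1+D) box)."""
    repeated = [b for b in u for _ in range(q)]   # rate-1/q repetition
    interleaved = [repeated[p] for p in perm]     # interleaver pi
    out, acc = [], 0
    for b in interleaved:                         # accumulator: y_i = x_i + y_{i-1}
        acc ^= b
        out.append(acc)
    return out

# The k = 2, q = 3 toy code with the identity interleaver:
codeword = ra_encode([1, 0], 3, [0, 1, 2, 3, 4, 5])
```

With input [1, 0] the repeated stream is 1,1,1,0,0,0, and running sums mod 2 give the codeword [1, 0, 1, 1, 1, 1].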
Decoding an RA Code Using Message Passing
[Animation, several slides: messages pass along the edges of the RA Tanner graph over successive iterations.]
What are the Messages?
The message m passed along an edge attached to a codeword symbol x is the log-likelihood ratio
m = log [ p(x = 0) / p(x = 1) ].
How Messages are Updated at Variable Nodes
The outgoing message is the sum of the incoming messages m1, . . . , mk on the node's other edges:
m = m1 + m2 + · · · + mk.
How Messages are Updated at Check Nodes
The outgoing message m on one edge of a check (+) node is determined by the incoming messages m1, . . . , mk on its other edges via
tanh(m/2) = tanh(m1/2) · tanh(m2/2) · · · tanh(mk/2).
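In code, the two update rules are one-liners; this sketch just restates the slides' formulas (the function names are mine):

```python
import math

def variable_update(incoming):
    """Variable-node rule: log-likelihood ratios add."""
    return sum(incoming)

def check_update(incoming):
    """Check-node rule: tanh(m/2) is the product of the tanh(mi/2)."""
    prod = 1.0
    for m in incoming:
        prod *= math.tanh(m / 2.0)
    return 2.0 * math.atanh(prod)
```

For example, two moderately confident incoming LLRs of 2.0 at a check node yield an outgoing LLR of about 1.33: the check node's output can never be more confident than its least confident input.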
Decoding an RA Code on the BEC Using Message Passing
[Legend: one arrow style denotes a message known to be 1, the other a message known to be 0; on the BEC a message is either a known bit value or an erasure.]
[Animation, several slides: known values propagate through the graph until the erasures are filled in.]
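On the BEC, message passing reduces to "peeling": any check node with exactly one erased neighbor determines that bit. A generic sketch — the toy parity checks and codeword below are my own small example, not the RA graph from the slides:

```python
def peel_decode(checks, received):
    """Message-passing erasure decoding: repeatedly find a check node
    with exactly one erased (None) neighbor and solve for that bit."""
    bits = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:                  # check = list of bit positions
            unknown = [j for j in check if bits[j] is None]
            if len(unknown) == 1:
                j = unknown[0]
                bits[j] = 0
                for i in check:               # erased bit = XOR of the rest
                    if i != j:
                        bits[j] ^= bits[i]
                progress = True
    return bits

# Three even-parity checks; [1,1,0,0,1,1] satisfies all of them.
checks = [[0, 1, 2], [1, 3, 4], [2, 4, 5]]
recovered = peel_decode(checks, [None, 1, 0, None, 1, 1])
```

Unlike the Gaussian-elimination decoder of Theorem B, peeling does constant work per edge per pass, which is what makes the per-bit complexity counted in the next slide meaningful.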
What is the Complexity of Iterative Message-Passing Decoding?
• Complexity per iteration: χIT = 2E/k, where E is the number of edges in the Tanner graph and k is the number of information bits (χIT is an ensemble invariant).
• N(ε, π) = number of iterations needed to achieve error probability π.
• Total: χD(ε, π) = χIT · N(ε, π).
One Interesting Point
[Figure: BER vs. Eb/N0 comparing the Galileo (15,1/4) + (255,223) RS concatenated code, at χ = 50,000 messages per decoded bit, with a rate-1/4 RA code (k = 4096), at χ = 500 messages per decoded bit.]
Theory is Available!
Analysis and Design of RA Codes (The Fine Print)
• The ensemble of RA codes satisfies the RU condition: for any fixed L, the probability that the depth-L neighborhood of a randomly selected edge contains a cycle goes to zero as k → ∞.
• Therefore L-fold density evolution gives the limiting value (k → ∞) of the ensemble bit error probability after L iterations. This limiting value depends on the "noise parameter" of the channel. The largest noise parameter for which the limiting bit error probability is zero is called the ensemble noise threshold.
Example: Density Evolution for RA Codes on the BEC
(xi = probability that a message is an erasure; p = channel erasure probability)
[Four slides, each highlighting one message type on the Tanner graph:]
x1 = 1 − (1 − x0)²
x2 = x1^{q−1}
x3 = 1 − (1 − x0)(1 − x2)
x0 = x3 p.
Summary: If xi^{(L)} denotes the value of xi on the Lth iteration, then
x1^{(L)} = 1 − (1 − x0^{(L)})²
x2^{(L)} = (x1^{(L)})^{q−1}
x3^{(L)} = 1 − (1 − x0^{(L)})(1 − x2^{(L)})
x0^{(L+1)} = x3^{(L)} p.
Thus we have the recursion
x0^{(L+1)} = fq(x0^{(L)}),
where
fq(x) = p(1 − (1 − x)(1 − (1 − (1 − x)²)^{q−1})).
The erasure threshold for R = 1/4 RA codes (q = 4) is p ≈ .703.
[Figure: f4(x) plotted against the line y = x for p = 0.70, p = .703, and p = 0.8; the iteration drives x0 to zero exactly when the curve stays below the line on (0, 1].]
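The recursion above can be iterated numerically; this sketch bisects for the threshold of the q = 4 ensemble (the function names and the iteration/tolerance budget are mine, and the finite budget slightly underestimates the exact threshold):

```python
def f(x, p, q=4):
    """One density-evolution round for a rate-1/q RA code on the BEC
    (the recursion f_q from the slides); p = channel erasure probability."""
    return p * (1 - (1 - x) * (1 - (1 - (1 - x) ** 2) ** (q - 1)))

def converges(p, q=4, iters=2000, tol=1e-9):
    """Does the message erasure probability go to zero at this p?"""
    x = 1.0
    for _ in range(iters):
        x = f(x, p, q)
    return x < tol

def threshold(q=4, steps=40):
    """Bisect for the largest p at which density evolution converges."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if converges(mid, q):
            lo = mid
        else:
            hi = mid
    return lo
```

For q = 4 this lands near the p ≈ .703 shown in the figure; above the threshold the iteration stalls at a nonzero fixed point instead of decaying to zero.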
"Irregular" Repeat-Accumulate Codes (systematic)
[Diagram: Tanner graph with information nodes of varying degree (fi = fraction of nodes of degree i, i = 2, . . . , J), an interleaver, check nodes (all of degree a), and parity nodes.]
q̄ = Σ_{q≥2} q fq
R = (1 + q̄/a)^{−1}
χIT = 2(1/R − 1)(a + 2).
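These three formulas are easy to check numerically; here they are applied to the a = 2 degree profile that appears later in the AWGN plots (f2 = 0.278, f3 = 0.296, f6 = 0.426), which comes out to rate 1/3:

```python
def ira_stats(f, a):
    """Rate and per-iteration decoding complexity of a systematic IRA
    ensemble with information-node degree fractions f = {q: f_q} and
    check-node degree a, using the formulas from the slide."""
    qbar = sum(q * fq for q, fq in f.items())   # average repetition degree
    R = 1.0 / (1.0 + qbar / a)                  # code rate
    chi_it = 2.0 * (1.0 / R - 1.0) * (a + 2)    # messages per info bit, per iteration
    return qbar, R, chi_it

qbar, R, chi_it = ira_stats({2: 0.278, 3: 0.296, 6: 0.426}, a=2)
```

The profile gives q̄ = 4 exactly, hence R = (1 + 4/2)^{−1} = 1/3 and χIT = 2 · 2 · 4 = 16 messages per information bit per iteration.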
R = / IRA Codes
for the Binary Erasure Channel
Decoding Complexity
4
4.5
x 10
a=4;f2=1
3.5
e
Messages per Decoded Bit (P =10-4 )
4
3
2.5
2
1.5
1
0.5
x
0
0.1
0.15
0.2
0.25
Erasure Probability
0.3
S.L.(1/3)
R = / IRA Codes
for the Binary Erasure Channel
Degree Sequence (a=5)
4
0.7
4.5
Complexity
4
3.5
e
messages per Decoded Bit (P =10-4 )
0.6
0.5
Fraction of Nodes
x 10
0.4
0.3
0.2
0.1
3
2.5
2
1.5
1
0.5
x
0
1
2
3
4
Degree
0
0.1
0.15 0.2 0.25 0.3 S.L.(1/3)
Erasure Probability
R = / IRA Codes
for the Binary Erasure Channel
Degree Sequence (a=6)
4
0.7
4.5
Complexity
4
Messages per Decoded Bit (Pe=10-4 )
0.6
0.5
Fraction of Nodes
x 10
0.4
0.3
0.2
0.1
3.5
3
2.5
2
1.5
1
0.5
x
0
1
2
3 4 5
Degree
6
7
0
0.1
0.15 0.2 0.25 0.3 S.L.(1/3)
Erasure Probability
R = / IRA Codes
for the Binary Erasure Channel
Degree Sequence (a=7)
4.5
0.45
4
Messages per Decoded Bit (Pe=10-4 )
0.5
0.4
0.35
Fraction of Nodes
4
0.3
0.25
0.2
0.15
0.1
x 10
Complexity
3.5
3
2.5
2
1.5
1
0.5
0.05
x
0
0
2
4
6 8 10 12 14
Degree
0
0.1
0.15 0.2 0.25 0.3 S.L.(1/3)
Erasure Probability
R = / IRA Codes
for the Binary Erasure Channel
Degree Sequence (a=8)
4
4.5
0.4
4
Complexity
-4
Messages per Decoded Bit (P =10 )
0.45
x 10
3.5
Fraction of Nodes
e
0.35
0.3
0.25
0.2
0.15
0.1
0.05
3
2.5
2
1.5
1
0.5
x
0
0
5
10
Degree
15
20
0
0.1
0.15 0.2 0.25 0.3 S.L.(1/3)
Erasure Probability
R = / IRA Codes
for the Binary Erasure Channel
Degree Sequence (a=9)
4
4.5
0.4
4
Messages per Decoded Bit (P =10-4 )
0.45
Complexity
3.5
Fraction of Nodes
e
0.35
x 10
0.3
0.25
0.2
0.15
0.1
0.05
3
2.5
2
1.5
1
0.5
x
0
0
10
20
Degree
30
0
0.1
0.15 0.2 0.25 0.3 S.L.(1/3)
Erasure Probability
R = / IRA Codes
for the Binary Erasure Channel
4
4.5
x 10
a=4 IRA Code
a=5 IRA Code
a=6 IRA Code
a=7 IRA Code
a=8 IRA Code
a=9 IRA Code
-4
messages per Decoded Bit (Pe=10 )
4
3.5
3
2.5
2
1.5
1
0.5
x
x
0
0.1
0.15
0.2
0.25
Channel Erasure Probability p
x
0.3
x
x x
S.L.(1/3)
A New Result
Theorem C. For the binary erasure channel, for IRA codes,
χE(ε, π) = O(1/ε)
χD(ε, π) = O((1/ε²) log(1/ε)).
Conjecture D. For the binary erasure channel, for IRA codes,
χE(ε, π) = O(log(1/ε))
χD(ε, π) = O((1/ε) log(1/ε)).
Note: We can prove that χD(ε, π) = O((1/ε) log(1/ε)) for irregular LDPC codes.
R = / IRA Codes for the AWGN Channel
4
-5
Messages per Decoded Bit (Pe=10 )
6
x 10
a=2;f2=0.278,f3=0.296,f6=0.426
5
4
3
2
1
0
-0.5
(S.L.)
0
x
0.5
1
1.5
Eb/N0 in dB
2
2.5
3
R = / IRA Codes for the AWGN Channel
4
6
x 10
a=3;f2=0.235,f3=0.256,f5=0.193
f =0.036,f =0.055,f =0.225
12
13
2
2.5
5
e
Messages per Decoded Bit (P =10-5 )
6
4
3
2
1
0
-0.5
(S.L.)
x
0
0.5
1
1.5
Eb/N0 in dB
3
R = / IRA Codes for the AWGN Channel
4
6
x 10
a=4;f2=0.218,f3=0.278,f6=0.169,f10=0.184
f =0.012,f =0.134,f =0.005
27
28
5
e
Messages per Decoded Bit (P =10-5 )
11
4
3
2
1
0
-0.5
(S.L.)
x
0
0.5
1
1.5
Eb/N0 in dB
2
2.5
3
R = / IRA Codes for the AWGN Channel
4
-5
Messages per Decoded Bit (Pe=10 )
6
x 10
a=2 IRA Code
a=3 IRA Code
a=4 IRA Code
5
4
3
2
1
0
-0.5
(S.L.)
x x
0
x
0.5
1
1.5
Eb/N0 in dB
2
2.5
3
A Conjecture
Conjecture E. For any discrete memoryless channel, there exists a sequence of ensembles plus matched iterative decoding algorithms such that, for any fixed π, as ε → 0,
χE(ε, π) = O(log(1/ε))
χD(ε, π) = O((1/ε) log(1/ε)).
Variations on the Theme
• Irregular Turbo Codes (Frey and MacKay)
• Asymmetric Turbo codes (Costello and Massey)
• Mixture Inner and/or outer codes (Divsalar and Dolinar)
• Doped Turbo codes (ten Brink)
• Irregular LDPC codes (Richardson, Shokrollahi and Urbanke)
• Finite Geometry LDPC codes (Fossorier and Lin)
• Concatenated Tree codes (Ping and Wu)
⋮
How We May Appear to Future Generations
Claude Shannon — Born on the planet Earth (Sol III) in the year 1916 A.D. Generally regarded as the father of the Information Age, he formulated the notion of channel capacity in 1948 A.D. Within several decades, mathematicians and engineers had devised practical ways to communicate reliably at data rates within 1% of the Shannon limit . . .
— Encyclopedia Galactica, 166th ed.