A geometric proof of the polarization property

Ilya Dumer

I. Dumer is with the College of Engineering, University of California, Riverside, CA 92521, USA; email: [email protected]
Abstract—We analyze a version of successive cancellation (SC) decoding that uses two random functions of the transmitted symbols: their likelihoods and the variations of their posterior probabilities. The first function increases its expected value on the upgrading channels, while the second does so on the degrading channels. We show that both quantities can be bounded by sin θ and cos θ of another random variable θ that ranges from 0 to π/2. We then present a simple proof that the expected value of sin θ cos θ tends to 0 in the consecutive iterations of the SC algorithm. This proves the polarization property of SC decoding.
Index terms: Polar codes; Reed-Muller codes; successive cancellation decoding; polarization.
I. INTRODUCTION
In this paper, we analyze one algorithm of successive cancellation (SC) decoding and give an elementary proof of its polarizing behavior. This SC algorithm was first applied in [1] to general Reed-Muller codes RM(r, m), where it showed that these codes yield vastly different output bit error rates (BER) for different information bits. This disparity was then addressed in [1] by eliminating some information bits with the highest BERs. Simulation results of [2] showed that the optimal selection of the eliminated (frozen) bits drastically improves decoding of the original codes RM(r, m). However, the analytical tools used in these and later publications [3]-[4] do not reveal the polarization properties of the bit-frozen subcodes of codes RM(r, m) or their capacity-reaching performance.
A major breakthrough in this area was achieved by E. Arikan [5], who proved that the optimal bit-frozen subcodes of the full codes RM(m, m), now well known as polar codes, achieve the channel capacity of any symmetric memoryless channel as m → ∞. Paper [5] also proposes a new analytical technique, which reveals some novel properties of generic recursive processing, such as bit polarization. This technique also yields the capacity-achieving subcodes originating from codes RM(r, m) of rate R → 1, such as codes with lim r/m > 1/2.
The goal of this paper is to apply some geometric concepts and simplify Arikan's original proof [5] of the polarization properties of SC decoding. Some other proofs are presented in [6] and [7]. The proof presented below does not involve stochastic processes or information theory: it only relies on the fact that sin θ, cos θ, and sin θ cos θ are concave functions of the angle θ ∈ [0, π/2]. We first review some well-known introductory material in Sections II and III. In Section II, we describe polynomial codes based on the Plotkin (u, u + v)-construction. Then in Section III, we describe conventional SC decoding using two different random variables
(rv). For any received symbol y, one rv is the likelihood h = P(0|y)/P(1|y). The other rv, g = P(0|y) − P(1|y), measures the variation between the posterior probabilities of the transmitted symbols 0 and 1. We show that the quantities g and h can be recalculated, respectively, as the products g_1 g_2 and h_1 h_2 on the degrading and upgrading channels of SC decoding.
In Sections IV and V, we proceed with the Bhattacharyya parameter Z and the expectation G of the variables g. As pointed out by E. Arikan [10], the parameter G is also studied in statistics as the variational distance [11]. Section IV presents some new inequalities for the parameters G and Z. Our goal is to show that lim GZ = 0 for almost all sequences of m successive channel transformations as m → ∞ (except for a fraction of them that declines exponentially in m). To do so, in Section V we reconsider the parameters G and Z in terms of a single rv θ ∈ [0, π/2]. We first show that these two parameters can be bounded from above by the expectations of sin θ and cos θ. We then proceed with some new inequalities, which reduce the polarization problem to the following much simpler problem.
Problem A. Let an angle θ ∈ [0, π/2] be equally likely transformed into one of two complementary angles: θ^(0) = arcsin(sin²θ) or θ^(1) = arccos(cos²θ). Consider all sequences ξ ∈ F_2^m of m consecutive random transformations. Prove that the angles θ^(ξ) tend to 0 or π/2 for most sequences ξ as m → ∞.
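As a quick numerical illustration of Problem A (a sketch of ours, not part of the proof; the parameters m, the number of trials, and the threshold are arbitrary choices), one can sample random transformation sequences and estimate the fraction that polarizes:

```python
import math
import random

def transform(theta, bit):
    # One step of Problem A: theta -> arcsin(sin^2 theta) for bit 0,
    # or theta -> arccos(cos^2 theta) for bit 1.
    if bit == 0:
        return math.asin(math.sin(theta) ** 2)
    return math.acos(math.cos(theta) ** 2)

def polarized_fraction(m=30, trials=20000, theta0=math.pi / 4, tol=1e-3):
    # Fraction of random sequences xi with sin(theta) * cos(theta) < tol,
    # i.e. with theta(xi) close to 0 or pi/2.
    count = 0
    for _ in range(trials):
        theta = theta0
        for _ in range(m):
            theta = transform(theta, random.getrandbits(1))
        if math.sin(theta) * math.cos(theta) < tol:
            count += 1
    return count / trials

print(polarized_fraction())  # approaches 1 as m grows
```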
In Fig. 1, we give a preliminary illustration of Problem A and depict three-step transformations of the angle θ = π/4 into the angles θ^(000) and θ^(111).
Fig. 1. Three-step transformations of the angle θ = π/4 into θ^(000) and θ^(111), through the intermediate angles θ^(0), θ^(00), θ^(1), θ^(11).
In Section VI, we address Problem A and show that almost all sequences ξ of m transformations satisfy the condition sin θ^(ξ) cos θ^(ξ) → 0 as m → ∞, for any original angle θ. This solves the polarization problem.
II. RECURSIVE PLOTKIN CONSTRUCTION
RM codes and polar codes can be designed using polynomial constructions. Consider any boolean polynomial f(x) ≡ f(x_1, ..., x_m) for any x ∈ F_2^m.

Fig. 2. Decomposition of RM(4, 4): the splitting steps x_ℓ^{a_ℓ} lead to the coefficients f_{a_1 a_2 a_3 a_4}.

Fig. 3. Subcode C(m, T) of code RM(5, 5): the boundary paths ξ: x_1 x_2 and η: x_2 x_3 x_4.

We also consider sequences (paths) ξ = (a_1, ..., a_m) ∈ F_2^m and define the monomials

x^ξ ≡ x_1^{a_1} · ... · x_m^{a_m}
Then any polynomial f(x) is decomposed as follows:

f(x) = Σ_{a_1=0,1} x_1^{a_1} f_{a_1}(x_2, ..., x_m) = ...
     = Σ_{a_1,...,a_ℓ} x_1^{a_1} · ... · x_ℓ^{a_ℓ} f_{a_1,...,a_ℓ}(x_{ℓ+1}, ..., x_m) = ...
     = Σ_ξ f_ξ x^ξ    (1)

Any step ℓ = 1, ..., m − 1 ends with the incomplete paths ξ_1^ℓ ≡ (a_1, ..., a_ℓ) that decompose the polynomial f(x) with respect to the first ℓ variables. Finally, step m defines each bit f_ξ associated with a path ξ and its monomial x^ξ.
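For illustration, the coefficients f_ξ of decomposition (1) can be computed by the binary Moebius transform, which applies the split f = f_{a_ℓ=0} + x_ℓ f_{a_ℓ=1} in each variable. The Python sketch below is ours; the bit ordering of the truth table (bit ℓ of an index encodes a_ℓ) is a convention of the example:

```python
from itertools import product

def anf_coefficients(f_values):
    # Binary Moebius transform: turns the 2^m truth-table values of f
    # into the coefficients f_xi of decomposition (1).
    f = list(f_values)
    step = 1
    while step < len(f):
        for start in range(0, len(f), 2 * step):
            for j in range(start, start + step):
                # Split in one variable: f_{a=1} = f|_{x=0} + f|_{x=1}.
                f[j + step] ^= f[j]
        step *= 2
    return f

# Example with m = 2: the truth table of f = x1*x2 over F_2^2.
table = [x1 & x2 for x2, x1 in product([0, 1], repeat=2)]
print(anf_coefficients(table))  # -> [0, 0, 0, 1], the single monomial x1*x2
```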
Codes RM(r, m) consist of the maps f(x): F_2^m → F_2. Here we take all polynomials f(x) of degree at most r, and all vectors x ∈ F_2^m form the positions of our code. Each map generates a codeword

c = c(f) = Σ_ξ f_ξ c(x^ξ)

Here any vector c(x^ξ) has weight 2^{m−w(ξ)}, where w(ξ) is the Hamming weight of ξ. Note that for a_1 = 0, 1, the two polynomials x_1^{a_1} f_{a_1}(x_2, ..., x_m) generate the codewords (c_0, c_0) and (0, c_1). Then the codewords c = (c_0, c_0 + c_1) of code RM(r, m) are formed by the RM codes {c_0} and {c_1} of length 2^{m−1}. This is an instance of the Plotkin (u, u + v) construction. Similarly, we may further decompose codes RM(r, m) using the Plotkin construction in each step ℓ = 2, ..., m − 1. The Plotkin construction is also equivalent to Arikan's 2 × 2 kernel [5].
Decomposition (1) is shown in Fig. 2 for the code RM(4, 4). Each decomposition step ℓ = 1, ..., 4 is marked by the splitting monomial x_ℓ^{a_ℓ}. For example, the path ξ = 0110 gives the information bit f_{0110} associated with the monomial x^ξ ≡ x_2 x_3.
Now consider some subset of k paths

T = {ξ(i), i = 1, ..., k} ⊂ F_2^m

Then we encode k information bits via their paths and obtain the codewords c(T) = Σ_{ξ∈T} c(x^ξ). These codewords form a linear code C(m, T).
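As an illustration (a sketch of ours, with an explicit indexing convention for the positions x ∈ F_2^m), the vectors c(x^ξ) and the codewords of C(m, T) can be generated directly from the paths:

```python
from itertools import product

def monomial_codeword(xi, m):
    # Evaluate the monomial x^xi at all x in F_2^m: the vector c(x^xi)
    # of weight 2^(m - w(xi)), where w(xi) is the Hamming weight of xi.
    return [int(all(x[i] for i in range(m) if xi[i] == 1))
            for x in product([0, 1], repeat=m)]

def encode(paths, bits, m):
    # Codeword of C(m, T): the mod-2 sum of c(x^xi) over the paths xi
    # whose information bits f_xi equal 1.
    c = [0] * 2 ** m
    for xi, f in zip(paths, bits):
        if f:
            c = [a ^ b for a, b in zip(c, monomial_codeword(xi, m))]
    return c

# Example: path xi = 0110 gives the monomial x2*x3 (cf. Fig. 2).
print(sum(monomial_codeword((0, 1, 1, 0), 4)))  # weight 2^(4-2) = 4
```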
Fig. 3 presents such a code C(m, T). Here we use all paths ξ′ bounded on the left by the path ξ = 11000 (red dashed line) and all paths η′ bounded by the path η = 01110 (blue dashed line). These two paths generate the monomials x_1 x_2 and x_2 x_3 x_4. All paths ξ′ have weights w(ξ′) ≤ 2 and form code RM(2, 5). Similarly, paths η′ have weights w(η′) ≤ 3 in the variables x_2, ..., x_5. Thus, paths η′ generate a repeated code RM(3, 4). In turn, code C(m, T) is the sum of the codes generated by the boundaries ξ and η.
Construction C(m, T) also leads to polar codes, which use subsets T ⊂ F_2^m optimized for recursive SC decoding. This algorithm is considered in the next section.
III. SC DECODING
Recursive decoding of the Plotkin construction. Below, we consider transmission over a discrete memoryless channel W with inputs ±1. To do so, we also use the symbols (−1)^a for any binary input a = 0, 1. In particular, the all-zero codeword 0^n is mapped onto 1^n. The Plotkin construction has the form c = (u, uv), where the vector uv is the component-wise product of the vectors u and v with symbols ±1. For any received symbol y, we will define three interrelated quantities: the posterior probability (PP) q that a symbol c = 1 is transmitted, the offset g, and the likelihood h. These quantities are defined as follows:

q = q(y) = Pr{c = 1 | y},   g = 2q − 1,   h = q/(1 − q)    (2)
For example, let W be a binary symmetric channel BSC(ε) with transition error probability p = (1 − ε)/2, where ε ∈ [0, 1]. Then any output y = ±1 gives the quantities

g(y) = εy,   h(y) = (1 + εy)/(1 − εy)    (3)
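For instance, the quantities (2) and (3) can be computed as follows (a minimal sketch of ours, assuming the BSC model above):

```python
def bsc_statistics(y, eps):
    # Posterior quantities (2)-(3) for a BSC(eps) with inputs +/-1 and
    # transition error probability p = (1 - eps) / 2.
    q = (1 + eps * y) / 2      # q = Pr{c = 1 | y}
    g = 2 * q - 1              # offset: g(y) = eps * y
    h = q / (1 - q)            # likelihood: h(y) = (1 + eps*y) / (1 - eps*y)
    return q, g, h

print(bsc_statistics(+1, 0.8))  # -> (0.9, 0.8, 9.0)
```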
Let c = (c_j) be any vector of even length μ. We use the notation jℓ and jr for the positions in the left and right halves. Now let y = (y_j) be the received vector corrupted by noise. We then use the vectors q = (q_j), g = (g_j) and h = (h_j) with symbols defined in (2).

The following recursive algorithm of [2]-[4] performs SC decoding of the information bits in recursive (u, uv) constructions, such as codes RM(r, m) or their bit-frozen subcodes C(m, T). This algorithm is also identical to the conventional SC decoder of [5]. We first wish to derive the vector v in the
(u, uv) construction. To do so, we first find the PP q_j^(1) for each symbol v_j of the vector v of length n/2:

q_j^(1) ≡ Pr{v_j = 1 | q_{jℓ}, q_{jr}}
The vector q^(1) of the PPs q_j^(1) represents the corrupted version of vector v. Simple recalculations show that the corresponding offsets g_j^(1) = 2q_j^(1) − 1 can be recalculated as

g_j^(1) = g_{jℓ} g_{jr}    (4)

Here the indices j, jℓ and jr run through the same set 1, ..., n/2 (since the newly defined vector g^(1) has length n/2). We may now decode the vector g^(1) into some vector ṽ ∈ RM(r − 1, m − 1) of length n/2.
Given vector ṽ, note that the two symbols y_{jℓ} and y_{jr} ṽ_j represent two corrupted versions of the symbol u_j in the (u, uv) construction. Then the symbol u_j has likelihoods h_{jℓ} and h̃_{jr} = (h_{jr})^{ṽ_j} in the left and right halves, which gives its overall likelihood

h_j^(0) = h_{jℓ} h̃_{jr}    (5)

We can now decode the vector h^(0) ≡ (h_j^(0)) into some vector ũ ∈ RM(r, m − 1). Observe also that recalculations (4) degrade the original channel, whereas recalculations (5) upgrade it.
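One level of these recalculations can be sketched as follows (our own rendering; the decoded vector ṽ is assumed to be given, with symbols ±1, and all likelihoods are assumed finite):

```python
def degrade(g):
    # Recalculation (4): offsets of v from the left and right halves of g.
    half = len(g) // 2
    return [g[j] * g[j + half] for j in range(half)]

def upgrade(h, v_hat):
    # Recalculation (5): likelihoods of u; the right-half likelihood is
    # inverted (h -> 1/h) whenever the decoded symbol of v equals -1.
    half = len(h) // 2
    return [h[j] * (h[j + half] if v_hat[j] == 1 else 1 / h[j + half])
            for j in range(half)]
```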
In the general setting, recalculations (4) and (5) form level ℓ = 1 of SC decoding. We then apply these recalculations to the new vectors q^(1) and q^(0), which represent the corrupted versions of vectors v and u, and proceed similarly at any level ℓ = 2, ..., m. Any current path ξ = ξ_1^ℓ receives a PP-vector q^(ξ) of length μ = 2^{m−ℓ}. Then we process the v-extension (ξ, 1) using recalculations (4) on the two halves of the vector g^(ξ):

g_j^(ξ,1) = g_{jℓ}^(ξ) g_{jr}^(ξ)    (6)

By recursion, we now assume that the path (ξ, 1) returns its current output ṽ = ṽ^(ξ) to the node ξ. Similarly, we use recalculations (5) with the likelihoods h_{jℓ}^(ξ) and h̃_{jr}^(ξ) for the u-extension h^(ξ,0):

h_j^(ξ,0) = h_{jℓ}^(ξ) h̃_{jr}^(ξ)    (7)

Then the current vector h^(ξ,0) can be decoded into some vector ũ^(ξ). Thus, the v-extensions (marked with ones in Fig. 2) always precede the u-extensions in each decoding step.

Finally, the last step gives the likelihood q_ξ = Pr{f_ξ = 0 | y} of one information bit f_ξ on the path ξ. We then choose the more reliable bit f_ξ. Thus, the decoder recursively retrieves every information symbol f_ξ, moving back and forth along the paths of Fig. 2 or Fig. 3. It is easy to verify [1] that the overall complexity has order n log n.

Recursive decoding of polar codes. Any subcode C(m, T) ⊂ RM(m, m) with k paths ξ(1), ..., ξ(k) is decoded similarly. Here we simply drop all frozen paths ξ ∉ T, which carry information bits f_ξ ≡ 0. This gives the following algorithm.

Algorithm Ψ(m, T) for code C(m, T).
Given: a vector q = (q_j) of PPs.
For i = 1, ..., k and ℓ = 1, ..., m,
for a path ξ(i) = (a_1(i), ..., a_m(i)) in step ℓ do:
  Apply recalculations (6) if a_ℓ(i) = 1.
  Apply recalculations (7) if a_ℓ(i) = 0.
  Output the information bit f_{ξ(i)} if ℓ = m.
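A condensed recursive rendering of this algorithm is sketched below (ours, not the pseudocode of [2]-[5]; it works over offset vectors with |g_j| < 1 and converts between g and h via (2)):

```python
def sc_decode(g, prefix, T):
    # SC decoding sketch for C(m, T): g is the offset vector of the current
    # node, prefix holds the bits a_1, ..., a_l chosen so far, and T is the
    # set of information paths (m-tuples). Returns the +/-1 codeword estimate.
    if len(g) == 1:
        if prefix not in T:               # frozen path: f_xi = 0, symbol +1
            return [1]
        return [1] if g[0] >= 0 else [-1]
    half = len(g) // 2
    # v-extension (a_l = 1) first: degrading recalculation (6).
    gv = [g[j] * g[j + half] for j in range(half)]
    v = sc_decode(gv, prefix + (1,), T)
    # u-extension (a_l = 0): upgrading recalculation (7) via likelihoods.
    gu = []
    for j in range(half):
        hl = (1 + g[j]) / (1 - g[j])
        hr = (1 + g[j + half]) / (1 - g[j + half])
        h = hl * (hr if v[j] == 1 else 1 / hr)
        gu.append((h - 1) / (h + 1))
    u = sc_decode(gu, prefix + (0,), T)
    # Plotkin recombination c = (u, uv) in the +/-1 alphabet.
    return u + [a * b for a, b in zip(u, v)]
```

For T = ∅ (all paths frozen), the sketch returns the all-one codeword, matching the transmission model used in the next section.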
The above algorithm can be extended to SC list decoding, which tracks the L most probable code candidates throughout the process and has complexity of order Ln log n. Simulation results of [2]-[4] show that the optimized bit-frozen subcodes substantially outperform the original RM codes in SC list decoding and require much smaller lists. SC list decoding can also be combined with precoding techniques, which can further reduce the output BERs, as shown in [8].
IV. RANDOM VARIABLES AND THEIR TRANSFORMATIONS IN SC DECODING
Consider a code C(m, T) = C(m, ξ) defined by a single path ξ = (a_1, ..., a_m) and let it be used over a discrete memoryless symmetric (DMS) channel W. We now consider a codeword 1^n transmitted over this path and assume that all preceding (frozen) paths give the correct outputs ṽ^(ξ) = 1 in the recursive recalculations (6) and (7). Then for every prefix ξ = (a_1, ..., a_ℓ), we can simplify recalculations (7) as follows:

h_j^(ξ,0) = h_{jℓ}^(ξ) h_{jr}^(ξ)    (8)
Recalculations (6) and (8) essentially form a new DMS channel W^(ξ): X → Y^(ξ) that outputs a random variable (rv) g^(ξ) or h^(ξ), starting from the original rvs g_j or h_j. Following [5], [9], we consider the compound channel W^(ξ) as an ensemble of some number k of binary symmetric channels W^(ξ)(t) = BSC(β_t, ε_t) that have transition error probabilities p_t = (1 − ε_t)/2 and occur with some probability distribution {β_t}, where Σ_{t=1}^k β_t = 1. We use the notation

W^(ξ) = ∪_{t=1}^k BSC(β_t, ε_t)

Here the new parameters k, ε_t and β_t depend on the specific path ξ. We will use the expectation of the offsets ε_t over the distribution {β_t}:

G^(ξ) = E(ε_t) = Σ_{t=1}^k β_t ε_t    (9)
Recall from (3) that for any BSC(β_t, ε_t), the outputs y give the offsets g(y) = yε_t. Also, recalculations (6) use the products g_{jℓ}^(ξ) g_{jr}^(ξ) of independent rvs in all steps ℓ. Thus, any degrading channel W^(ξ,1): X → Y^(ξ,1) can be considered as the ensemble of the new BSC channels

W^(ξ,1) = ∪_{t,s} BSC(β_{t,s}, ε_t ε_s)    (10)

where t, s = 1, ..., k and β_{t,s} = β_t β_s. Then

G^(ξ,1) = Σ_{t,s} β_{t,s} ε_t ε_s = [G^(ξ)]²    (11)
Next, consider the Bhattacharyya parameter [5]

Z^(ξ) = Σ_{y∈Y^(ξ)} [W^(ξ)(y|0) W^(ξ)(y|1)]^{1/2}

For the BSC(β_t, ε_t), we obtain the Bhattacharyya parameters

z_t = [(1 + ε_t)/(1 − ε_t)]^{1/2} (1 − ε_t)/2 + [(1 − ε_t)/(1 + ε_t)]^{1/2} (1 + ε_t)/2 = (1 − ε_t²)^{1/2}    (12)

Thus, the compound channel W^(ξ) gives

Z^(ξ) = E(z_t) = Σ_t β_t (1 − ε_t²)^{1/2}    (13)
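Formula (12) is easy to verify numerically (a quick check of ours):

```python
import math

def bhattacharyya_bsc(eps):
    # Direct evaluation of the sum over the two BSC outputs y = +/-1:
    # sqrt(W(y|0) W(y|1)) summed over y, with p = (1 - eps) / 2.
    p = (1 - eps) / 2
    return 2 * math.sqrt(p * (1 - p))

eps = 0.6
print(bhattacharyya_bsc(eps), math.sqrt(1 - eps ** 2))  # both equal 0.8
```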
For a BSC(β_t, ε_t) with z_t = (1 − ε_t²)^{1/2}, we will also use the alternative notation BSC(β_t ∘ z_t). Similarly to (10), it is easy to verify that the upgrading channel W^(ξ,0): X → Y^(ξ,0) forms the ensemble

W^(ξ,0) = ∪_{t,s} BSC(β_{t,s} ∘ z_t z_s)    (14)

This gives an important Arikan identity [5], [12]:

Z^(ξ,0) = Σ_{t,s} β_{t,s} z_t z_s = [Z^(ξ)]²    (15)
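Both squaring identities (11) and (15) can be checked numerically on any compound channel; in the sketch below (ours), the three-component ensemble is an arbitrary example:

```python
import math

# A compound channel as a list of pairs (beta_t, eps_t) with sum(beta_t) = 1.
channel = [(0.2, 0.9), (0.5, 0.5), (0.3, 0.1)]

G = sum(b * e for b, e in channel)                      # definition (9)
Z = sum(b * math.sqrt(1 - e * e) for b, e in channel)   # definition (13)

# Degraded ensemble (10): BSC(beta_t beta_s, eps_t eps_s).
G1 = sum(bt * bs * et * es for bt, et in channel for bs, es in channel)
# Upgraded ensemble (14): z-parameters z_t z_s with weights beta_t beta_s.
Z0 = sum(bt * bs * math.sqrt((1 - et * et) * (1 - es * es))
         for bt, et in channel for bs, es in channel)

print(abs(G1 - G ** 2) < 1e-12, abs(Z0 - Z ** 2) < 1e-12)  # True True
```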
We will now relate the parameters G^(ξ) and Z^(ξ).

Lemma 1: For any channel W^(ξ),

1 − G^(ξ) ≤ Z^(ξ) ≤ (1 − [G^(ξ)]²)^{1/2}    (16)

Proof. Note that (1 − x²)^{1/2} is a concave function and that (1 − x²)^{1/2} ≥ 1 − x for any x ∈ [0, 1]. Then the lower bound in (16) follows from the definitions (9) and (13). We also apply the Jensen inequality to (13) to obtain the upper bound.
Consider an ensemble of 2^ℓ equiprobable paths ξ = (a_1, ..., a_ℓ). Our main goal is to prove that for ℓ → ∞, most paths ξ (with the exception of a vanishing fraction) achieve polarization, so that

(G^(ξ), Z^(ξ)) → (0, 1)  or  (G^(ξ), Z^(ξ)) → (1, 0)    (17)

To prove this, we introduce a single function

U^(ξ) = G^(ξ) Z^(ξ)    (18)

Lemma 2: For any channel W^(ξ), the asymptotic equalities (17) hold for ℓ → ∞ if and only if U^(ξ) → 0.

Proof. The "only if" part follows from the definition (18). The "if" part follows from (16). Indeed, G^(ξ) + Z^(ξ) ≥ 1 and G^(ξ), Z^(ξ) ≤ 1. One of these two quantities tends to 0 if U^(ξ) → 0. Then we obtain the asymptotic equality G^(ξ) + Z^(ξ) → 1. This gives (17).

V. POLARIZATION PARAMETERS IN POLAR COORDINATES

Given any DMS channel W^(ξ) = ∪_{t=1}^k BSC(β_t, ε_t), we now define the angular parameters θ_t and their mean θ = θ^(ξ):

θ_t = arccos ε_t = arcsin z_t,   θ = E(θ_t) = Σ_t β_t θ_t    (19)

We have the following important lemmas.

Lemma 3: For any compound channel W^(ξ), the parameters G^(ξ) and Z^(ξ) satisfy the relations

G^(ξ) = Σ_t β_t cos θ_t ≤ cos θ
Z^(ξ) = Σ_t β_t sin θ_t ≤ sin θ    (20)

Proof. We first rewrite equalities (9) and (13) in the angular form (19) using the parameters θ_t. Then

G^(ξ) = E(ε_t) = Σ_t β_t cos θ_t ≤ cos(Σ_t β_t θ_t) = cos θ
Z^(ξ) = E(z_t) = Σ_t β_t sin θ_t ≤ sin(Σ_t β_t θ_t) = sin θ

Here we apply the Jensen inequality to the concave functions sin x and cos x with 0 ≤ x ≤ π/2.

Lemma 4: For the channels W^(ξ,1) and W^(ξ,0),

U^(ξ,1) ≤ cos²θ (1 − cos⁴θ)^{1/2}    (21)
U^(ξ,0) ≤ sin²θ (1 − sin⁴θ)^{1/2}    (22)

Proof. Consider the channel W^(ξ,1) defined in (10). According to (13) and (19),

U^(ξ,1) = [Σ_{t,s} β_{t,s} ε_t ε_s] [Σ_{i,j} β_{i,j} (1 − ε_i² ε_j²)^{1/2}]

Here all indices i, j, t, s run from 1 to k. Note that

Σ_{t,s} β_{t,s} ε_t ε_s = E(ε_t ε_s) = E²(ε_t) ≤ cos²θ.

Also, (1 − ε_i² ε_j²)^{1/2} is a concave function of the variable x = ε_i ε_j. Then

Σ_{i,j} β_{i,j} (1 − ε_i² ε_j²)^{1/2} ≤ [1 − [E(ε_i ε_j)]²]^{1/2} = [1 − E⁴(ε_i)]^{1/2}

This proves (21). Similarly,

U^(ξ,0) = [Σ_{t,s} β_{t,s} z_t z_s] [Σ_{i,j} β_{i,j} (1 − z_i² z_j²)^{1/2}]

Then we obtain (22) by repeating the previous case.

VI. PROOF OF POLARIZATION PROPERTY

The following theorem proves the polarization property and also solves Problem A of Section I.

Theorem 1: Paths ξ = (a_1, ..., a_ℓ) of any length ℓ satisfy the inequality

E U^(ξ) ≤ (0.87)^ℓ    (23)

Corollary 1: Most paths ξ = (a_1, ..., a_ℓ), except a fraction (0.87)^{ℓ/2} of them, satisfy the inequality U^(ξ) < (0.87)^{ℓ/2} and yield the polarization property U^(ξ) → 0 as ℓ → ∞.

Proof. Consider the ensemble N of equiprobable paths ξ = (a_1, ..., a_ℓ). For each ξ, the bit a_{ℓ+1} takes values 0 and 1 equally likely. Then the quantity U^(ξ,a_{ℓ+1}) has the mean value

E U^(ξ,a_{ℓ+1}) = [U^(ξ,0) + U^(ξ,1)]/2

For every ξ, we can now consider the angle θ = θ^(ξ) and a random angle Θ ∈ [0, π/2] that equally likely takes the two complementary angles θ^(0) and θ^(1), such that

Θ = θ^(0) = arcsin(sin²θ^(ξ))  if a_{ℓ+1} = 0
Θ = θ^(1) = arccos(cos²θ^(ξ))  if a_{ℓ+1} = 1
This is the setting of Problem A of Section I. Now the upper bounds (20), (21), and (22) give the inequalities

U^(ξ) ≤ r(θ),   E U^(ξ,a_{ℓ+1}) ≤ r(Θ)

where

r(θ) = sin θ cos θ,
r(Θ) = E(sin Θ cos Θ) = [sin²θ (1 − sin⁴θ)^{1/2} + cos²θ (1 − cos⁴θ)^{1/2}]/2

Next, note that r(Θ)/r(θ) < 0.87 for any θ ∈ [0, π/2], with the maximum at θ = π/4, as seen in Fig. 4. In turn, this implies that for the paths ξ of length ℓ, the function r(θ^(ξ)) has the expected value

E r(θ^(ξ)) < (0.87)^ℓ r(π/4),

which completes the proof.

Fig. 4. The ratio r(Θ)/r(θ) of the functions r(θ) = sin θ cos θ and r(Θ) = E(sin Θ cos Θ).
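The constant 0.87 is easy to reproduce numerically (a sketch of ours; the grid size is arbitrary):

```python
import math

def ratio(theta):
    # r(Theta) / r(theta) for r(theta) = sin(theta) * cos(theta) and
    # r(Theta) averaged over the two complementary angles of Problem A.
    s2, c2 = math.sin(theta) ** 2, math.cos(theta) ** 2
    r_big = (s2 * math.sqrt(1 - s2 ** 2) + c2 * math.sqrt(1 - c2 ** 2)) / 2
    return r_big / (math.sin(theta) * math.cos(theta))

grid = [ratio(k * (math.pi / 2) / 10000) for k in range(1, 10000)]
print(max(grid))  # about 0.866 (= sqrt(3)/2), attained near theta = pi/4
```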
Remarks. A slightly stronger version of Corollary 1 shows that almost all paths (except a vanishing fraction) satisfy the inequality U^(ξ) ≤ c^{ℓ/2}, where we can take any c > 1/2 as ℓ → ∞. Indeed, it is easy to verify that r(Θ)/r(θ) → 2^{−1/2} as r(θ) → 0, which in turn holds for almost all paths ξ of growing length ℓ. More precise arguments, which use the concave functions r^λ(θ) of a vanishing degree λ > 0, show that E U^(ξ) has order below c^ℓ, where c > 1/2. Finally, the above technique can be used to obtain fast polarization of the order

log₂ U^(ξ) < −2^{ℓ/2 − f(ℓ)}

where f(ℓ) is any function such that f(ℓ) ℓ^{−1/2} → ∞. However, the proof of this fact is more involved and does not simplify the similar results of the papers [13]-[15].
Acknowledgment. The author thanks E. Arikan and I. Tal
for helpful comments.
REFERENCES

[1] I. Dumer, "Recursive decoding of Reed-Muller codes," Proc. 37th Allerton Conf. on Commun., Control, and Computing, Monticello, IL, USA, 1999, pp. 61-69 (http://arxiv.org/abs/1703.05303).
[2] I. Dumer and K. Shabunov, "Recursive constructions and their maximum likelihood decoding," Proc. 38th Allerton Conf. on Commun., Control, and Computing, Monticello, IL, USA, 2000, pp. 71-80 (http://arxiv.org/abs/1703.05302).
[3] I. Dumer and K. Shabunov, "Near-optimum decoding for subcodes of Reed-Muller codes," Proc. 2001 IEEE Intern. Symp. Info. Theory, Washington, DC, USA, June 24-29, 2001, p. 329.
[4] I. Dumer and K. Shabunov, "Soft-decision decoding of Reed-Muller codes: recursive lists," IEEE Trans. Info. Theory, vol. 52, no. 3, pp. 1260-1266, 2006.
[5] E. Arikan, "Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Info. Theory, vol. 55, no. 6, pp. 3051-3073, 2009.
[6] V. Guruswami and P. Xia, "Polar codes: speed of polarization and polynomial gap to capacity," IEEE Trans. Info. Theory, vol. 61, no. 1, pp. 3-16, 2015.
[7] M. Alsan and E. Telatar, "A simple proof of polarization and polarization for non-stationary memoryless channels," IEEE Trans. Info. Theory, vol. 62, no. 9, pp. 4873-4878, 2016.
[8] I. Tal and A. Vardy, "List decoding of polar codes," IEEE Trans. Info. Theory, vol. 61, no. 5, pp. 2213-2226, 2015.
[9] S. B. Korada, "Polar codes for channel and source coding," Ph.D. thesis, Ecole Polytechnique Federale de Lausanne, 2009.
[10] E. Arikan, private communication, March 2017.
[11] J. Duchi, "Lecture Notes for Statistics 311/Electrical Engineering 377," Stanford University, 2016, https://stanford.edu/class/stats311/Lectures/full_notes.pdf
[12] T. S. Jayram and E. Arikan, "A note on some inequalities used in channel polarization and polar coding," to appear in IEEE Trans. Info. Theory, 2017.
[13] E. Arıkan and E. Telatar, "On the rate of channel polarization," Proc. IEEE Intern. Symp. Info. Theory (ISIT 2009), Seoul, South Korea, 2009, pp. 1493-1495.
[14] S. B. Korada, E. Sasoglu, and R. Urbanke, "Polar codes: characterization of exponent, bounds, and constructions," IEEE Trans. Info. Theory, vol. 56, no. 12, pp. 6253-6264, 2010.
[15] I. Tal, "A simple proof of fast polarization," https://arxiv.org/abs/1704.07179, April 2017.