Information Theory
ETH Zürich, Dept. of Computer Science
Spring Semester 2016

Exercise 7 (Solution)
Handed out: 25 May 2016
7.1  Minimum distance decoding
We consider a binary code $C = \{\vec x_1, \vec x_2, \dots, \vec x_M\}$ where $\vec x_i \in \{0,1\}^n$. Suppose that we are communicating over a binary symmetric channel with crossover probability $\varepsilon$ (denoted by BSC($\varepsilon$)), and that $\varepsilon < \frac{1}{2}$. That is, we have a message set $W = \{1, \dots, M\}$ with $\Pr(W = i) = \frac{1}{M}$. Once the $i$-th message is chosen, its corresponding codeword $\vec x_i$ is sent through the BSC($\varepsilon$), and the channel outputs a sequence $\vec y \triangleq (y_1, \dots, y_n)$. In this exercise, we want to design the optimal decoder for this setting.
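For concreteness, here is a minimal Python sketch of the channel model (the function name and all parameters are our own choices, not part of the exercise): each bit of the codeword is flipped independently with probability $\varepsilon$.

    import random

    def bsc(x, eps, rng=random):
        # BSC(eps): flip each bit of x independently with probability eps.
        return [bit ^ (rng.random() < eps) for bit in x]

    # Example: send an 8-bit codeword through BSC(0.1).
    x = [0, 1, 1, 0, 1, 0, 0, 1]
    y = bsc(x, 0.1)
    print(x, "->", y)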
a) Design a decoder $\hat W$ which has the minimum possible probability of error. (Hint: recall the definition of the MAP decoder introduced in Exercise 6.3.)
Solution:
The MAP decoder is defined as
$$\hat W_{\mathrm{MAP}}(\vec y) = \arg\max_{i \in \{1,\dots,M\}} p(\vec x_i \mid \vec y).$$
From Exercise 6.3 (b) we know that this decoder has the minimal possible error probability.
b) Show that the decoder that you designed in part (a) is equivalent to the following decoder:
$$\hat W_{\mathrm{ML}}(\vec y) = \arg\max_{i \in \{1,\dots,M\}} p(\vec y \mid i) = \arg\max_{i \in \{1,\dots,M\}} p(\vec y \mid \vec x_i),$$
where the subscript “ML” is short for “maximum likelihood”, and $p(\vec y \mid i)$ is short notation for $\Pr(\vec y \text{ is received} \mid i \text{ is chosen})$. This decoding rule is called maximum-likelihood decoding.
Solution:
By Bayes' rule, we have
$$p(\vec y \mid \vec x_i) = \frac{p(\vec x_i \mid \vec y)\, p(\vec y)}{p(\vec x_i)}.$$
For every $i \in \{1, \dots, M\}$, we know that $p(\vec x_i) = \frac{1}{M}$, and $p(\vec y)$ does not depend on the choice of $i$. Therefore the codeword $\vec x_i$ that maximizes $p(\vec x_i \mid \vec y)$ also maximizes $p(\vec y \mid \vec x_i)$. Formally,
$$\arg\max_{i \in \{1,\dots,M\}} p(\vec y \mid \vec x_i) = \arg\max_{i \in \{1,\dots,M\}} M p(\vec y)\, p(\vec x_i \mid \vec y) = \arg\max_{i \in \{1,\dots,M\}} p(\vec x_i \mid \vec y).$$
Hence the ML decoder and the MAP decoder are equivalent.
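As a sanity check of this equivalence, the following Python sketch enumerates all outputs $\vec y$ for a small made-up code (the toy code and all names below are our own, not part of the exercise) and verifies that the posterior and the likelihood are maximized by the same index:

    from itertools import product

    eps = 0.1
    code = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 1, 0)]  # toy code, M = 3
    M, n = len(code), 4

    def likelihood(y, x):
        # p(y | x) over BSC(eps): eps^d * (1-eps)^(n-d), derived in part (c) below.
        d = sum(a != b for a, b in zip(x, y))
        return eps ** d * (1 - eps) ** (n - d)

    for y in product((0, 1), repeat=n):
        lik = [likelihood(y, x) for x in code]   # p(y | x_i)
        p_y = sum(l / M for l in lik)            # p(y), uniform prior
        post = [l / M / p_y for l in lik]        # p(x_i | y)
        assert post.index(max(post)) == lik.index(max(lik))
    print("MAP and ML pick the same codeword for every y")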
c) Compute $p(\vec y \mid \vec x_i)$ in terms of $\varepsilon$ and $d_H(\vec y, \vec x_i)$.
Solution:
Each bit of the output vector $\vec y$ independently differs from the corresponding bit of the input vector $\vec x_i$ with probability $\varepsilon$. Therefore, the likelihood term can be written as
$$p(\vec y \mid \vec x_i) = \varepsilon^{d_H(\vec x_i, \vec y)} (1 - \varepsilon)^{n - d_H(\vec x_i, \vec y)}.$$
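For example, with $n = 4$, $\varepsilon = 0.1$, and $d_H(\vec x_i, \vec y) = 1$, the likelihood is $0.1^{1} \cdot 0.9^{3} = 0.0729$.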
d) Show that the decoder you designed in part (a) is equivalent to the following decoder:
$$\hat W_{\mathrm{MD}}(\vec y) = \arg\min_{i \in \{1,\dots,M\}} d_H(\vec y, \vec x_i),$$
where $d_H(\vec y, \vec x_i)$ denotes the Hamming distance between $\vec y$ and $\vec x_i$. This decoder is called the minimum-distance decoder. It is optimal when the transmission is over a BSC. Can you intuitively justify what this decoder is doing?
Solution:
From part (c), we have
$$\hat W_{\mathrm{ML}}(\vec y) = \arg\max_{i \in \{1,\dots,M\}} p(\vec y \mid \vec x_i) = \arg\max_{i \in \{1,\dots,M\}} \varepsilon^{d_H(\vec x_i, \vec y)} (1 - \varepsilon)^{n - d_H(\vec x_i, \vec y)} = \arg\max_{i \in \{1,\dots,M\}} \left(\frac{\varepsilon}{1 - \varepsilon}\right)^{d_H(\vec x_i, \vec y)},$$
where the constant factor $(1 - \varepsilon)^n$ was dropped since it does not depend on $i$. Since $\varepsilon < \frac{1}{2}$, the base $\frac{\varepsilon}{1 - \varepsilon}$ is smaller than $1$, so the maximum is attained at the smallest exponent. We conclude that $\hat W_{\mathrm{ML}}(\vec y) = \arg\min_{i \in \{1,\dots,M\}} d_H(\vec y, \vec x_i) = \hat W_{\mathrm{MD}}(\vec y)$.
For a given output $\vec y$, this decoder chooses the codeword $\vec x_i$ with minimum distance to $\vec y$. Intuitively, this is the “closest” match to the output: decoding to it assumes the fewest bit flips on the channel, and since $\varepsilon < \frac{1}{2}$, fewer flips are always more likely than more flips.
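As a minimal implementation sketch of the minimum-distance decoder (the 5-bit repetition code below is our own toy example):

    def hamming(x, y):
        # Hamming distance between two equal-length bit sequences.
        return sum(a != b for a, b in zip(x, y))

    def decode_md(y, code):
        # Minimum-distance decoding: index of the codeword closest to y
        # in Hamming distance (ties broken towards the lowest index).
        return min(range(len(code)), key=lambda i: hamming(y, code[i]))

    code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]  # 5-bit repetition code
    print(decode_md((0, 1, 0, 0, 1), code))     # -> 0 (two flips vs. three)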
7.2  Error probability and its relation to distances

Consider a binary code $C = \{\vec x_1, \vec x_2, \dots, \vec x_M\}$ where $\vec x_i \in \{0,1\}^n$.
a) Consider transmission over BSC($\varepsilon$), and denote the output of the channel by $\vec y \triangleq (y_1, \dots, y_n)$. Assume codeword $\vec x_i$ is sent through the channel. For any $j \in \{1, \dots, M\}$ such that $j \neq i$, show that
$$\Pr\left[d_H(\vec y, \vec x_j) < d_H(\vec y, \vec x_i)\right] \geq (\varepsilon(1 - \varepsilon))^{\frac{d_H(\vec x_i, \vec x_j) + 1}{2}}.$$
Solution:
Fix any $j \neq i$ and write $d \triangleq d_H(\vec x_i, \vec x_j)$. The two codewords differ in exactly $d$ positions, and the remaining $n - d$ positions contribute equally to $d_H(\vec y, \vec x_i)$ and $d_H(\vec y, \vec x_j)$, so the event $d_H(\vec y, \vec x_j) < d_H(\vec y, \vec x_i)$ occurs when the channel flips at least $\lfloor\frac{d+1}{2}\rfloor$ of those $d$ positions. Therefore,
$$\Pr\left(d_H(\vec y, \vec x_j) < d_H(\vec y, \vec x_i)\right) = \sum_{k=\lfloor\frac{d+1}{2}\rfloor}^{d} \binom{d}{k} \varepsilon^{k} (1 - \varepsilon)^{d-k} \geq \varepsilon^{\lfloor\frac{d+1}{2}\rfloor} (1 - \varepsilon)^{d - \lfloor\frac{d+1}{2}\rfloor} = \varepsilon^{\lfloor\frac{d+1}{2}\rfloor} (1 - \varepsilon)^{\lceil\frac{d-1}{2}\rceil} \geq (\varepsilon(1 - \varepsilon))^{\frac{d+1}{2}},$$
where the first inequality keeps only the $k = \lfloor\frac{d+1}{2}\rfloor$ term of the sum (and uses $\binom{d}{k} \geq 1$), and the last inequality holds because $\lfloor\frac{d+1}{2}\rfloor \leq \frac{d+1}{2}$, $\lceil\frac{d-1}{2}\rceil \leq \frac{d+1}{2}$, and both $\varepsilon$ and $1 - \varepsilon$ are at most $1$.
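A quick Monte Carlo sanity check of this bound in Python (all parameters below are arbitrary choices of ours; $d$ is taken odd to avoid ties between the two distances):

    import random

    n, d, eps, trials = 8, 5, 0.2, 200_000
    rng = random.Random(0)
    xi = [0] * n
    xj = [1] * d + [0] * (n - d)  # d_H(xi, xj) = d
    hits = 0
    for _ in range(trials):
        y = [b ^ (rng.random() < eps) for b in xi]  # send xi over BSC(eps)
        d_i = sum(a != b for a, b in zip(y, xi))
        d_j = sum(a != b for a, b in zip(y, xj))
        hits += d_j < d_i
    print("empirical:", hits / trials)                      # about 0.058
    print("bound:   ", (eps * (1 - eps)) ** ((d + 1) / 2))  # 0.004096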
b) We define for $i \in \{1, \dots, M\}$ the integer $d_i$ as follows:
$$d_i = \min_{\substack{j \in \{1,\dots,M\} \\ j \neq i}} d_H(\vec x_i, \vec x_j).$$
In other words, $d_i$ is the minimum distance from $\vec x_i$ to the other codewords. Show that for any decoder $\hat W$, we have
$$P_{e,i}^{n} = \Pr(\hat W \neq i \mid \vec x_i \text{ is sent}) \geq (\varepsilon(1 - \varepsilon))^{\frac{d_i + 1}{2}}.$$
Solution:
Let $j \neq i$ achieve the minimum in the definition of $d_i$, so that $d_H(\vec x_i, \vec x_j) = d_i$. When codeword $\vec x_i$ is sent and the output is strictly closer to $\vec x_j$ than to $\vec x_i$, the minimum-distance decoder $\hat W_{\mathrm{MD}}$ cannot declare $i$. Therefore, by part (a),
$$\Pr(\hat W_{\mathrm{MD}} \neq i \mid \vec x_i \text{ is sent}) \geq \Pr\left(d_H(\vec y, \vec x_j) < d_H(\vec y, \vec x_i)\right) \geq (\varepsilon(1 - \varepsilon))^{\frac{d_i + 1}{2}}.$$
From the previous exercise we know that the minimum-distance decoder is equivalent to the MAP decoder, which achieves the minimal probability of error. That means that for any decoder, $P_{e,i}^{n} = \Pr(\hat W \neq i \mid \vec x_i \text{ is sent})$ is lower-bounded by $\Pr(\hat W_{\mathrm{MD}} \neq i \mid \vec x_i \text{ is sent})$, and therefore we have
$$P_{e,i}^{n} \geq \Pr(\hat W_{\mathrm{MD}} \neq i \mid \vec x_i \text{ is sent}) \geq (\varepsilon(1 - \varepsilon))^{\frac{d_i + 1}{2}}.$$
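Continuing the sketch from part (a), the same kind of simulation can check this per-codeword bound against an actual minimum-distance decoder (the toy code and parameters are again our own, for illustration only):

    import random

    def hamming(x, y):
        return sum(a != b for a, b in zip(x, y))

    code = [(0, 0, 0, 0, 0, 0, 0),
            (1, 1, 1, 1, 1, 0, 0),
            (0, 0, 1, 1, 1, 1, 1)]  # toy code; code[0] is at distance 5 from both others
    i, eps, trials = 0, 0.2, 200_000
    rng = random.Random(1)
    d_i = min(hamming(code[i], c) for k, c in enumerate(code) if k != i)  # d_i = 5
    errors = 0
    for _ in range(trials):
        y = [b ^ (rng.random() < eps) for b in code[i]]  # send x_i over BSC(eps)
        decoded = min(range(len(code)), key=lambda k: hamming(y, code[k]))
        errors += decoded != i
    print("empirical P_e,i:", errors / trials)
    print("lower bound:    ", (eps * (1 - eps)) ** ((d_i + 1) / 2))  # 0.004096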
c) Show that for any decoder $\hat W$,
$$P_{e,\mathrm{av}}^{n} \geq (\varepsilon(1 - \varepsilon))^{\frac{\bar d + 1}{2}},$$
where $\bar d = \frac{1}{M} \sum_{i=1}^{M} d_i$. (Hint: for numbers $x_1, \dots, x_M$ and any $\alpha > 0$, we always have $\frac{1}{M} \sum_{i=1}^{M} \alpha^{x_i} \geq \alpha^{\frac{1}{M}\sum_i x_i}$.)
Solution:
$$P_{e,\mathrm{av}}^{n} = \sum_{i=1}^{M} \underbrace{p(\vec x_i \text{ is sent})}_{p_i = \frac{1}{M}} \times P_{e,i}^{n} \geq \frac{1}{M} \sum_{i=1}^{M} (\varepsilon(1 - \varepsilon))^{\frac{d_i + 1}{2}},$$
using the bound from part (b). By the inequality $\frac{1}{M} \sum_{i=1}^{M} \alpha^{x_i} \geq \alpha^{\frac{1}{M}\sum_i x_i}$ from the hint (Jensen's inequality for the convex function $x \mapsto \alpha^x$), applied with $\alpha = \varepsilon(1 - \varepsilon)$ and $x_i = \frac{d_i + 1}{2}$, we have
$$P_{e,\mathrm{av}}^{n} \geq (\varepsilon(1 - \varepsilon))^{\frac{1}{M}\sum_i \frac{d_i + 1}{2}} = (\varepsilon(1 - \varepsilon))^{\frac{\bar d + 1}{2}},$$
which concludes the proof.
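Finally, a tiny numerical illustration of the hint inequality (the numbers below are made up): with $\alpha = \varepsilon(1 - \varepsilon)$ for $\varepsilon = 0.1$ and a few exponents $x_i = \frac{d_i + 1}{2}$,

    alpha = 0.09             # eps * (1 - eps) for eps = 0.1
    xs = [3.0, 2.0, 4.5]     # e.g. (d_i + 1) / 2 for d_i = 5, 3, 8
    M = len(xs)

    lhs = sum(alpha ** x for x in xs) / M    # (1/M) * sum_i alpha^{x_i}
    rhs = alpha ** (sum(xs) / M)             # alpha^{(1/M) * sum_i x_i}
    print(lhs, ">=", rhs, "->", lhs >= rhs)  # True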