Chapter 4: Continuous channel and its capacity

Vahid Meghdadi
[email protected]
Reference: Elements of Information Theory by Cover and Thomas
Outline
- Differential entropy
  - Continuous random variable
  - Gaussian multivariate random variable
- Capacity of Gaussian channel
  - AWGN
  - Band limited channel
  - Parallel channels
- Capacity of fading channel
  - Flat fading channel
  - Shannon capacity of fading channel
  - Capacity with outage
  - CSI known at TX
Continuous random variable
How is the entropy defined when X is a continuous RV?
For a discrete RV we used the probability mass function; here it is
replaced by the probability density function (PDF).
Definition
The random variable X is said to be continuous if its cumulative
distribution function F(x) = Pr(X ≤ x) is continuous.
Differential entropy
Definition
The differential entropy h(X) of a continuous random variable X
with a PDF P_X(x) is defined as

    h(X) = \int_S P_X(x) \log \frac{1}{P_X(x)} \, dx
         = E\left[ \log \frac{1}{P_X(x)} \right]        (1)

where S is the support set of the random variable.
Example: Uniform distribution
Show that for X ~ U(0, a) the differential entropy is log a.

[Figure: the PDF P_X(x) of U(0, a), constant at 1/a on the interval [0, a].]

Note: Unlike discrete entropy, the differential entropy can be
negative. However, 2^{h(X)} = 2^{\log a} = a is the volume of the
support set, which is always non-negative.
Note: A horizontal shift does not change the entropy.
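As a quick numerical sanity check, the sketch below (our own helper
diff_entropy, using SciPy's Simpson rule) integrates −P_X(x) log₂ P_X(x)
over the support and reproduces h(X) = log₂ a for the uniform density:

```python
import numpy as np
from scipy import integrate

def diff_entropy(pdf, lo, hi, n=100001):
    # Numerically integrate -p(x) log2 p(x) over the support [lo, hi].
    x = np.linspace(lo, hi, n)
    p = pdf(x)
    integrand = np.where(p > 0, -p * np.log2(np.where(p > 0, p, 1.0)), 0.0)
    return integrate.simpson(integrand, x=x)

a = 8.0  # arbitrary support length
h_uniform = diff_entropy(lambda x: np.full_like(x, 1.0 / a), 0.0, a)
print(h_uniform, np.log2(a))  # both ~ 3.0 bits
```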
Example: Normal and exponential distribution
Show that for X ~ N(0, σ²) the differential entropy is

    h(X) = \frac{1}{2} \log(2\pi e \sigma^2) bits

Show that for P_X(x) = \lambda e^{-\lambda x}, x ≥ 0, the differential
entropy is

    h(X) = \log \frac{e}{\lambda} bits

What is the entropy if P_X(x) = \frac{\lambda}{2} e^{-\lambda |x|}?
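These closed forms are easy to cross-check; a minimal sketch with
arbitrarily chosen parameters (scipy.stats reports differential entropy
in nats, so we convert to bits):

```python
import numpy as np
from scipy.stats import norm, expon

sigma, lam = 2.0, 0.5          # arbitrary example parameters
to_bits = 1.0 / np.log(2)      # nats -> bits

print(norm(scale=sigma).entropy() * to_bits,
      0.5 * np.log2(2 * np.pi * np.e * sigma**2))   # Gaussian: equal
print(expon(scale=1 / lam).entropy() * to_bits,
      np.log2(np.e / lam))                           # exponential: equal
```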
Exercise
Consider an additive Gaussian channel defined by Y = X + N with
X ~ N(0, P_X) and N ~ N(0, P_N). Because of the independence of
X and N, Y ~ N(0, P_X + P_N).
Using I(X; Y) = h(Y) − h(Y|X), show that

    I(X; Y) = \frac{1}{2} \log_2\left(1 + \frac{P_X}{P_N}\right)

Hint: You can use the fact that h(Y|X) = h(N) (why?).
This is in fact the capacity of the noisy continuous channel.
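A numeric sketch of this identity (with arbitrary example powers),
using the Gaussian differential entropy formula from the previous slide:

```python
import numpy as np

PX, PN = 4.0, 1.0  # arbitrary example powers
hY = 0.5 * np.log2(2 * np.pi * np.e * (PX + PN))  # h(Y), Y ~ N(0, PX + PN)
hN = 0.5 * np.log2(2 * np.pi * np.e * PN)         # h(N) = h(Y|X)
print(hY - hN, 0.5 * np.log2(1 + PX / PN))        # both ~ 1.161 bits
```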
Gaussian random vector
Suppose that the vector X is defined as

    X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}

where X_1 and X_2 are i.i.d. N(0, 1). What is the entropy of X?

    h(X) = h(X_1, X_2) = h(X_1) + h(X_2 | X_1) = h(X_1) + h(X_2)

Therefore

    h(X) = \frac{1}{2} \log(2\pi e)^2

And for a vector of dimension n:

    h(X) = \frac{1}{2} \log(2\pi e)^n
Some properties
1. Chain rule: h(X, Y) = h(X|Y) + h(Y)
2. h(X + c) = h(X) for any constant c
3. h(cX) = h(X) + log |c| (note that in the discrete case,
   H(cX) = H(X))
4. Let X be a random vector and Y = AX, where A is a square
   non-singular matrix. Then h(Y) = h(X) + log |det A|.
5. Suppose X is a random vector with E(X) = 0 and
   E(XX^T) = K; then h(X) ≤ \frac{1}{2} \log((2\pi e)^n |K|), with
   equality if and only if X is Gaussian, X ~ N(0, K).
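Properties 2 and 3 are easy to check numerically; a minimal sketch
using scipy.stats (which returns differential entropy in nats), with
arbitrary constants:

```python
import numpy as np
from scipy.stats import norm

c, sigma = 3.0, 1.5  # arbitrary scale factor and standard deviation
h_X = norm(scale=sigma).entropy()               # h(X) in nats
h_cX = norm(scale=c * sigma).entropy()          # h(cX): scaling a Gaussian
h_shift = norm(loc=7.0, scale=sigma).entropy()  # h(X + c): a shift
print(float(h_cX), float(h_X + np.log(c)))      # property 3: equal
print(float(h_shift), float(h_X))               # property 2: equal
```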
Shannon capacity
In the early 1940s it was thought impossible to send information at
a positive rate with a negligible probability of error. Shannon
showed (1948) that:
- For every channel there exists a maximum information
  transmission rate below which the error probability can be made
  arbitrarily small.
- If the entropy of the source is less than the channel capacity,
  asymptotically error-free communication can be achieved.
- To obtain error-free communication, a coding scheme must be
  used.
- Shannon did not show the optimal coding.
- Today, the capacity predicted by Shannon can be approached to
  within a few tenths of a dB.
Additive white Gaussian noise channel
As we have seen, for an additive Gaussian noise channel
Y = X + Z the input-output mutual information is

    I(X; Y) = h(Y) − h(Y|X) = h(Y) − h(Z) = h(Y) − \frac{1}{2} \log 2\pi e N

To maximize the mutual information, one should maximize h(Y)
under the power constraint P_Y = P + N. The distribution
maximizing the entropy of a continuous random variable with fixed
power is Gaussian, which is obtained when X is Gaussian:

    C = \max_{p(x): E[X^2] \le P} I(X; Y) = \frac{1}{2} \log\left(1 + \frac{P}{N}\right)
Band-limited channels
Suppose we have a continuous channel with bandwidth B where the
noise power spectral density is N_0/2, so the analog noise power is
N_0 B. Suppose the channel is used over the time interval [0, T];
the analog signal power times T gives the total signal energy over
this period. By the Shannon sampling theorem there are 2B samples
per second, so the signal power per sample is PT/(2BT) = P/(2B).
The same argument applies to the noise, so the noise power per
sample is \frac{N_0 B T}{2BT} = N_0/2. The capacity of the Gaussian
channel per sample is therefore

    C = \frac{1}{2} \log\left(1 + \frac{P}{N_0 B}\right) bits per sample
Band-limited channel capacity
Since there are at most 2B independent samples per second, the
capacity can be written as

    C = B \log_2\left(1 + \frac{P}{N_0 B}\right) bits per second

Sometimes this equation is divided by B to obtain

    \frac{C}{B} = \log_2\left(1 + \frac{P}{N_0 B}\right) bits per second per Hz

This is the maximum achievable spectral efficiency over the AWGN
channel.
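A one-line numerical illustration of the formula (the bandwidth,
power, and noise density below are arbitrary example values):

```python
import numpy as np

B, P, N0 = 30e3, 1e-2, 1e-9    # Hz, W, W/Hz (example values)
C = B * np.log2(1 + P / (N0 * B))
print(C / 1e3, "kbps")          # ~251.6 kbps for these numbers
```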
Parallel independent Gaussian channels
Here we consider k independent Gaussian channels in parallel with
a common power constraint. The objective is to maximize the
capacity by optimally distributing the power among the channels:

    C = \max_{p_{X_1,...,X_k}: \sum_i E[X_i^2] \le P} I(X_1, ..., X_k; Y_1, ..., Y_k)
Parallel independent Gaussian channels
Using the independence of Z_1, ..., Z_k:

    C = I(X_1, ..., X_k; Y_1, ..., Y_k)
      = h(Y_1, ..., Y_k) − h(Y_1, ..., Y_k | X_1, ..., X_k)
      = h(Y_1, ..., Y_k) − h(Z_1, ..., Z_k)
      ≤ \sum_i [h(Y_i) − h(Z_i)]
      ≤ \sum_i \frac{1}{2} \log\left(1 + \frac{P_i}{N_i}\right)

If there is no common power constraint, it is clear that the total
capacity is the sum of the capacities of the individual channels.
Common power constraint
The question is: how should the power be distributed among the
sub-channels to maximize the capacity?
The capacity of the equivalent channel is

    C = \max_{P_1 + P_2 \le P_x} \left[ B_1 \log\left(1 + \frac{P_1 h_1^2}{N_0 B_1}\right) + B_2 \log\left(1 + \frac{P_2 h_2^2}{N_0 B_2}\right) \right]
Common power constraint
We maximize C subject to P_1 + P_2 ≤ P_x. Using a Lagrangian, one
can define

    L(P_1, P_2, \lambda) = B_1 \log\left(1 + \frac{P_1 h_1^2}{N_0 B_1}\right)
                         + B_2 \log\left(1 + \frac{P_2 h_2^2}{N_0 B_2}\right)
                         − \lambda (P_1 + P_2 − P_x)

Setting \partial L / \partial P_1 = 0 and \partial L / \partial P_2 = 0
(and using ln instead of log_2):

    \frac{B_1}{1 + \frac{P_1 h_1^2}{N_0 B_1}} \cdot \frac{h_1^2}{N_0 B_1} = \lambda
    \quad\Longrightarrow\quad
    \frac{P_1}{B_1 N_0} = \frac{1}{\lambda N_0} − \frac{1}{h_1^2}
With the same operations we obtain

    \frac{P_1}{B_1 N_0} = \text{Cst} − \frac{1}{h_1^2}, \qquad
    \frac{P_2}{B_2 N_0} = \text{Cst} − \frac{1}{h_2^2}

where the constant Cst is found by setting P_1 + P_2 = P_x. Once the
two powers are found, the capacity of the channel is easily
calculated. The only constraint to be respected is that P_1 and P_2
cannot be negative. If one of them comes out negative, the
corresponding power is set to zero and all the power is assigned to
the other channel. This principle is called water-filling.
Exercise
Use the same principle (water-filling) to find the power allocation
for a channel with three frequency bands defined as follows:
h_1 = 1/2, h_2 = 1/3 and h_3 = 1; B_1 = B, B_2 = 2B and B_3 = B;
N_0 B = 1; P_x = P_1 + P_2 + P_3 = 10.
Solution: P_1 = 3.5, P_2 = 0 and P_3 = 6.5. The sketch below checks
this numerically.
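A minimal water-filling sketch (the helper waterfill and its argument
names are ours): each band i receives P_i = w_i max(level − 1/h_i², 0)
with w_i = B_i N_0, and bands whose floor 1/h_i² sits above the water
level are dropped iteratively:

```python
import numpy as np

def waterfill(h2, w, P_total):
    """Water-fill P_total over parallel bands.
    h2: channel power gains h_i^2 (the floor of band i is 1/h_i^2),
    w:  weights B_i * N0,
    returns per-band powers P_i = w_i * max(level - 1/h_i^2, 0)."""
    h2, w = np.asarray(h2, float), np.asarray(w, float)
    active = np.ones(len(h2), dtype=bool)
    while True:
        # Water level solving sum_i w_i (level - 1/h2_i) = P_total on the active set
        level = (P_total + np.sum(w[active] / h2[active])) / np.sum(w[active])
        drop = active & (1.0 / h2 > level)
        if not drop.any():
            return np.where(active, w * np.maximum(level - 1.0 / h2, 0.0), 0.0)
        active &= ~drop

# Exercise data: h = [1/2, 1/3, 1] -> h^2 = [1/4, 1/9, 1]; B_i*N0 = [1, 2, 1]
print(waterfill([0.25, 1 / 9, 1.0], [1.0, 2.0, 1.0], 10.0))  # [3.5, 0., 6.5]
```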
Flat fading channel (frequency non-selective)
A non-LOS urban transmission generally results in many multipath
components: the received signal is the sum of many replicas of the
transmitted signal. Using the I and Q components of the received
signal:

    r(t) = \cos(2\pi f_c t) \sum_{i=0}^{I} a_i \cos(\phi_i)
         − \sin(2\pi f_c t) \sum_{i=0}^{I} a_i \sin(\phi_i) + n(t)

By the central limit theorem, A = \sum_{i=0}^{I} a_i \cos(\phi_i) and
B = \sum_{i=0}^{I} a_i \sin(\phi_i) are i.i.d. Gaussian random
variables.
The envelope of the received signal h = \sqrt{A^2 + B^2} is a
Rayleigh random variable with

    f_h(h) = \frac{h}{\sigma^2} \exp\left(\frac{-h^2}{2\sigma^2}\right), \quad h ≥ 0

with σ² the variance of A and B. The received power is then an
exponential RV with PDF

    f(p) = \frac{1}{2\sigma^2} \exp\left(\frac{-p}{2\sigma^2}\right), \quad p ≥ 0

Therefore, the received signal can be modeled as

    Y = hX + N
Shannon (ergodic) capacity when the RX knows the CSI
- The channel coefficient h is an i.i.d. random variable,
  independent of the signal and the noise.
- We assume that the receiver knows the channel coefficient but
  the transmitter does not.
- The capacity is C = \max_{p_x: E[X^2] \le P} I(X; Y, h).
- Using the chain rule (and the independence of X and h, so that
  I(X; h) = 0):

    I(X; Y, h) = I(X; h) + I(X; Y | h) = I(X; Y | h)
Conditioned on the fading coefficient h, the channel is transformed
into a simple AWGN channel with equivalent signal power |h|² P_X.
So we can write

    I(X; Y | h) = \frac{1}{2} \log\left(1 + \frac{|h|^2 P_X}{P_N}\right)

The ergodic capacity of the flat fading channel is then

    C = E_h\left[\frac{1}{2} \log\left(1 + \frac{P_X |h|^2}{P_N}\right)\right]

Note: Normally all signals are complex, being the baseband
equivalents of real signals. In this case the capacity is multiplied
by two, since the real and imaginary parts of the signals are
decorrelated.
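A Monte Carlo sketch of this expectation for Rayleigh fading (so the
SNR γ = P_X|h|²/P_N is exponential; the average SNR of 10 is an
arbitrary choice), using the complex-baseband form C = E[log₂(1 + γ)]
per unit bandwidth:

```python
import numpy as np

rng = np.random.default_rng(0)
snr_avg = 10.0                                          # arbitrary average SNR
gamma = rng.exponential(scale=snr_avg, size=1_000_000)  # Rayleigh fading SNR
C_fading = np.mean(np.log2(1 + gamma))                  # ergodic capacity, bits/s/Hz
C_awgn = np.log2(1 + snr_avg)                           # AWGN channel at same SNR
print(C_fading, C_awgn)  # fading lies below AWGN (Jensen's inequality)
```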
Example (Wireless Communications by Andrea Goldsmith)
Consider a wireless channel where the power falloff with distance
follows the formula P_r(d) = P_t (d_0/d)³ with d_0 = 10 m. Assume a
channel bandwidth of B = 30 kHz and AWGN with noise PSD N_0/2,
where N_0 = 10⁻⁹ W/Hz. For a transmit power of 1 W, find the
capacity of the channel at distances of 100 m and 1 km.
Solution: The received signal-to-noise ratio is
γ = P_r(d)/P_N = P_t (d_0/d)³/(N_0 B), i.e. γ ≈ 15 dB for d = 100 m
and ≈ −15 dB for d = 1 km. The capacity for complex transmission is
C = B log₂(1 + γ): 156.6 kbps for d = 100 m and 1.4 kbps for
d = 1000 m.
Example (Wireless Communications by Andrea Goldsmith)
Consider a flat fading channel with i.i.d. channel gain √h, which
can take on three possible values: 0.05 with probability 0.1, 0.5
with probability 0.5, and 1 with probability 0.4. The transmit power
is 10 mW, N_0 = 10⁻⁹ W/Hz, and the channel bandwidth is 30 kHz.
Assume the receiver knows the instantaneous value of h but the
transmitter does not. Find the Shannon capacity of this channel.
Solution: The channel has three possible received SNRs:
γ_1 = P_t h_1/(N_0 B) = 0.83, γ_2 = P_t h_2/(N_0 B) = 83.33, and
γ_3 = P_t h_3/(N_0 B) = 333.33. So the Shannon capacity is given by

    C = \sum_i B \log_2(1 + γ_i) p(γ_i) = 199.26 kbps

Note: The average SNR is 175, and the corresponding AWGN capacity
would be 223.8 kbps.
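The same computation as a short sketch (the gains h_i are the squares
of the listed √h values):

```python
import numpy as np

B, Pt, N0 = 30e3, 1e-2, 1e-9
h = np.array([0.05, 0.5, 1.0]) ** 2      # power gains h_i = (sqrt-gain)^2
p = np.array([0.1, 0.5, 0.4])
gamma = Pt * h / (N0 * B)                # [0.83, 83.33, 333.33]
print(B * np.sum(p * np.log2(1 + gamma)) / 1e3)   # ~199.26 kbps
print(B * np.log2(1 + np.sum(p * gamma)) / 1e3)   # ~223.8 kbps at the mean SNR
```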
Capacity with outage
- Shannon capacity defines the maximum data rate that can be sent
  over the channel with asymptotically small error probability.
- Since the TX does not know the channel, the transmitted rate is
  constant.
- When the channel is in a deep fade, the BER is not zero, because
  the TX cannot adapt its rate to the CSI.
- The capacity with outage is therefore defined as the maximum
  rate that can be achieved with some outage probability (the
  probability of deep fading).
- By tolerating some losses during deep fades, a higher data rate
  can be achieved.
Fixing the required rate C, a corresponding minimum SNR can be
calculated (assuming complex transmission):

    C = \log_2(1 + γ_{min})

If the TX sends data at this rate, an outage (nonzero BER) occurs
when γ < γ_min. Therefore the outage probability is
p_out = p(γ < γ_min).
The average rate correctly received at the RX is
C_O = (1 − p_out) B \log_2(1 + γ_{min}).
The value of γ_min is a design parameter based on the acceptable
outage probability. Normally one plots the normalized capacity
C/B = \log_2(1 + γ_{min}) as a function of p_out = p(γ < γ_min).
Example (Wireless Communications by Andrea Goldsmith)
Consider the same channel as in the last example, with B = 30 kHz
and p(γ = 0.83) = 0.1, p(γ = 83.33) = 0.5, and p(γ = 333.33) = 0.4.
Find the capacity versus outage and the average rate correctly
received for outage probabilities p_out < 0.1, p_out = 0.1, and
p_out = 0.6.
Solution: For p_out < 0.1 we must decode in all channel states, so
the rate must not exceed what the worst state supports:
γ_min = γ_1 = 0.83. The corresponding capacity is 26.23 kbps.
For 0.1 ≤ p_out < 0.6, we may decode incorrectly only when the
channel is in the weakest state γ = 0.83, so γ_min = γ_2 with a
corresponding capacity of 191.94 kbps.
For 0.6 ≤ p_out < 1, we may decode incorrectly when the received γ
is γ_1 or γ_2; thus γ_min = γ_3 with a corresponding capacity of
251.55 kbps.
Example (cont.)
For p_out < 0.1, data rates close to 26.23 kbps are always correctly
received.
For p_out = 0.1 we transmit at the rate 191.94 kbps but decode
correctly only when γ = γ_2 or γ_3, so the rate correctly received is
(1 − 0.1) × 191.94 = 172.75 kbps.
For p_out = 0.6 the rate correctly received is
(1 − 0.6) × 251.55 = 100.62 kbps.
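A short sketch tabulating the capacity versus outage and the
corresponding correctly received rate for the three candidate
thresholds:

```python
import numpy as np

B = 30e3
gamma = np.array([0.8333, 83.33, 333.33])
p = np.array([0.1, 0.5, 0.4])

for g_min in gamma:
    p_out = p[gamma < g_min].sum()   # outage: channel below the threshold
    C = B * np.log2(1 + g_min)       # transmitted rate at this threshold
    print(f"gamma_min={g_min:8.4f}  p_out={p_out:.1f}  "
          f"C={C / 1e3:7.2f} kbps  correct={(1 - p_out) * C / 1e3:7.2f} kbps")
```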
When the channel is known at the TX, no outage need occur, because
the TX can adapt its power to avoid it. With a fixed power, the
capacity is the same Shannon capacity as before:

    C = \int_0^\infty B \log_2(1 + γ) p(γ) dγ

Now we also add power adaptation, with the power constraint

    \int_0^\infty P(γ) p(γ) dγ \le \bar{P}

So the problem is how to distribute the available power as a
function of the SNR so as to maximize the rate while the average
power does not exceed a predefined value.
Water-filling
The capacity is then

    C = \max_{P(γ): \int P(γ) p(γ) dγ = \bar{P}}
        \int_0^\infty B \log_2\left(1 + \frac{P(γ) γ}{\bar{P}}\right) p(γ) dγ

Note that γ = \bar{P} |h|² / (N_0 B). This means that for each
channel level realization, a coding scheme is employed to adjust the
rate. To find the optimal power allocation P(γ) we form the
Lagrangian

    J(P(γ)) = \int_0^\infty B \log_2\left(1 + \frac{P(γ) γ}{\bar{P}}\right) p(γ) dγ
            − \lambda \int_0^\infty P(γ) p(γ) dγ
Water-filling
Setting the derivative with respect to P(γ) equal to zero and
solving for P(γ) with the constraint P(γ) ≥ 0:

    \frac{P(γ)}{\bar{P}} = \begin{cases} 1/γ_0 − 1/γ & γ \ge γ_0 \\ 0 & γ < γ_0 \end{cases}

This means that if γ is below a threshold γ_0, the channel is not
used. The capacity formula is then

    C = \int_{γ_0}^\infty B \log_2\left(\frac{γ}{γ_0}\right) p(γ) dγ
Water-filling
Therefore, the capacity can be achieved by adapting the rate as a
function of the SNR. Another strategy would be to fix the rate and
adapt only the power.
Note that γ_0 must be found numerically.
Substituting the optimal power allocation into the power constraint,
we obtain the following condition, which determines γ_0:

    \int_{γ_0}^\infty \left(\frac{1}{γ_0} − \frac{1}{γ}\right) p(γ) dγ = 1
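For a continuous fading distribution this is a one-dimensional
root-finding problem; a sketch for Rayleigh fading (exponential p(γ);
the average SNR of 10 is an arbitrary choice) using SciPy's brentq:

```python
import numpy as np
from scipy import integrate
from scipy.optimize import brentq

gbar = 10.0  # arbitrary average SNR; Rayleigh fading => p(g) = exp(-g/gbar)/gbar

def constraint(g0):
    # LHS minus RHS of: integral_{g0}^inf (1/g0 - 1/g) p(g) dg = 1
    f = lambda g: (1.0 / g0 - 1.0 / g) * np.exp(-g / gbar) / gbar
    return integrate.quad(f, g0, np.inf)[0] - 1.0

g0 = brentq(constraint, 1e-6, gbar)  # constraint is positive near 0+, negative at gbar
print(g0)
```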
Water-filling

[Figure: water-filling power allocation. The curve 1/γ forms the
vessel; the power P(γ)/\bar{P} fills it up to the water level 1/γ_0,
and no power is allocated where 1/γ exceeds this level (γ < γ_0).]

The figure above shows why this principle is called "water-filling".
Example
With the same example as before: p(γ_1 = 0.83) = 0.1,
p(γ_2 = 83.33) = 0.5, and p(γ_3 = 333.33) = 0.4. Find the ergodic
capacity of the channel with CSI at both TX and RX.
Solution: Since water-filling will be used, we must first find γ_0
satisfying

    \sum_{γ_i \ge γ_0} \left(\frac{1}{γ_0} − \frac{1}{γ_i}\right) p(γ_i) = 1
Example (cont.)
First, assume that all channel states are used. In the above
equation everything is then known except γ_0, which comes out to
0.884. Since this value exceeds γ_1 = 0.83, the first channel state
should not be used.
In the next iteration, the equation is solved over only the second
and third states, giving γ_0 = 0.893. This value is acceptable
because the weakest remaining channel is above the threshold. Using
these values, the channel capacity is calculated to be 200.82 kbps.
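The same iteration as a short sketch (using the exact SNR values; the
cutoff update simply solves the constraint above for 1/γ_0 over the
currently active states):

```python
import numpy as np

B = 30e3
gamma = np.array([0.8333, 83.33, 333.33])
p = np.array([0.1, 0.5, 0.4])

active = np.ones_like(gamma, dtype=bool)
while True:
    # Solve sum_{active} (1/g0 - 1/gamma_i) p_i = 1 for the cutoff g0
    g0 = p[active].sum() / (1.0 + np.sum(p[active] / gamma[active]))
    if (gamma[active] >= g0).all():
        break
    active &= gamma >= g0        # drop states below the cutoff and retry

C = B * np.sum(p[active] * np.log2(gamma[active] / g0))
print(g0, C / 1e3)  # ~0.89 and ~200.8 kbps
```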