Performance of Conjugate and Zero-Forcing

172
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013
Performance of Conjugate and Zero-Forcing
Beamforming in Large-Scale Antenna Systems
Hong Yang and Thomas L. Marzetta
Abstract—Large-Scale Antenna Systems (LSAS) is a form of
multi-user MIMO technology in which unprecedented numbers
of antennas serve a significantly smaller number of autonomous
terminals. We compare the two most prominent linear precoders, conjugate beamforming and zero-forcing, with respect
to net spectral-efficiency and radiated energy-efficiency in a
simplified single-cell scenario where propagation is governed by
independent Rayleigh fading, and where channel-state information (CSI) acquisition and data transmission are both performed
during a short coherence interval. An effective-noise analysis
of the pre-coded forward channel yields explicit lower bounds
on net capacity which account for CSI acquisition overhead
and errors as well as the sub-optimality of the pre-coders. In
turn the bounds generate trade-off curves between radiated
energy-efficiency and net spectral-efficiency. For high spectralefficiency and low energy-efficiency zero-forcing outperforms
conjugate beamforming, while at low spectral-efficiency and
high energy-efficiency the opposite holds. Surprisingly, in an
optimized system, the total LSAS-critical computational burden
of conjugate beamforming may be greater than that of zeroforcing. Conjugate beamforming may still be preferable to zeroforcing because of its greater robustness, and because conjugate
beamforming lends itself to a de-centralized architecture and
de-centralized signal processing.
Index Terms—Large-scale antenna system, capacity, energy
efficiency, spectral efficiency, spatial multiplexing, beamforming,
pre-coder, computational burden
I. I NTRODUCTION
L
ARGE-SCALE Antenna Systems (LSAS) bring huge
spectral-efficiency and radiated energy-efficiency gains
compared with 4G wireless technologies [4], [5], [6], [8],
[11]. On the forward link a critical operation is pre-coding, i.e.
the mapping of the message-bearing symbols intended for K
terminals into the M (typically M >> K) signals which the
service-antennas transmit. One advantage of LSAS is that the
preponderance of service-antennas over terminals can make
linear pre-coding perform nearly as well as nonlinear precoding as exemplified by sphere-encoded modulation [1], [7],
or the optimal dirty-paper coding [10]. The contribution of this
paper is to compare, for a simplified but informative scenario,
the relative spectral-efficiency and radiated energy-efficiency
and the computational burden of the two most prominent
forms of linear pre-coding: conjugate beamforming and zeroforcing.
Manuscript received January 16, 2012; revised May 31, 2012 and September 5, 2012. As this paper was co-authored by a guest editor of this issue,
the review of this manuscript was coordinated by Senior Editor Wayne Stark.
H. Yang and T. L. Marzetta are with Mathematics of Networks and
Communications Research Department, Bell Laborataries, Alcatel-Lucent,
Murray Hill, NJ 07974 USA (e-mail: [email protected] and
[email protected]).
Digital Object Identifier 10.1109/JSAC.2013.130206.
Linear pre-coding involves multiplying the K dimensional
vector of QAM symbols intended for the terminals by an
M by K pre-coding matrix to obtain the M dimensional
vector of signals that are actually transmitted. For conjugate
beamforming, the pre-coding matrix is proportional to the
conjugate of the estimated channel matrix, and for zero-forcing
the pre-coding matrix is proportional to the pseudo-inverse of
the estimated channel matrix. In the time-domain, conjugate
beamforming is called “time-reversal beamforming” and it
is equivalent to convolving each QAM sequence with its
respective conjugated and time-reversed impulse-response estimate, and summing over the K correlations. The reverse link
counterpart to conjugate beamforming is matched-filtering.
The scenario is a single cell containing an array of M
service-antennas and K autonomous single-antenna terminals.
The propagation comprises independent Rayleigh fading, and
a-priori nobody knows the instantiation of the channel. During
a specified coherence interval the terminals transmit orthogonal pilot sequences on the reverse link, each service antenna
estimates the reverse link channels to the terminals which, via
time-division duplex (TDD) reciprocity, are also estimates for
the forward-link channels, and during the remaining portion
of the coherence interval message-bearing symbols are transmitted by the service-antennas through a linear pre-coding
operation.
The signal that each terminal receives is interpreted as
the desired message-bearing signal, times a known gain, plus
uncorrelated additive effective noise. This provides rigorous
capacity lower-bounds which account for the noisy channel
estimates and the sub-optimality of the linear pre-coding. We
evaluate energy-efficiency relative to that of a conventional
single-antenna, forward-pilot scheme. In turn we obtain plots
of relative energy-efficiency versus net spectral efficiency,
such that for each point on the curve, the design parameters
(radiated power, number of terminals served, and duration
of reverse link pilots) are chosen optimally. For the two
linear pre-coding schemes we evaluate the computational
burden with respect to the LSAS-critical signal processing
steps, including fast Fourier transforms (FFT’s), correlation
of received pilot signals with pilot sequences, computing the
pre-coding matrix (for zero-forcing), and implementing the
pre-coding.
We find that for an operating point which yields high
spectral-efficiency and low energy-efficiency, zero-forcing outperforms conjugate beamforming, while the converse holds for
high energy-efficiency and low spectral-efficiency. When the
operating point is set such that the relative energy-efficiency
is equal to 1/M we typically obtain very large spectral-
c 2013 IEEE
0733-8716/13/$31.00 YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS
efficiency improvements over the reference scenario, and the
spectral-efficiency of conjugate beamforming is not significantly inferior to that of zero-forcing. The extra computations
of computing the pseudo-inverse for zero-forcing typically
do not dominate the total computational burden. In fact, for
an optimized LSAS system the total computational burden
of conjugate beamforming may exceed that of zero-forcing
because of the larger number of terminals that conjugate
beamforming typically serves.
For LSAS reverse link performance, we note that [2]
assesses the impact of the number of antennas on the matched
filter (MF) and minimum mean square error (MMSE) user
detections, while [6] derives qualitatively and quantitatively
different capacity lower bounds from the current paper, and
investigates the interplay between the number of service
antennas and the radiated power of the terminals.
The paper is organized as follows. Notations are introduced
in the next section. The LSAS model is formulated in Section
III. Explicit capacity lower bounds for conjugate beamforming
and zero-forcing pre-coders are presented in Sections IV. In
Section V we discuss the implications of our results to LSAS.
Numerical examples are given in Section VI. Computational
burden of implementing the conjugate beamforming and zeroforcing pre-coders are compared and the implications are
discussed in Section VII. In Section VIII we offer conclusions.
II. N OTATION
For easy reference and to avoid repetition, we list all the
mathematical notation and symbols in this section.
*: This superscript denotes complex conjugate transpose.
: This superscript denotes the optimal value.
ˆ : A hat above an English letter denotes an estimate.
˜ : A tilde above an English letter denotes the estimate error.
ρf : Normalized forward link power so that it is proportional to
the radiated power of the base station divided by the variance
of the noise.
ρr : Normalized reverse link power so that it is proportional
to the radiated power of the mobile terminal divided by the
variance of the noise.
τf : Number of symbols in a forward link pilot sequence.
τr : Number of symbols in a reverse link pilots sequence.
qt : The message-bearing QAM symbol, for a narrow-band
link, is indexed by time, and which, for a wide-band orthogonal frequency-division multiplexing (OFDM) link, is indexed
by both time and frequency. In general we shall suppress the
subscript indexing.
CN (0, 1): zero-mean, circularly-symmetric, unit-variance,
complex Gaussian distributions.
M : number of service antennas in the LSAS base station.
K: number of mobile receivers.
T : number of symbols in a slot over which the channel is
constant.
H: K by M channel response matrix with independent
CN (0, 1) entries.
hk : k-th row vector of H, k = 1, · · · , K.
h: CN (0, 1) channel response.
V : K by M channel estimate noise matrix with independent
CN (0, 1) entries.
173
w : K by 1 data channel noise vector with independent
CN (0, 1) entries.
w : CN (0, 1) data channel noise.
IK , IM : K-dimensional and M -dimensional identity matrices.
E(·): Mathematical expectation of a random variable.
We shall denote an LSAS as a quintuple (M, K, ρf , τr , ρr )
for notation convenience.
III. LSAS S CENARIO AND P ERFORMANCE M ETRICS
Our LSAS scenario, and the associated reference conventional forward pilot based system for defining the relative
performance metrics for the LSAS, are the same as that set
up in [11] for the conjugate beamforming pre-coder. We shall
only briefly recall here.
An M -antenna base station serves K single-antenna terminals with time-division duplex (TDD) operation. During a
slot (consisting of T symbols) the terminals first transmit τr
reverse link orthogonal pilot symbols from which the base
station estimates the propagation matrix. During the remainder
of the slot (consisting of T − τr symbols) the base station
transmits data to the terminals utilizing either a conjugatetranspose linear pre-coder or a zero-forcing linear pre-coder.
The propagation consists of independent Rayleigh fading. A
distinguishing featureof independent Rayleigh
fast fading
∗
is that ||hk h∗l ||E = E{hk h∗l (hk h∗l ) } = δk,l M 2 + M ,
where δk,l is the Kronecker delta, i.e., the expectation induced
norm of the correlation of a channel vector hk with itself
grows as M , while the expectation induced norm of the correlation
√ between channel vectors to different terminals grows
as M . As pointed out in [5], substantially the same is experienced if, for example, there is line-of-sight propagation and at
least half-wavelength spacing between antennas is maintained.
In particular, suppose that we have a linear array with 1/2
wavelength spacing, and that the terminals are distributed
randomly in sine-angle space, then the correlation
between
√
any two distinct channel vectors grows as M . Physically as
the antenna array grows, the angular spacing between any two
terminals is greater than the angular resolution of the array, so
multiplexing is feasible. Constraining inter-element spacing to
be at least 1/2 wavelength implies that the physical size of
the array grows with M , but also that mutual coupling does
not grow with M . An implication of this is that LSAS is not a
direct replacement for macro-cellular tower-top installations.
Entirely new deployment scenarios are needed. One example
is large billboard-size arrays for rural wireless fixed access.
For independent Rayleigh fading, propagation between the
M base station antennas and the K terminals is described by
a K × M matrix, H, where the entries of H are independent
CN (0, 1) random variables. The K terminals transmit orthogonal pilot sequences, each consists of τr symbols. The base
station correlates the received pilot signals with the conjugates
of the respective pilot sequences, which yields the following
K × M processed signal:
√
(1)
Y = τr ρr H + V,
where V is a K × M noise-matrix whose entries are independent, CN (0, 1), and ρr is a measure of the expected SNR
174
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013
of the reverse link channel. The minimum mean-square error
(MMSE) estimate for H is
√
τr ρr
Ĥ =
Y.
(2)
1 + τr ρr
The forward link data channel is given by
√
(3)
x = ρf Hs + w
where x is the K × 1 vector of signals received by the K
mobile terminals, s is the M × 1 input to the base station’s
antennas, and w is the K ×1 noise vector. Here we assume that
the entries of w are independent CN (0, 1) random variables.
The input signal is constrained to have total expected power
of one,
(4)
E {s∗ s} = 1.
Hence, if all of the power were fed into one antenna, the
expected SNR of the received signal would be equal to ρf .
A linear pre-coder is an M × K matrix Λ such that
s = Λq
where q is the K ×1 vector of message-bearing QAM symbols
intended for the K mobile terminals. In the following we
assume that the QAM sequences are uncorrelated with unit
variance,
E qq ∗ = IK .
(5)
The reference system is a conventional single-antenna system with forward link pilots. Its capacity lower bound is
calculated in [11] as
⎧
⎡
⎤⎫
⎨
⎬
2
ρf |ĥ|
⎦
Cconv ≥ E log2 ⎣1 +
ρ
⎩
⎭
1 + 1+τff ρf
1
1
1
· e γ · E1
=
,
(6)
ln 2
γ
∞ −xt
where E1 (x) = 1 e t dt is the exponential integral, and
τf ρ2
γ = 1+(1+τff )ρf is the expected effective SINR. For a given
value of ρf the optimum forward pilot training sequence
length
τf is found by maximizing the spectral efficiency
τ 1 − Tf · Cconv . Here we assume that the channel is constant
over T symbols and τf symbols are used for forward link
pilots in the conventional system.
For LSAS, as discussed in [11], each terminal is assigned
an orthogonal pilot sequence of duration τr symbols. The
maximum number of terminals that can be simultaneously
serviced is limited by the constraint
K ≤ τr ≤ T.
Thus for the slot consists of T symbols, τr symbols are for
reverse link pilots, and the remaining T − τr symbols are for
forward link data.
We define the spectral efficiency of the LSAS as (1 −
τr /T )CLSAS , where CLSAS is the capacity of the LSAS, and
the relative energy efficiency [(bits/Joule)/(bits/Joule)] with
respect to the energy efficiency of the reference system as
[(T − τr )CLSAS ] / [(T − τr )ρf,LSAS ]
ηrel =
[(T − τf )Cconv ] / [T ρf,conv]
CLSAS /ρf,LSAS
=
,
(7)
(1 − τf /T )Cconv /ρf,conv
where the factor (T − τr ) for ρf,LSAS and the factor T
for ρf,conv in the expression reflect the fact that forward
link transmission occupies only part of the full slot T in
the LSAS system since it does not transmit during reverse
link pilot transmission, but occupies the full slot T in the
conventional reference system for transmitting forward link
pilot and forward link data. Note that what we are comparing
for the two systems is the total number of bits transmitted
divided by the total forward link transmitted energy. An
advantage of computing relative energy-efficiency is that no
link-budget definition is required.
IV. C APACITY L OWER B OUNDS FOR C ONJUGATE AND
Z ERO - FORCING P RE - CODERS
In this section, we shall present capacity lower bounds for
the LSAS described in section III, with conjugate beamforming and with zero-forcing beamforming. We take a Bayesian
approach to obtaining capacity lower bounds. The terminal
knows neither the channels nor the channel estimates, but it
knows their statistics. In particular the terminal knows the
mean of the beamforming gain with which it receives its
intended QAM symbol. We re-write the expression for the
terminal’s received signal as a mean gain times the desired
QAM symbol, plus a composite term that constitutes effective
noise. The effective noise is uncorrelated with the desired
QAM symbol. Hence the overall channel is an additive-noise
channel with gain that is known to both the transmitter and the
receiver. The effective noise is non-Gaussian and its entropy
is upper-bounded by the entropy of Gaussian noise having the
same variance. Hence we can conveniently lower-bound the
capacity by letting the effective-noise be Gaussian, and the
expression is the classical log2 (1 + SNR) expression.
A capacity lower bound for conjugate beamforming has
been derived in [11], which we summarized in the following
Theorem 4.1: For an LSAS (M, K, ρf , τr , ρr ) with conjugate beamforming pre-coder, the effective SINR grows linearly
with the number of antennas M . Furthermore, its total capacity
ρf τr ρr
M
Csum,cj ≥ K log2 1 +
K (ρf + 1)(τr ρr + 1)
For conjugate beamforming, the effective SINR at each terminal is [11]
SINRcj =
ρf τr ρr
M
.
K (ρf + 1)(τr ρr + 1)
(8)
For Zero-forcing, the linear pre-coder Λ = c2 Ĥ ∗ (Ĥ Ĥ ∗ )−1 ,
where c2 is a constant scalar. Thus
s = c2 Ĥ ∗ (Ĥ Ĥ ∗ )−1 q
(9)
From (3) and (9), we have
−1
√
q+w
x = c2 ρf H Ĥ ∗ Ĥ Ĥ ∗
−1 √
√
∗
∗
= c2 ρf q + w − c2 ρf H̃ Ĥ Ĥ Ĥ
q
Thus the received signal for the k-th terminal is
−1 √
√
∗
∗
xk = c2 ρf qk + wk − c2 ρf h̃k Ĥ Ĥ Ĥ
q
(10)
YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS
Note that xk is decomposed into three uncorrelated terms
in the right hand side of equation (10). The first term is the
signal term. The second term represents the uncorrelated noise.
The third term accounts for the channel estimation error. In
comparison to the conjugate beamforming pre-coder [11], the
corresponding equation for conjugate beamforming pre-coder
has additional terms that account for the cross-talk due to nonorthogonal channel vectors and beamforming gain variation.
From (10) we see that the effective SINR for the k-th
terminal equals
SINRzf =
variance(1st term)
variance(2nd term) + variance(3rd term)
which can be evaluated as
SINRzf =
c22 ρf
c2 ρf (1 + τr ρr )
= 2
ρf
1 + τr ρr + ρf
1 + 1+τr ρr
(11)
by calculating each term individually. Note that in order to
obtain an explicit expression for the effective SINR, we need
to compute the constant scalar c2 . This can be carried out as
follows. With s given in equation (9), normalizing the power
to conform to the constraint (4) we can write
−1 ∗
c2 = E Trace Ĥ Ĥ
Since Ĥ =
τr ρr
1+τr ρr Z,
where Z ∼ CN (0, 1)K×M , we have
−1
−1 τ ρ
1 + τr ρr
r r
Ĥ Ĥ ∗
=
ZZ ∗
=
(ZZ ∗ )−1
1 + τr ρr
τr ρr
where ZZ ∗ is a central complex Wishart matrix. From [9],
we see
K
−1
=
E Trace (ZZ ∗ )
M −K
175
V. I MPLICATIONS
The simplicity of the capacity lower bounds in Theorem 4.1
and Theorem 4.2 allows us to arrive at a number of important
implications for LSAS. We shall summarize these implications
as corollaries in the following subsections. As mentioned
in section IV, we take a Bayesian approach to obtaining
capacity lower bounds, if one implemented a soft decoder,
then a similar Bayesian approach would be used to obtain
an expression for the a-posteriori probability density, and one
would typically approximate the probability density of the
effective noise by a Gaussian density. Since the effective noise
comprises a sum of many terms, the central limit theorem
guarantees that this is a good approximation. Consequently
our capacity lower-bound expressions are also fairly good
predictors of the performance of a soft decoder. Furthermore,
the approach to arriving at the capacity lower bounds is
consistent for both pre-coders, the tightness of the bound
should be similar. Without doing a full-blown simulation,
the current approach provides a reasonably credible way to
theoretically assess the performances of the pre-coders.
Let
ρf τr ρr
M
(14)
Bsum,cj = K log2 1 +
K (ρf + 1)(τr ρr + 1)
ρf τr ρr
M −K
(15)
Bsum,zf = K log2 1 +
K ρf + 1 + τr ρr
A. Asymptotic Performance
For the asymptotic performance, we have
Corollary 5.1: As the number of transmit antennas M increases without bound, the capacity performance of conjugate
beamforming pre-coder and zero-forcing beamforming precoder are asymptotically identical, i.e.,
Bsum,cj
=1
M→∞ Bsum,zf
lim
which implies
c2 =
τr ρr
M −K
·
1 + τr ρr
K
(12)
Finally, combining equations (11) and (12) we obtain
SINRzf =
ρf τr ρr
M −K
·
K
ρf + 1 + τr ρr
In fact, one can easily obtain the following interesting
lemma
Lemma 5.2: For any constants a1 , a2 , b1 > 0, b2 > 0,
lim
(13)
and a capacity lower bound for the aggregate of K terminals
ρf τr ρr
M −K
Csum, zf ≥ K log2 1 +
K ρf + 1 + τr ρr
We summarize the above results in the following
Theorem 4.2: For an LSAS (M, K, ρf , τr , ρr ) with zeroforcing beamforming pre-coder, the effective SINR grows
linearly with the number of transmit antennas minus the
number of served terminals, M − K. Furthermore, its total
capacity
ρf τr ρr
M −K
Csum, zf ≥ K log2 1 +
K ρf + 1 + τr ρr
We shall discuss the implications of Theorem 4.1 and Theorem
4.2 in LSAS performance in the following section.
x→∞
ln(a1 + b1 x)
=1
ln(a2 + b2 x)
Corollary 5.1 follows immediately from (14), (15) and Lemma
5.2.
It is interesting to note that
lim
M→∞
SINRzf
ρf τr ρr
=1+
>1
SINRcj
1 + ρf + τr ρr
Remark Note that the number of LSAS antennas M can grow
independently of ρf , ρr , τr and K, from equations (8) and
(13), we see that as M grows without bound, both SINRzf
and SINRcj can be made arbitrarily large.
B. General Performance Comparison
Comparing the capacity lower bounds of the conjugate
beamforming and zero-forcing beamforming pre-coders, we
have
176
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013
Corollary 5.3: For a given number of transmit antennas M
and a given number of served terminals K < M , Bsum,zf ≥
Bsum,cj if and only if SINRzf ≥ 1.
In fact, from equations (8) and (13) we see
SINRzf
K
(SINRzf − 1) .
=1+
SINRcj
M
Thus if SINRzf ≥ 1, then SINRzf ≥ SINRcj , which implies
Bsum,zf ≥ Bsum,cj . Conversely, if SINRzf < 1, then SINRzf <
SINRcj , which implies Bsum,zf < Bsum,cj .
Since SINR is an increasing function of forward link SNR
ρf , based on Corollary 5.3 we expect to see zero-forcing
beamforming pre-coder outperforming conjugate beamforming pre-coder in the high SNR region while conjugate beamforming pre-coder outperforming zero-forcing beamforming
pre-coder in the low SNR region. This is indeed the case
as it is shown in the examples in Section VI. In the high
SNR region, zero-forcing beamforming performs much better
than conjugate beamforming. In low SNR region, the two precoders perform similarly, with conjugate beamforming slightly
outperforms zero-forcing beamforming.
C. Optimal Number of Spatial Multiplexing Terminals
Given the same number of LSAS antennas M , the same
forward link SNR ρf and the same reverse link SNR ρr for
the two pre-coders, their spectral efficiencies, represented by
(1 − τr /T )Bsum,cj and (1 − τr /T )Bsum,zf , can be maximized
over the number of spatial multiplexing terminals K and the
number of symbols used for reverse link pilots τr simultaneously. Let Kcj and Kzf be the corresponding optimal number
of spatial multiplexing terminals, we shall give a comparison
of Kcj and Kzf in this subsection. The following corollary
gives a necessary condition for a pair (K, τr ) to be optimal in
the conjugate beamforming case. It states that for conjugate
beamforming pre-coder, the optimal number of terminals Kcj
must equal to the number of reverse link pilots used for
channel state estimation.
Corollary 5.4: For conjugate beamforming pre-coder, if
(Kcj , τr ) maximizes the spectral efficiency (1 − τr /T )Bsum,cj ,
then Kcj = τr .
Corollary 5.4 is obtained by observing that the factor (1 −
τr /T ) is independent of the number of served terminals K
and that Bsum,cj is an increasing function of K, and that K is
subject to the constraint K ≤ τr . Recall from Section III that
for a channel that is constant over T symbols, it is necessary
that τr ≤ T . Thus τr cannot grow without bound. In fact,
τr = T implies that all the resources are used for channel
estimation, leaving no resources for data transmission.
We remark that for zero-forcing beamforming pre-coder, the
Corollary 5.4 remains valid in most cases but does not hold
true in general since Bsum,zf may not be increasing with respect
to the number of served terminal K when ρf is small enough.
In general, in the high SNR region, i.e., when ρf >> 1,
we have SINRzf > SINRcj . In this case the optimal number
of terminals for the LSAS with zero-forcing beamforming
pre-coder to achieve maximum capacity is larger than that
for the LSAS with conjugate beamforming pre-coder. The
opposite is true in the low SNR region since SINRzf < SINRcj
when ρf << 1. As the number of transmit antennas M
increases and the forward SNR ρf decreases toward zero, we
have SINRzf ≈ SINRcj . In this case the optimal number
of terminals for the two beamforming pre-coders becomes
comparable.
D. Maximal Spatial Multiplexing Gain
A prominent advantage of the LSAS is the substantial
capacity gain due to spatial multiplexing. The maximal spatial
multiplexing gain is defined, for a given ρf , as the ratio of the
maximal spectral efficiency, which is achieved when there are
optimal number of terminals in the system, and the spectral
efficiency when there is only one terminal is the system. As
mentioned in the previous subsection, in the high SNR region,
i.e., when ρf >> 1, we have SINRzf > SINRcj . In this case
the maximal spatial multiplexing gain is larger for the LSAS
with zero-forcing beamforming pre-coder. The opposite is true
in the low SNR region since SINRzf < SINRcj when ρf << 1.
As the number of transmit antennas M increases and the
forward SNR ρf decreases toward zero, we have SINRzf ≈
SINRcj . In this case the maximal spatial multiplexing gains
for the two beamforming pre-coders becomes comparable. We
note that in general, more users do not necessary translate into
higher multiplexing gain. This is can be seen in the examples
in Section VI (Figures 3 and 4).
E. Single Mobile Terminal Case
A deeper understanding of the difference between conjugate pre-coder and zero-forcing pre-coder can be gained by
considering the case when there is only one terminal in the
system, i.e., when K = 1. In this case the conjugate and
zero-forcing beamforming vectors are the same, except for
their scalar multipliers. However, zero-forcing beamforming
pre-coder performs better except for the very low SNR (e.g.,
SNR = -20dB in our examples in the next section) case,
for which the performance are comparable between the two
beamforming pre-coders. The reason is as follows. The corresponding equation (10) for conjugate beamforming contains
two additional “interference terms” [11]:
(a) cross interference from other served mobiles
(b) variance of the beamforming gain
Zero-forcing beamforming forces these two terms to zero.
Thus these two terms are absent in equation (10). For K = 1,
there is no cross interference, (a) is also zero for conjugate
beamforming; (b) is the reason for the different performances
between conjugate beamforming and zero-forcing pre-coders.
On the one hand, due to different power normalization of
the two pre-coders, the beamforming gain for zero-forcing
is smaller by a factor (M − K)/M . On the other hand, the
variance of zero-forcing beamforming gain is zero while the
variance of conjugate beamforming gain [11] is
1 ρf τr ρr
K τr ρr + 1
Thus for large enough ρf , the advantage of having zero
variance in beamforming gain more than compensates the
beamforming gain reduction factor (M − K)/M for zeroforcing, which can make zero-forcing beamforming perform
better. But for very small ρf , the variance of conjugate
YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS
Fig. 1.
Trade-off between relative energy-efficiency and net spectralefficiency for conjugate and zero-forcing beamforming (M = 100)
177
Fig. 2.
Trade-off between relative energy-efficiency and net spectralefficiency for conjugate and zero-forcing beamforming (M = 400)
beamforming gain is close to zero, having zero variance
in beamforming gain does not gain much for zero-forcing
precoder, and the reduction factor (M − K)/M for zeroforcing beamforming can actually make it perform worse than
conjugate beamforming.
VI. E XAMPLES
Using the capacity bounds obtained above, we construct two
examples which illustrate the performance of conjugate beamforming and zero-forcing beamforming pre-coders through
trade-off curves that constitute relative energy-efficiency expressed in (bits/Joule)/(bits/Joule) versus net spectral efficiency (bits/second/Hz). The first example (Figure 1) is for
the number of base station transmit antennas M = 100 and
the second example (Figure 2) is for M = 400.
The reverse link expected SNR is ρr = 1.0 (0 dB) and
the equivalent slot-duration is T = 196 symbols. The choice
of T = 196 = 14 × 14 is based on typical LTE OFDM
parameters and a very short coherence interval comprising 14
OFDM symbols occupying a total of 1 millisecond where it is
assumed that the channels are substantially constant over 14
consecutive 15 KHz subcarriers. Time-frequency pilots have
to be transmitted within each 14-tone sub-band, and there is
room for at most 196 orthogonal pilot sequences.
The reference scenario is a conventional single-antenna
system with forward link pilot and the expected forward link
SNR is assumed to be ρf = 10.0 (10 dB). The spectral
efficiency of the reference system is calculated to be 2.6474
bits/second/Hz, which is achieve when τf = 9 [11].
On each curve in Figures 1, 2, 3 and 4, each marker from
left to right corresponds to the total forward power level ρf
equals 0.01, 0.1, 1, 10, 100 and 1000 respectively, where a
power level of 10 implies that the total radiated power is
identical to that in the reference scenario. To achieve 1000 fold
relative energy efficiency, with number of transmit antennas M
= 400, we see that for an LSAS with conjugate beamforming
pre-coder, the required ρf = 0.276 and the corresponding
maximal spectral efficiency of 53.35 bits/s/Hz is achieved with
Fig. 3.
Optimal number of terminals K vs. total radiated power
53 spatial multiplexing users. For an LSAS with zero-forcing
beamforming pre-coder, the required ρf = 0.303 and the
corresponding maximal spectral efficiency of 60.23 bits/s/Hz
is achieved with 49 spatial multiplexing users.
Figures 3 shows the optimal number of terminals for each
beamforming pre-coder to achieve maximal spectral efficiency
at each forward power level for M = 100 and M = 400.
As expected, in the high SNR region, the optimal number
of terminals for the LSAS with zero-forcing beamforming
pre-coder is larger than that for the LSAS with conjugate
beamforming pre-coder. The opposite is true in the low SNR
region.
Figures 4 shows the maximum spatial multiplexing gain for
each beamforming pre-coder at each forward power level for
M = 100 and M = 400. As expected, the spatial multiplexing
gain is larger for the LSAS with zero-forcing beamforming
pre-coder in the high SNR region while the opposite is true
in the low SNR region.
178
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013
TABLE I
C OMPUTATION BURDEN COMPARISON FOR M = 100 AND ρf = 1
M = 100, ρf = 1
Bandwidth (MHz)
20
10
Conjugate Total (Gflops)
350.0
171.3
Zero-forcing Total (Gflops)
307.1
149.8
5
83.8
73.0
K
46
30
τr
46
30
TABLE II
C OMPUTATION BURDEN COMPARISON FOR M = 400 AND ρf = 1
M = 400, ρf = 1
Bandwidth (MHz)
20
10
Conjugate Total (Gflops)
411.1
201.8
Zero-forcing Total (Gflops)
562.3
277.4
Fig. 4.
Spatial multiplexing gain vs. total radiated power
•
VII. C OMPUTATIONAL B URDEN OF C ONJUGATE AND
Z ERO - FORCING P RE - CODERS
In addition to the ”over-the-air” capacity and radiated
energy efficiency performance studied in the previous sections,
another important energy efficiency performance indicator
for a pre-coder is its internal power consumption. Some
internal activities have a power consumption that is more
or less independent of M and K. Others, such as LSAScritical computations are strongly dependent on M and K.
The internal power consumption obviously depends on the
computation burden, measured in Gflops (giga floating point
operations per second), of the pre-coders, and the energyefficiency of computing measured in Gflops/Watt.
In this section, we shall compare the computation burden
of the conjugate beamforming pre-coder and zero-forcing precoder. For this purpose we introduce the following mathematical symbols, in addition to the notation listed in Section II
Tsl : slot time, i.e., the time duration a slot occupies.
B: Carrier bandwidth
Ts : symbol duration
Tu : usable symbol duration
Td : OFDM guard time,
where B is measured in Hz, and the times are measured in
seconds. Note that with the definitions above, the slot symbolduration is T = TTsls · TTud , where TTsls is the number of OFDM
symbols in the slot, and under the assumption that the channel
delay-spread is equal to the guard interval, TTud is equal to
the Nyquist-sampling frequency interval in units of tones.
With OFDM modulation, within a slot time Tsl , the following
LSAS-critical computations need to be carried out for every
beamforming pre-coder. Counting in terms of floating point
operations, we have
• FFT’s/IFFT’s:
Tsl
M·
· (Tu B) · log2 (Tu B)
Ts
•
here TTsls is the number of symbols in Tsl and Tu B is the
number of subcarriers in the total bandwidth B.
Correlation of pilot signals with pilot sequences:
(a) Direct correlation: M · (Td B) · K · τr
5
99.0
136.8
K
63
63
τr
63
63
(b) Fast correlation: M · (Td B) · τr · log2 (τr )
here Td B is the number of constant frequency blocks in
the total bandwidth B.
Implementation of pre-coding/de-coding:
M · K · (T − τr ) · (Td B)
Conjugate beamforming does not require any other computation while zero-forcing beamforming requires an additional
pseudo-inverse, which requires M · K 2 · (Td B) floating point
operations.
Assuming that the forward SNR ρf = 1, with the corresponding optimal number of terminals K and optimal
number of reverse link pilot τr derived in the previous
section, we summarize the computation burdens for conjugate
beamforming pre-coder and zero-forcing beamforming precoder in Tables I & II.
Table I, which is for M = 100, shows that zero-forcing
requires roughly 12% to 13% less computation than conjugate
beamforming.
Table II, which is for M = 400, shows that zero-forcing requires roughly 37% to 38% more computation than conjugate
beamforming.
The results are surprising in two ways. While it is obvious
that for an LSAS, with the same number of served terminals K, zero-forcing beamforming pre-coder requires more
computations than conjugate beamforming pre-coder, albeit
that the increase is surprisingly small, but at the maximum
achievable capacity, zero-forcing beamforming pre-coder actually requires less computation since it serves fewer terminals
than conjugate beamforming pre-coder.
The above calculations include only pre-coder related computations. We note that the above consideration on the internal
energy efficiency of LSAS is by no means complete. For
example, energy consumed in error-correction coding and
de-coding, A/D and D/A converters, pre-amplifiers, mixers,
oscillators, etc., should also be considered.
VIII. C ONCLUSIONS AND D ISCUSSION
Both conjugate beamforming and zero-forcing are viable
linear pre-coding schemes for LSAS as exemplified by tradeoff curves between energy-efficiency and spectral-efficiency.
Where the highest spectral-efficiency is desired, zero-forcing
can perform appreciably better than conjugate beamforming.
If, however, a more energy-efficient operating point is chosen
YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS
then the performance gap narrows, and eventually conjugate
beamforming outperforms zero-forcing.
Our quantification of the computational burden of LSAS
yields some surprises. The extra computations required to
obtain the channel pseudo-inverse for zero-forcing do not add
greatly to the total burden. In the regime of high energyefficiency conjugate beamforming can have a greater total
computational burden than zero-forcing because of the larger
number of terminals served.
Irrespective of computational considerations, conjugate
beamforming may be preferable to zero-forcing because of its
greater robustness, and that it lends itself to a de-centralized
architecture and de-centralized signal processing whereby every service antenna keeps its channel estimates to itself, and
performs its own signal processing independently. Service antennas can be added or deleted without changing the function
of the system; indeed the central unit which is running the
system almost does not have to know the number of service
antennas employed.
Our simplified scenario ignores range-dependent attenuation
as well as shadow-fading, so there are no near-far effects
to contend with. Further investigations that include near-far
effects are warranted (in addition to multi-cell scenarios) but
because near-far effects will tend to increase the conditionnumber of the propagation matrix, it is possible that the
simplified scenario shows zero-forcing in its best light.
ACKNOWLEDGMENT
The authors would like to thank anonymous referees for
their very constructive comments and corrections of errors.
R EFERENCES
[1] B. Hochwald, C. Peel, and A.L. Swindlehurst, “A Vector-perturbation
technique for near-capacity multi-antenna multi-user communication
part II: perturbation”, IEEE Transactions on Communications, March,
2005.
[2] J. Hoydis, S. ten Brink and M. Debbah, “Massive MIMO: How many
antennas do we need?”, arXiv:1107.1709v2 [cs.IT] 23 Sep 2011.
[3] H. Huh, G. Caire, H. C. Papadopoulos and S. A. Ramprashad, “Achieving ‘massive MIMO’ spectral efficiency with a not-so-large number of
antennas”, arXiv:1107.3862v2 [cs.IT] 13 Sep 2011.
[4] T. L. Marzetta, “How much training is required for multiuser MIMO?”,
in Fortieth Asilomar Conference on Signals, Systems and Computers,
Pacific Grove, CA, pp. 359-363, Oct. 2006.
[5] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas”, IEEE Trans. Wireless Commun., vol. 9,
no. 11, pp. 3590-3600, Nov. 2010.
[6] H. Q. Ngo, E. G. Larsson and T. L. Marzetta, “Energy and Spectral
Efficiency of Very Large Multiuser MIMO Systems”, submitted to IEEE
Transactions on Communications, arXiv:1112.3810v2 [cs.IT] 21 May
2012.
[7] C. Peel, B. Hochwald, and A.L. Swindlehurst, “A Vector-Perturbation
Technique for Near-Capacity Multi-Antenna Multi-User Communication
Part I: Channel Inversion and Regularization”, IEEE Transactions on
Communications, January, 2005.
179
[8] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors
and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with
very large arrays”, IEEE Signal Processing Magazine, to appear 2012
[9] A. M. Tulino and S. Verdu, Random Matrix Theory and Wireless
Communications, Now Publishers Inc., 2004.
[10] S. Vishwanath, N. Jindal, and A. Goldsmith, “Duality, achievable rates,
and sum-rate capacity of Gaussian MIMO broadcast channels”, IEEE
Trans. on Information Theory, Volume 49, Issue 10, pp. 2658-2668, Oct.
2003.
[11] H. Yang and T. L. Marzetta, “Quantized array processing gain in largescale antenna systems”, Bell Labs technical report, September 2011.
Hong Yang is a member of technical staff in
the Alcatel-Lucent Bell Labs Mathematics of Networks and Communications Research Department
in Murray Hill, New Jersey. Prior to that, he was
with Alcatel-Lucent’s Radio Frequency Technology
Systems Engineering Department, also in Murray
Hill. He received his Ph.D. in applied mathematics
from Princeton University, Princeton, New Jersey.
Dr. Yang has worked extensively on LTE/CDMA
wireless network air interface performance, including coverage link budget and voice/data capacity. He
also worked several years developing algorithms for RF design tools. Prior
to his tenure at Alcatel-Lucent, he worked in academia and for a start-up
company. He has published research papers in well known electrical engineering, applied mathematics, and financial economics journals and conference
proceedings. His current research interests include energy efficient wireless
network design optimization and energy-coverage-capacity performance of
Large-Scale Antenna Systems (LSAS).
Thomas L. Marzetta was born in Washington,
D.C. He received the PhD in electrical engineering from the Massachusetts Institute of Technology
in 1978. His dissertation extended, to two dimensions, the three-way equivalence of autocorrelation
sequences, minimum-phase prediction error filters,
and reflection coefficient sequences. He worked for
Schlumberger-Doll Research (1978 - 1987) to modernize geophysical signal processing for petroleum
exploration. He headed a group at Nichols Research
Corporation (1987 - 1995) which improved automatic target recognition, radar signal processing, and video motion detection. He joined Bell Laboratories in 1995 (formerly part of AT&T, then
Lucent Technologies, now Alcatel-Lucent). He has had research supervisory
responsibilities in communication theory, statistics, and signal processing. He
specializes in multiple-antenna wireless, with a particular emphasis on the
acquisition and exploitation of channel-state information. He is the originator
of Large-Scale Antenna Systems which can provide huge improvements in
wireless energy efficiency and spectral efficiency.
Dr. Marzetta was a member of the IEEE Signal Processing Society
Technical Committee on Multidimensional Signal Processing, a member of
the Sensor Array and Multichannel Technical Committee, an associate editor
for the IEEE Transactions on Signal Processing, an associate editor for
the IEEE Transactions on Image Processing, and a guest associate editor
for the IEEE Transactions on Information Theory Special Issue on Signal
Processing Techniques for Space-Time Coded Transmissions (Oct. 2002) and
for the IEEE Transactions on Information Theory Special Issue on SpaceTime Transmission, Reception, Coding, and Signal Design (Oct. 2003).
Dr. Marzetta was the recipient of the 1981 ASSP Paper Award from the
IEEE Signal Processing Society. He was elected a Fellow of the IEEE in Jan.
2003.