172 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013 Performance of Conjugate and Zero-Forcing Beamforming in Large-Scale Antenna Systems Hong Yang and Thomas L. Marzetta Abstract—Large-Scale Antenna Systems (LSAS) is a form of multi-user MIMO technology in which unprecedented numbers of antennas serve a significantly smaller number of autonomous terminals. We compare the two most prominent linear precoders, conjugate beamforming and zero-forcing, with respect to net spectral-efficiency and radiated energy-efficiency in a simplified single-cell scenario where propagation is governed by independent Rayleigh fading, and where channel-state information (CSI) acquisition and data transmission are both performed during a short coherence interval. An effective-noise analysis of the pre-coded forward channel yields explicit lower bounds on net capacity which account for CSI acquisition overhead and errors as well as the sub-optimality of the pre-coders. In turn the bounds generate trade-off curves between radiated energy-efficiency and net spectral-efficiency. For high spectralefficiency and low energy-efficiency zero-forcing outperforms conjugate beamforming, while at low spectral-efficiency and high energy-efficiency the opposite holds. Surprisingly, in an optimized system, the total LSAS-critical computational burden of conjugate beamforming may be greater than that of zeroforcing. Conjugate beamforming may still be preferable to zeroforcing because of its greater robustness, and because conjugate beamforming lends itself to a de-centralized architecture and de-centralized signal processing. Index Terms—Large-scale antenna system, capacity, energy efficiency, spectral efficiency, spatial multiplexing, beamforming, pre-coder, computational burden I. I NTRODUCTION L ARGE-SCALE Antenna Systems (LSAS) bring huge spectral-efficiency and radiated energy-efficiency gains compared with 4G wireless technologies [4], [5], [6], [8], [11]. On the forward link a critical operation is pre-coding, i.e. the mapping of the message-bearing symbols intended for K terminals into the M (typically M >> K) signals which the service-antennas transmit. One advantage of LSAS is that the preponderance of service-antennas over terminals can make linear pre-coding perform nearly as well as nonlinear precoding as exemplified by sphere-encoded modulation [1], [7], or the optimal dirty-paper coding [10]. The contribution of this paper is to compare, for a simplified but informative scenario, the relative spectral-efficiency and radiated energy-efficiency and the computational burden of the two most prominent forms of linear pre-coding: conjugate beamforming and zeroforcing. Manuscript received January 16, 2012; revised May 31, 2012 and September 5, 2012. As this paper was co-authored by a guest editor of this issue, the review of this manuscript was coordinated by Senior Editor Wayne Stark. H. Yang and T. L. Marzetta are with Mathematics of Networks and Communications Research Department, Bell Laborataries, Alcatel-Lucent, Murray Hill, NJ 07974 USA (e-mail: [email protected] and [email protected]). Digital Object Identifier 10.1109/JSAC.2013.130206. Linear pre-coding involves multiplying the K dimensional vector of QAM symbols intended for the terminals by an M by K pre-coding matrix to obtain the M dimensional vector of signals that are actually transmitted. For conjugate beamforming, the pre-coding matrix is proportional to the conjugate of the estimated channel matrix, and for zero-forcing the pre-coding matrix is proportional to the pseudo-inverse of the estimated channel matrix. In the time-domain, conjugate beamforming is called “time-reversal beamforming” and it is equivalent to convolving each QAM sequence with its respective conjugated and time-reversed impulse-response estimate, and summing over the K correlations. The reverse link counterpart to conjugate beamforming is matched-filtering. The scenario is a single cell containing an array of M service-antennas and K autonomous single-antenna terminals. The propagation comprises independent Rayleigh fading, and a-priori nobody knows the instantiation of the channel. During a specified coherence interval the terminals transmit orthogonal pilot sequences on the reverse link, each service antenna estimates the reverse link channels to the terminals which, via time-division duplex (TDD) reciprocity, are also estimates for the forward-link channels, and during the remaining portion of the coherence interval message-bearing symbols are transmitted by the service-antennas through a linear pre-coding operation. The signal that each terminal receives is interpreted as the desired message-bearing signal, times a known gain, plus uncorrelated additive effective noise. This provides rigorous capacity lower-bounds which account for the noisy channel estimates and the sub-optimality of the linear pre-coding. We evaluate energy-efficiency relative to that of a conventional single-antenna, forward-pilot scheme. In turn we obtain plots of relative energy-efficiency versus net spectral efficiency, such that for each point on the curve, the design parameters (radiated power, number of terminals served, and duration of reverse link pilots) are chosen optimally. For the two linear pre-coding schemes we evaluate the computational burden with respect to the LSAS-critical signal processing steps, including fast Fourier transforms (FFT’s), correlation of received pilot signals with pilot sequences, computing the pre-coding matrix (for zero-forcing), and implementing the pre-coding. We find that for an operating point which yields high spectral-efficiency and low energy-efficiency, zero-forcing outperforms conjugate beamforming, while the converse holds for high energy-efficiency and low spectral-efficiency. When the operating point is set such that the relative energy-efficiency is equal to 1/M we typically obtain very large spectral- c 2013 IEEE 0733-8716/13/$31.00 YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS efficiency improvements over the reference scenario, and the spectral-efficiency of conjugate beamforming is not significantly inferior to that of zero-forcing. The extra computations of computing the pseudo-inverse for zero-forcing typically do not dominate the total computational burden. In fact, for an optimized LSAS system the total computational burden of conjugate beamforming may exceed that of zero-forcing because of the larger number of terminals that conjugate beamforming typically serves. For LSAS reverse link performance, we note that [2] assesses the impact of the number of antennas on the matched filter (MF) and minimum mean square error (MMSE) user detections, while [6] derives qualitatively and quantitatively different capacity lower bounds from the current paper, and investigates the interplay between the number of service antennas and the radiated power of the terminals. The paper is organized as follows. Notations are introduced in the next section. The LSAS model is formulated in Section III. Explicit capacity lower bounds for conjugate beamforming and zero-forcing pre-coders are presented in Sections IV. In Section V we discuss the implications of our results to LSAS. Numerical examples are given in Section VI. Computational burden of implementing the conjugate beamforming and zeroforcing pre-coders are compared and the implications are discussed in Section VII. In Section VIII we offer conclusions. II. N OTATION For easy reference and to avoid repetition, we list all the mathematical notation and symbols in this section. *: This superscript denotes complex conjugate transpose. : This superscript denotes the optimal value. ˆ : A hat above an English letter denotes an estimate. ˜ : A tilde above an English letter denotes the estimate error. ρf : Normalized forward link power so that it is proportional to the radiated power of the base station divided by the variance of the noise. ρr : Normalized reverse link power so that it is proportional to the radiated power of the mobile terminal divided by the variance of the noise. τf : Number of symbols in a forward link pilot sequence. τr : Number of symbols in a reverse link pilots sequence. qt : The message-bearing QAM symbol, for a narrow-band link, is indexed by time, and which, for a wide-band orthogonal frequency-division multiplexing (OFDM) link, is indexed by both time and frequency. In general we shall suppress the subscript indexing. CN (0, 1): zero-mean, circularly-symmetric, unit-variance, complex Gaussian distributions. M : number of service antennas in the LSAS base station. K: number of mobile receivers. T : number of symbols in a slot over which the channel is constant. H: K by M channel response matrix with independent CN (0, 1) entries. hk : k-th row vector of H, k = 1, · · · , K. h: CN (0, 1) channel response. V : K by M channel estimate noise matrix with independent CN (0, 1) entries. 173 w : K by 1 data channel noise vector with independent CN (0, 1) entries. w : CN (0, 1) data channel noise. IK , IM : K-dimensional and M -dimensional identity matrices. E(·): Mathematical expectation of a random variable. We shall denote an LSAS as a quintuple (M, K, ρf , τr , ρr ) for notation convenience. III. LSAS S CENARIO AND P ERFORMANCE M ETRICS Our LSAS scenario, and the associated reference conventional forward pilot based system for defining the relative performance metrics for the LSAS, are the same as that set up in [11] for the conjugate beamforming pre-coder. We shall only briefly recall here. An M -antenna base station serves K single-antenna terminals with time-division duplex (TDD) operation. During a slot (consisting of T symbols) the terminals first transmit τr reverse link orthogonal pilot symbols from which the base station estimates the propagation matrix. During the remainder of the slot (consisting of T − τr symbols) the base station transmits data to the terminals utilizing either a conjugatetranspose linear pre-coder or a zero-forcing linear pre-coder. The propagation consists of independent Rayleigh fading. A distinguishing featureof independent Rayleigh fast fading ∗ is that ||hk h∗l ||E = E{hk h∗l (hk h∗l ) } = δk,l M 2 + M , where δk,l is the Kronecker delta, i.e., the expectation induced norm of the correlation of a channel vector hk with itself grows as M , while the expectation induced norm of the correlation √ between channel vectors to different terminals grows as M . As pointed out in [5], substantially the same is experienced if, for example, there is line-of-sight propagation and at least half-wavelength spacing between antennas is maintained. In particular, suppose that we have a linear array with 1/2 wavelength spacing, and that the terminals are distributed randomly in sine-angle space, then the correlation between √ any two distinct channel vectors grows as M . Physically as the antenna array grows, the angular spacing between any two terminals is greater than the angular resolution of the array, so multiplexing is feasible. Constraining inter-element spacing to be at least 1/2 wavelength implies that the physical size of the array grows with M , but also that mutual coupling does not grow with M . An implication of this is that LSAS is not a direct replacement for macro-cellular tower-top installations. Entirely new deployment scenarios are needed. One example is large billboard-size arrays for rural wireless fixed access. For independent Rayleigh fading, propagation between the M base station antennas and the K terminals is described by a K × M matrix, H, where the entries of H are independent CN (0, 1) random variables. The K terminals transmit orthogonal pilot sequences, each consists of τr symbols. The base station correlates the received pilot signals with the conjugates of the respective pilot sequences, which yields the following K × M processed signal: √ (1) Y = τr ρr H + V, where V is a K × M noise-matrix whose entries are independent, CN (0, 1), and ρr is a measure of the expected SNR 174 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013 of the reverse link channel. The minimum mean-square error (MMSE) estimate for H is √ τr ρr Ĥ = Y. (2) 1 + τr ρr The forward link data channel is given by √ (3) x = ρf Hs + w where x is the K × 1 vector of signals received by the K mobile terminals, s is the M × 1 input to the base station’s antennas, and w is the K ×1 noise vector. Here we assume that the entries of w are independent CN (0, 1) random variables. The input signal is constrained to have total expected power of one, (4) E {s∗ s} = 1. Hence, if all of the power were fed into one antenna, the expected SNR of the received signal would be equal to ρf . A linear pre-coder is an M × K matrix Λ such that s = Λq where q is the K ×1 vector of message-bearing QAM symbols intended for the K mobile terminals. In the following we assume that the QAM sequences are uncorrelated with unit variance, E qq ∗ = IK . (5) The reference system is a conventional single-antenna system with forward link pilots. Its capacity lower bound is calculated in [11] as ⎧ ⎡ ⎤⎫ ⎨ ⎬ 2 ρf |ĥ| ⎦ Cconv ≥ E log2 ⎣1 + ρ ⎩ ⎭ 1 + 1+τff ρf 1 1 1 · e γ · E1 = , (6) ln 2 γ ∞ −xt where E1 (x) = 1 e t dt is the exponential integral, and τf ρ2 γ = 1+(1+τff )ρf is the expected effective SINR. For a given value of ρf the optimum forward pilot training sequence length τf is found by maximizing the spectral efficiency τ 1 − Tf · Cconv . Here we assume that the channel is constant over T symbols and τf symbols are used for forward link pilots in the conventional system. For LSAS, as discussed in [11], each terminal is assigned an orthogonal pilot sequence of duration τr symbols. The maximum number of terminals that can be simultaneously serviced is limited by the constraint K ≤ τr ≤ T. Thus for the slot consists of T symbols, τr symbols are for reverse link pilots, and the remaining T − τr symbols are for forward link data. We define the spectral efficiency of the LSAS as (1 − τr /T )CLSAS , where CLSAS is the capacity of the LSAS, and the relative energy efficiency [(bits/Joule)/(bits/Joule)] with respect to the energy efficiency of the reference system as [(T − τr )CLSAS ] / [(T − τr )ρf,LSAS ] ηrel = [(T − τf )Cconv ] / [T ρf,conv] CLSAS /ρf,LSAS = , (7) (1 − τf /T )Cconv /ρf,conv where the factor (T − τr ) for ρf,LSAS and the factor T for ρf,conv in the expression reflect the fact that forward link transmission occupies only part of the full slot T in the LSAS system since it does not transmit during reverse link pilot transmission, but occupies the full slot T in the conventional reference system for transmitting forward link pilot and forward link data. Note that what we are comparing for the two systems is the total number of bits transmitted divided by the total forward link transmitted energy. An advantage of computing relative energy-efficiency is that no link-budget definition is required. IV. C APACITY L OWER B OUNDS FOR C ONJUGATE AND Z ERO - FORCING P RE - CODERS In this section, we shall present capacity lower bounds for the LSAS described in section III, with conjugate beamforming and with zero-forcing beamforming. We take a Bayesian approach to obtaining capacity lower bounds. The terminal knows neither the channels nor the channel estimates, but it knows their statistics. In particular the terminal knows the mean of the beamforming gain with which it receives its intended QAM symbol. We re-write the expression for the terminal’s received signal as a mean gain times the desired QAM symbol, plus a composite term that constitutes effective noise. The effective noise is uncorrelated with the desired QAM symbol. Hence the overall channel is an additive-noise channel with gain that is known to both the transmitter and the receiver. The effective noise is non-Gaussian and its entropy is upper-bounded by the entropy of Gaussian noise having the same variance. Hence we can conveniently lower-bound the capacity by letting the effective-noise be Gaussian, and the expression is the classical log2 (1 + SNR) expression. A capacity lower bound for conjugate beamforming has been derived in [11], which we summarized in the following Theorem 4.1: For an LSAS (M, K, ρf , τr , ρr ) with conjugate beamforming pre-coder, the effective SINR grows linearly with the number of antennas M . Furthermore, its total capacity ρf τr ρr M Csum,cj ≥ K log2 1 + K (ρf + 1)(τr ρr + 1) For conjugate beamforming, the effective SINR at each terminal is [11] SINRcj = ρf τr ρr M . K (ρf + 1)(τr ρr + 1) (8) For Zero-forcing, the linear pre-coder Λ = c2 Ĥ ∗ (Ĥ Ĥ ∗ )−1 , where c2 is a constant scalar. Thus s = c2 Ĥ ∗ (Ĥ Ĥ ∗ )−1 q (9) From (3) and (9), we have −1 √ q+w x = c2 ρf H Ĥ ∗ Ĥ Ĥ ∗ −1 √ √ ∗ ∗ = c2 ρf q + w − c2 ρf H̃ Ĥ Ĥ Ĥ q Thus the received signal for the k-th terminal is −1 √ √ ∗ ∗ xk = c2 ρf qk + wk − c2 ρf h̃k Ĥ Ĥ Ĥ q (10) YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS Note that xk is decomposed into three uncorrelated terms in the right hand side of equation (10). The first term is the signal term. The second term represents the uncorrelated noise. The third term accounts for the channel estimation error. In comparison to the conjugate beamforming pre-coder [11], the corresponding equation for conjugate beamforming pre-coder has additional terms that account for the cross-talk due to nonorthogonal channel vectors and beamforming gain variation. From (10) we see that the effective SINR for the k-th terminal equals SINRzf = variance(1st term) variance(2nd term) + variance(3rd term) which can be evaluated as SINRzf = c22 ρf c2 ρf (1 + τr ρr ) = 2 ρf 1 + τr ρr + ρf 1 + 1+τr ρr (11) by calculating each term individually. Note that in order to obtain an explicit expression for the effective SINR, we need to compute the constant scalar c2 . This can be carried out as follows. With s given in equation (9), normalizing the power to conform to the constraint (4) we can write −1 ∗ c2 = E Trace Ĥ Ĥ Since Ĥ = τr ρr 1+τr ρr Z, where Z ∼ CN (0, 1)K×M , we have −1 −1 τ ρ 1 + τr ρr r r Ĥ Ĥ ∗ = ZZ ∗ = (ZZ ∗ )−1 1 + τr ρr τr ρr where ZZ ∗ is a central complex Wishart matrix. From [9], we see K −1 = E Trace (ZZ ∗ ) M −K 175 V. I MPLICATIONS The simplicity of the capacity lower bounds in Theorem 4.1 and Theorem 4.2 allows us to arrive at a number of important implications for LSAS. We shall summarize these implications as corollaries in the following subsections. As mentioned in section IV, we take a Bayesian approach to obtaining capacity lower bounds, if one implemented a soft decoder, then a similar Bayesian approach would be used to obtain an expression for the a-posteriori probability density, and one would typically approximate the probability density of the effective noise by a Gaussian density. Since the effective noise comprises a sum of many terms, the central limit theorem guarantees that this is a good approximation. Consequently our capacity lower-bound expressions are also fairly good predictors of the performance of a soft decoder. Furthermore, the approach to arriving at the capacity lower bounds is consistent for both pre-coders, the tightness of the bound should be similar. Without doing a full-blown simulation, the current approach provides a reasonably credible way to theoretically assess the performances of the pre-coders. Let ρf τr ρr M (14) Bsum,cj = K log2 1 + K (ρf + 1)(τr ρr + 1) ρf τr ρr M −K (15) Bsum,zf = K log2 1 + K ρf + 1 + τr ρr A. Asymptotic Performance For the asymptotic performance, we have Corollary 5.1: As the number of transmit antennas M increases without bound, the capacity performance of conjugate beamforming pre-coder and zero-forcing beamforming precoder are asymptotically identical, i.e., Bsum,cj =1 M→∞ Bsum,zf lim which implies c2 = τr ρr M −K · 1 + τr ρr K (12) Finally, combining equations (11) and (12) we obtain SINRzf = ρf τr ρr M −K · K ρf + 1 + τr ρr In fact, one can easily obtain the following interesting lemma Lemma 5.2: For any constants a1 , a2 , b1 > 0, b2 > 0, lim (13) and a capacity lower bound for the aggregate of K terminals ρf τr ρr M −K Csum, zf ≥ K log2 1 + K ρf + 1 + τr ρr We summarize the above results in the following Theorem 4.2: For an LSAS (M, K, ρf , τr , ρr ) with zeroforcing beamforming pre-coder, the effective SINR grows linearly with the number of transmit antennas minus the number of served terminals, M − K. Furthermore, its total capacity ρf τr ρr M −K Csum, zf ≥ K log2 1 + K ρf + 1 + τr ρr We shall discuss the implications of Theorem 4.1 and Theorem 4.2 in LSAS performance in the following section. x→∞ ln(a1 + b1 x) =1 ln(a2 + b2 x) Corollary 5.1 follows immediately from (14), (15) and Lemma 5.2. It is interesting to note that lim M→∞ SINRzf ρf τr ρr =1+ >1 SINRcj 1 + ρf + τr ρr Remark Note that the number of LSAS antennas M can grow independently of ρf , ρr , τr and K, from equations (8) and (13), we see that as M grows without bound, both SINRzf and SINRcj can be made arbitrarily large. B. General Performance Comparison Comparing the capacity lower bounds of the conjugate beamforming and zero-forcing beamforming pre-coders, we have 176 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013 Corollary 5.3: For a given number of transmit antennas M and a given number of served terminals K < M , Bsum,zf ≥ Bsum,cj if and only if SINRzf ≥ 1. In fact, from equations (8) and (13) we see SINRzf K (SINRzf − 1) . =1+ SINRcj M Thus if SINRzf ≥ 1, then SINRzf ≥ SINRcj , which implies Bsum,zf ≥ Bsum,cj . Conversely, if SINRzf < 1, then SINRzf < SINRcj , which implies Bsum,zf < Bsum,cj . Since SINR is an increasing function of forward link SNR ρf , based on Corollary 5.3 we expect to see zero-forcing beamforming pre-coder outperforming conjugate beamforming pre-coder in the high SNR region while conjugate beamforming pre-coder outperforming zero-forcing beamforming pre-coder in the low SNR region. This is indeed the case as it is shown in the examples in Section VI. In the high SNR region, zero-forcing beamforming performs much better than conjugate beamforming. In low SNR region, the two precoders perform similarly, with conjugate beamforming slightly outperforms zero-forcing beamforming. C. Optimal Number of Spatial Multiplexing Terminals Given the same number of LSAS antennas M , the same forward link SNR ρf and the same reverse link SNR ρr for the two pre-coders, their spectral efficiencies, represented by (1 − τr /T )Bsum,cj and (1 − τr /T )Bsum,zf , can be maximized over the number of spatial multiplexing terminals K and the number of symbols used for reverse link pilots τr simultaneously. Let Kcj and Kzf be the corresponding optimal number of spatial multiplexing terminals, we shall give a comparison of Kcj and Kzf in this subsection. The following corollary gives a necessary condition for a pair (K, τr ) to be optimal in the conjugate beamforming case. It states that for conjugate beamforming pre-coder, the optimal number of terminals Kcj must equal to the number of reverse link pilots used for channel state estimation. Corollary 5.4: For conjugate beamforming pre-coder, if (Kcj , τr ) maximizes the spectral efficiency (1 − τr /T )Bsum,cj , then Kcj = τr . Corollary 5.4 is obtained by observing that the factor (1 − τr /T ) is independent of the number of served terminals K and that Bsum,cj is an increasing function of K, and that K is subject to the constraint K ≤ τr . Recall from Section III that for a channel that is constant over T symbols, it is necessary that τr ≤ T . Thus τr cannot grow without bound. In fact, τr = T implies that all the resources are used for channel estimation, leaving no resources for data transmission. We remark that for zero-forcing beamforming pre-coder, the Corollary 5.4 remains valid in most cases but does not hold true in general since Bsum,zf may not be increasing with respect to the number of served terminal K when ρf is small enough. In general, in the high SNR region, i.e., when ρf >> 1, we have SINRzf > SINRcj . In this case the optimal number of terminals for the LSAS with zero-forcing beamforming pre-coder to achieve maximum capacity is larger than that for the LSAS with conjugate beamforming pre-coder. The opposite is true in the low SNR region since SINRzf < SINRcj when ρf << 1. As the number of transmit antennas M increases and the forward SNR ρf decreases toward zero, we have SINRzf ≈ SINRcj . In this case the optimal number of terminals for the two beamforming pre-coders becomes comparable. D. Maximal Spatial Multiplexing Gain A prominent advantage of the LSAS is the substantial capacity gain due to spatial multiplexing. The maximal spatial multiplexing gain is defined, for a given ρf , as the ratio of the maximal spectral efficiency, which is achieved when there are optimal number of terminals in the system, and the spectral efficiency when there is only one terminal is the system. As mentioned in the previous subsection, in the high SNR region, i.e., when ρf >> 1, we have SINRzf > SINRcj . In this case the maximal spatial multiplexing gain is larger for the LSAS with zero-forcing beamforming pre-coder. The opposite is true in the low SNR region since SINRzf < SINRcj when ρf << 1. As the number of transmit antennas M increases and the forward SNR ρf decreases toward zero, we have SINRzf ≈ SINRcj . In this case the maximal spatial multiplexing gains for the two beamforming pre-coders becomes comparable. We note that in general, more users do not necessary translate into higher multiplexing gain. This is can be seen in the examples in Section VI (Figures 3 and 4). E. Single Mobile Terminal Case A deeper understanding of the difference between conjugate pre-coder and zero-forcing pre-coder can be gained by considering the case when there is only one terminal in the system, i.e., when K = 1. In this case the conjugate and zero-forcing beamforming vectors are the same, except for their scalar multipliers. However, zero-forcing beamforming pre-coder performs better except for the very low SNR (e.g., SNR = -20dB in our examples in the next section) case, for which the performance are comparable between the two beamforming pre-coders. The reason is as follows. The corresponding equation (10) for conjugate beamforming contains two additional “interference terms” [11]: (a) cross interference from other served mobiles (b) variance of the beamforming gain Zero-forcing beamforming forces these two terms to zero. Thus these two terms are absent in equation (10). For K = 1, there is no cross interference, (a) is also zero for conjugate beamforming; (b) is the reason for the different performances between conjugate beamforming and zero-forcing pre-coders. On the one hand, due to different power normalization of the two pre-coders, the beamforming gain for zero-forcing is smaller by a factor (M − K)/M . On the other hand, the variance of zero-forcing beamforming gain is zero while the variance of conjugate beamforming gain [11] is 1 ρf τr ρr K τr ρr + 1 Thus for large enough ρf , the advantage of having zero variance in beamforming gain more than compensates the beamforming gain reduction factor (M − K)/M for zeroforcing, which can make zero-forcing beamforming perform better. But for very small ρf , the variance of conjugate YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS Fig. 1. Trade-off between relative energy-efficiency and net spectralefficiency for conjugate and zero-forcing beamforming (M = 100) 177 Fig. 2. Trade-off between relative energy-efficiency and net spectralefficiency for conjugate and zero-forcing beamforming (M = 400) beamforming gain is close to zero, having zero variance in beamforming gain does not gain much for zero-forcing precoder, and the reduction factor (M − K)/M for zeroforcing beamforming can actually make it perform worse than conjugate beamforming. VI. E XAMPLES Using the capacity bounds obtained above, we construct two examples which illustrate the performance of conjugate beamforming and zero-forcing beamforming pre-coders through trade-off curves that constitute relative energy-efficiency expressed in (bits/Joule)/(bits/Joule) versus net spectral efficiency (bits/second/Hz). The first example (Figure 1) is for the number of base station transmit antennas M = 100 and the second example (Figure 2) is for M = 400. The reverse link expected SNR is ρr = 1.0 (0 dB) and the equivalent slot-duration is T = 196 symbols. The choice of T = 196 = 14 × 14 is based on typical LTE OFDM parameters and a very short coherence interval comprising 14 OFDM symbols occupying a total of 1 millisecond where it is assumed that the channels are substantially constant over 14 consecutive 15 KHz subcarriers. Time-frequency pilots have to be transmitted within each 14-tone sub-band, and there is room for at most 196 orthogonal pilot sequences. The reference scenario is a conventional single-antenna system with forward link pilot and the expected forward link SNR is assumed to be ρf = 10.0 (10 dB). The spectral efficiency of the reference system is calculated to be 2.6474 bits/second/Hz, which is achieve when τf = 9 [11]. On each curve in Figures 1, 2, 3 and 4, each marker from left to right corresponds to the total forward power level ρf equals 0.01, 0.1, 1, 10, 100 and 1000 respectively, where a power level of 10 implies that the total radiated power is identical to that in the reference scenario. To achieve 1000 fold relative energy efficiency, with number of transmit antennas M = 400, we see that for an LSAS with conjugate beamforming pre-coder, the required ρf = 0.276 and the corresponding maximal spectral efficiency of 53.35 bits/s/Hz is achieved with Fig. 3. Optimal number of terminals K vs. total radiated power 53 spatial multiplexing users. For an LSAS with zero-forcing beamforming pre-coder, the required ρf = 0.303 and the corresponding maximal spectral efficiency of 60.23 bits/s/Hz is achieved with 49 spatial multiplexing users. Figures 3 shows the optimal number of terminals for each beamforming pre-coder to achieve maximal spectral efficiency at each forward power level for M = 100 and M = 400. As expected, in the high SNR region, the optimal number of terminals for the LSAS with zero-forcing beamforming pre-coder is larger than that for the LSAS with conjugate beamforming pre-coder. The opposite is true in the low SNR region. Figures 4 shows the maximum spatial multiplexing gain for each beamforming pre-coder at each forward power level for M = 100 and M = 400. As expected, the spatial multiplexing gain is larger for the LSAS with zero-forcing beamforming pre-coder in the high SNR region while the opposite is true in the low SNR region. 178 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 2, FEBRUARY 2013 TABLE I C OMPUTATION BURDEN COMPARISON FOR M = 100 AND ρf = 1 M = 100, ρf = 1 Bandwidth (MHz) 20 10 Conjugate Total (Gflops) 350.0 171.3 Zero-forcing Total (Gflops) 307.1 149.8 5 83.8 73.0 K 46 30 τr 46 30 TABLE II C OMPUTATION BURDEN COMPARISON FOR M = 400 AND ρf = 1 M = 400, ρf = 1 Bandwidth (MHz) 20 10 Conjugate Total (Gflops) 411.1 201.8 Zero-forcing Total (Gflops) 562.3 277.4 Fig. 4. Spatial multiplexing gain vs. total radiated power • VII. C OMPUTATIONAL B URDEN OF C ONJUGATE AND Z ERO - FORCING P RE - CODERS In addition to the ”over-the-air” capacity and radiated energy efficiency performance studied in the previous sections, another important energy efficiency performance indicator for a pre-coder is its internal power consumption. Some internal activities have a power consumption that is more or less independent of M and K. Others, such as LSAScritical computations are strongly dependent on M and K. The internal power consumption obviously depends on the computation burden, measured in Gflops (giga floating point operations per second), of the pre-coders, and the energyefficiency of computing measured in Gflops/Watt. In this section, we shall compare the computation burden of the conjugate beamforming pre-coder and zero-forcing precoder. For this purpose we introduce the following mathematical symbols, in addition to the notation listed in Section II Tsl : slot time, i.e., the time duration a slot occupies. B: Carrier bandwidth Ts : symbol duration Tu : usable symbol duration Td : OFDM guard time, where B is measured in Hz, and the times are measured in seconds. Note that with the definitions above, the slot symbolduration is T = TTsls · TTud , where TTsls is the number of OFDM symbols in the slot, and under the assumption that the channel delay-spread is equal to the guard interval, TTud is equal to the Nyquist-sampling frequency interval in units of tones. With OFDM modulation, within a slot time Tsl , the following LSAS-critical computations need to be carried out for every beamforming pre-coder. Counting in terms of floating point operations, we have • FFT’s/IFFT’s: Tsl M· · (Tu B) · log2 (Tu B) Ts • here TTsls is the number of symbols in Tsl and Tu B is the number of subcarriers in the total bandwidth B. Correlation of pilot signals with pilot sequences: (a) Direct correlation: M · (Td B) · K · τr 5 99.0 136.8 K 63 63 τr 63 63 (b) Fast correlation: M · (Td B) · τr · log2 (τr ) here Td B is the number of constant frequency blocks in the total bandwidth B. Implementation of pre-coding/de-coding: M · K · (T − τr ) · (Td B) Conjugate beamforming does not require any other computation while zero-forcing beamforming requires an additional pseudo-inverse, which requires M · K 2 · (Td B) floating point operations. Assuming that the forward SNR ρf = 1, with the corresponding optimal number of terminals K and optimal number of reverse link pilot τr derived in the previous section, we summarize the computation burdens for conjugate beamforming pre-coder and zero-forcing beamforming precoder in Tables I & II. Table I, which is for M = 100, shows that zero-forcing requires roughly 12% to 13% less computation than conjugate beamforming. Table II, which is for M = 400, shows that zero-forcing requires roughly 37% to 38% more computation than conjugate beamforming. The results are surprising in two ways. While it is obvious that for an LSAS, with the same number of served terminals K, zero-forcing beamforming pre-coder requires more computations than conjugate beamforming pre-coder, albeit that the increase is surprisingly small, but at the maximum achievable capacity, zero-forcing beamforming pre-coder actually requires less computation since it serves fewer terminals than conjugate beamforming pre-coder. The above calculations include only pre-coder related computations. We note that the above consideration on the internal energy efficiency of LSAS is by no means complete. For example, energy consumed in error-correction coding and de-coding, A/D and D/A converters, pre-amplifiers, mixers, oscillators, etc., should also be considered. VIII. C ONCLUSIONS AND D ISCUSSION Both conjugate beamforming and zero-forcing are viable linear pre-coding schemes for LSAS as exemplified by tradeoff curves between energy-efficiency and spectral-efficiency. Where the highest spectral-efficiency is desired, zero-forcing can perform appreciably better than conjugate beamforming. If, however, a more energy-efficient operating point is chosen YANG and MARZETTA: PERFORMANCE OF CONJUGATE AND ZERO-FORCING BEAMFORMING IN LARGE-SCALE ANTENNA SYSTEMS then the performance gap narrows, and eventually conjugate beamforming outperforms zero-forcing. Our quantification of the computational burden of LSAS yields some surprises. The extra computations required to obtain the channel pseudo-inverse for zero-forcing do not add greatly to the total burden. In the regime of high energyefficiency conjugate beamforming can have a greater total computational burden than zero-forcing because of the larger number of terminals served. Irrespective of computational considerations, conjugate beamforming may be preferable to zero-forcing because of its greater robustness, and that it lends itself to a de-centralized architecture and de-centralized signal processing whereby every service antenna keeps its channel estimates to itself, and performs its own signal processing independently. Service antennas can be added or deleted without changing the function of the system; indeed the central unit which is running the system almost does not have to know the number of service antennas employed. Our simplified scenario ignores range-dependent attenuation as well as shadow-fading, so there are no near-far effects to contend with. Further investigations that include near-far effects are warranted (in addition to multi-cell scenarios) but because near-far effects will tend to increase the conditionnumber of the propagation matrix, it is possible that the simplified scenario shows zero-forcing in its best light. ACKNOWLEDGMENT The authors would like to thank anonymous referees for their very constructive comments and corrections of errors. R EFERENCES [1] B. Hochwald, C. Peel, and A.L. Swindlehurst, “A Vector-perturbation technique for near-capacity multi-antenna multi-user communication part II: perturbation”, IEEE Transactions on Communications, March, 2005. [2] J. Hoydis, S. ten Brink and M. Debbah, “Massive MIMO: How many antennas do we need?”, arXiv:1107.1709v2 [cs.IT] 23 Sep 2011. [3] H. Huh, G. Caire, H. C. Papadopoulos and S. A. Ramprashad, “Achieving ‘massive MIMO’ spectral efficiency with a not-so-large number of antennas”, arXiv:1107.3862v2 [cs.IT] 13 Sep 2011. [4] T. L. Marzetta, “How much training is required for multiuser MIMO?”, in Fortieth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, pp. 359-363, Oct. 2006. [5] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas”, IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590-3600, Nov. 2010. [6] H. Q. Ngo, E. G. Larsson and T. L. Marzetta, “Energy and Spectral Efficiency of Very Large Multiuser MIMO Systems”, submitted to IEEE Transactions on Communications, arXiv:1112.3810v2 [cs.IT] 21 May 2012. [7] C. Peel, B. Hochwald, and A.L. Swindlehurst, “A Vector-Perturbation Technique for Near-Capacity Multi-Antenna Multi-User Communication Part I: Channel Inversion and Regularization”, IEEE Transactions on Communications, January, 2005. 179 [8] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays”, IEEE Signal Processing Magazine, to appear 2012 [9] A. M. Tulino and S. Verdu, Random Matrix Theory and Wireless Communications, Now Publishers Inc., 2004. [10] S. Vishwanath, N. Jindal, and A. Goldsmith, “Duality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels”, IEEE Trans. on Information Theory, Volume 49, Issue 10, pp. 2658-2668, Oct. 2003. [11] H. Yang and T. L. Marzetta, “Quantized array processing gain in largescale antenna systems”, Bell Labs technical report, September 2011. Hong Yang is a member of technical staff in the Alcatel-Lucent Bell Labs Mathematics of Networks and Communications Research Department in Murray Hill, New Jersey. Prior to that, he was with Alcatel-Lucent’s Radio Frequency Technology Systems Engineering Department, also in Murray Hill. He received his Ph.D. in applied mathematics from Princeton University, Princeton, New Jersey. Dr. Yang has worked extensively on LTE/CDMA wireless network air interface performance, including coverage link budget and voice/data capacity. He also worked several years developing algorithms for RF design tools. Prior to his tenure at Alcatel-Lucent, he worked in academia and for a start-up company. He has published research papers in well known electrical engineering, applied mathematics, and financial economics journals and conference proceedings. His current research interests include energy efficient wireless network design optimization and energy-coverage-capacity performance of Large-Scale Antenna Systems (LSAS). Thomas L. Marzetta was born in Washington, D.C. He received the PhD in electrical engineering from the Massachusetts Institute of Technology in 1978. His dissertation extended, to two dimensions, the three-way equivalence of autocorrelation sequences, minimum-phase prediction error filters, and reflection coefficient sequences. He worked for Schlumberger-Doll Research (1978 - 1987) to modernize geophysical signal processing for petroleum exploration. He headed a group at Nichols Research Corporation (1987 - 1995) which improved automatic target recognition, radar signal processing, and video motion detection. He joined Bell Laboratories in 1995 (formerly part of AT&T, then Lucent Technologies, now Alcatel-Lucent). He has had research supervisory responsibilities in communication theory, statistics, and signal processing. He specializes in multiple-antenna wireless, with a particular emphasis on the acquisition and exploitation of channel-state information. He is the originator of Large-Scale Antenna Systems which can provide huge improvements in wireless energy efficiency and spectral efficiency. Dr. Marzetta was a member of the IEEE Signal Processing Society Technical Committee on Multidimensional Signal Processing, a member of the Sensor Array and Multichannel Technical Committee, an associate editor for the IEEE Transactions on Signal Processing, an associate editor for the IEEE Transactions on Image Processing, and a guest associate editor for the IEEE Transactions on Information Theory Special Issue on Signal Processing Techniques for Space-Time Coded Transmissions (Oct. 2002) and for the IEEE Transactions on Information Theory Special Issue on SpaceTime Transmission, Reception, Coding, and Signal Design (Oct. 2003). Dr. Marzetta was the recipient of the 1981 ASSP Paper Award from the IEEE Signal Processing Society. He was elected a Fellow of the IEEE in Jan. 2003.
© Copyright 2026 Paperzz