IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 27, NO. 6, AUGUST 2009
Turbo Coded Multiple-Antenna Systems for
Near-Capacity Performance
Yeong-Luh Ueng, Chia-Jung Yeh, Mao-Chao Lin, and Chung-Li Wang
Abstract—For a turbo coded BLAST (Bell LAbs Space-Time
architecture) system with Nt transmit antennas and Nr receive
antennas, there is a significant gap between its detection threshold
and the capacity in case Nt > Nr . In this paper, we show
that by introducing a convolutional interleaver with block delay
between the BLAST mapper and the turbo encoder, the threshold
can be improved. Near-capacity thresholds can be achieved for
some cases. To take advantage of the low detector complexity in
Alamouti STBC (space-time block code), we also investigate a
STBC system, which is the concatenation of the Alamouti STBC
with a turbo trellis coded modulation. By using a proper labelling
and adding a convolutional interleaver with block delay to such
a STBC system, we achieve both lower error floors and lower
thresholds.
Index Terms—Iterative decoding, iterative detection, multiple-input multiple-output (MIMO), turbo codes, turbo principle, Bell
Labs Space-Time architecture (BLAST), space-time block code
(STBC).
I. INTRODUCTION
Multi-input multi-output (MIMO) systems with Nt
transmit antennas and Nr receive antennas are attractive for their capability to achieve higher data rates. In
[1], a MIMO system called BLAST (Bell LAbs Space-Time
architecture) has been proposed to provide spatial multiplexing
for achieving higher data rates. BLAST mapper can be serially
concatenated with an outer channel encoder to obtain time
diversity in the fast fading channel [2]-[9]. In [5], the channel
code used is a turbo code and the resultant scheme is a turbo
coded BLAST system. It was noted in [5] that, in the fast
fading channel, the extrinsic information transfer (EXIT) [32]
curve of the MIMO detector is not flat for the case of Nt > Nr
even if Gray mapping is employed. However, the EXIT curve
of a turbo code is almost a horizontal line and the code EXIT
curve is, therefore, poorly matched to the detector EXIT curve.
Hence, the decoding thresholds are distant from the capacities
for the case of Nt > Nr [5].
Manuscript received 30 September 2008; revised 16 January 2009. This
work was supported by National Science Council of the R.O.C. under grants
NSC 95-2221-E-007-035 and NSC 96-2221-E-002-092.
Yeong-Luh Ueng is with the Department of Electrical Engineering and
the Institute of Communications Engineering, National Tsing Hua University,
Hsinchu, Taiwan, R.O.C. (e-mail:[email protected]).
Chia-Jung Yeh is with the Graduate Institute of Communication
Engineering, National Taiwan University, Taipei, Taiwan, R.O.C. (email:[email protected]).
Mao-Chao Lin is with the Department of Electrical Engineering, National
Taiwan University, Taipei, Taiwan, R.O.C. (e-mail:[email protected])
Chung-Li Wang is with the Department of Electrical and Computer Engineering, University of California, Davis, CA, USA (email:[email protected]).
Digital Object Identifier 10.1109/JSAC.2009.090813.
For either irregular low-density parity-check (LDPC) [38] or
irregular repeat accumulate (IRA) [39] coded BLAST systems,
there is room for arranging the degree distributions of variable
or check nodes to match the EXIT curve of the MIMO detector
and hence such BLAST systems can achieve near-capacity
performance [8][9]. However, the LDPC codes or IRA codes
optimized for the multiple-antenna systems with Nt > Nr
usually have many degree-2 nodes. Hence, in case of short
block sizes, such LDPC coded BLAST (multiple-antenna)
systems have high error floors [31].
Unlike the approaches used in [8][9], in this paper, we
use an alternative approach to design coded BLAST systems
for near-capacity performance. We introduce a convolutional
interleaver [37] with block delay between the BLAST mapper
and the binary turbo encoder of the turbo coded BLAST
system in [5]. We show that the EXIT curves of the detector and
the turbo decoder in the proposed system can match well for the
cases of Nt = 2 with Nr = 1 or 2. Hence, near-capacity performance can be achieved. For some other cases, our system
can still achieve better decoding thresholds as compared to the
system in [5]. In the delay diversity scheme of [33][34], copies
of the same symbols are transmitted through multiple antennas
at different times to provide spatial diversity to combat fading
for reliable communication. In [44][45], channel coding is
integrated into the delay diversity scheme to provide coding
gain in addition to the diversity gain. The delay schemes in
[33][34][44][45] can be viewed as a convolutional interleaver
with symbol delay rather than block delay. Moreover, the
multiple-antenna systems in [33][34][44][45] are not designed
to achieve near-capacity performance.
For MIMO systems, space-time block codes (STBC)
[10][11] and space-time trellis codes (STTC) [12] can also
be used to provide spatial diversity. For STBC, there is a
subclass called orthogonal STBC, which has the advantage of
low detector complexity. The orthogonal STBC with Nt = 2
is the famous Alamouti code [10]. Orthogonal STBC can be
concatenated with a bandwidth-efficient outer code such as
trellis coded modulation (TCM) [13], bit-interleaved coded
modulation (BICM) [14]-[18], turbo TCM (TTCM) [19]-[23],
or LDPC-based coded modulation [24] to enhance its coding
gain or diversity gain [25]-[31][42]. It was demonstrated in
[31][43] that for the MIMO detector consisting of an Alamouti
STBC decoder and a demapper of Gray-labelled modulation,
the associated EXIT curve is almost flat. Hence, it is possible
to design a STBC system which is the concatenation of a
turbo trellis coded modulation and the Alamouti STBC [29]
using Gray-labelled modulation to achieve a good decoding
threshold since the EXIT curves of the MIMO detector and
the binary turbo decoder can match well.
In a way similar to that for turbo coded BLAST, we propose
to insert a convolutional interleaver with block delay between
the binary turbo encoder and signal mapper of the turbo coded
Alamouti STBC system in [29]. However, unlike the scheme in
[29], mixed labeling [16] instead of Gray labeling is employed
in the signal mapper of the proposed system so that a lower error
floor can be obtained. The detector EXIT curve of Alamouti
STBC with Gray labeling is close to horizontal while the
detector EXIT curve of Alamouti STBC with mixed labeling
is not. However, the block delay operation in the proposed
system can make the detector EXIT curve close to horizontal
even if mixed labeling is used. Hence, the EXIT curves of
detector and turbo decoder in the proposed system can match
well. In [28], a convolutional interleaver is applied to the
Alamouti STBC in conjunction with a tail-biting TCM. The
tail-biting TCM design in [28] is not suitable for constructing
multiple-antenna systems with good thresholds at long code
lengths.
We can apply the turbo principle to the proposed turbo
coded multiple-antenna systems by iteratively performing
decoding and detection between the turbo decoder and the
MIMO detector. For the turbo coded BLAST system, the
MIMO detector is a BLAST demapper while for the turbo
coded STBC system, the MIMO detector consists of a signal
demapper and a STBC decoder. The soft output of the turbo
decoder can be used to update the log-likelihood ratio (LLR)
output of the MIMO detector. Because delay elements are
introduced in the proposed systems, adjacent turbo
codewords are correlated, and hence optimum decoding is very
difficult. We can resort to suboptimum decoding. The simplest
is that the MIMO detector and the turbo decoder exchange
extrinsic information within each single turbo codeword. Such
a decoding method is called iterative decoding within a single
codeword (IDSC). We can improve the error performance
by exchanging extrinsic information between adjacent turbo
codewords. Such a decoding method is called iterative decoding between adjacent codewords (IDAC). Two types of
IDAC with different decoding delays and complexities will be
investigated.
The remainder of this paper is organized as follows. The
turbo coded BLAST system in [5] and the proposed turbo
coded BLAST system are described in Sections II and III,
respectively. A turbo coded Alamouti STBC system in [29]
and the proposed turbo coded Alamouti STBC system are
discussed in Section IV. This paper concludes in Section V.
II. A TURBO CODED BLAST SYSTEM
In this section, we review a turbo coded BLAST system
[5], for which the transmitter is implemented by serially
concatenating an interleaver and a BLAST mapper to a binary
turbo encoder.
A. Transmitter
The w-bit message ū is encoded by a rate-R binary turbo
encoder to yield a turbo codeword of N code bits, where N
= qK, K = mNt , and 2^m is the constellation size for each
transmit antenna. Since an interleaver is inserted between the
turbo encoder and the BLAST mapper, the turbo codeword
is interleaved before being divided into q groups. Let b̄ =
[b̂0 , · · · , b̂Nt −1 ]T = [b0 , · · · , bK−1 ]T denote such a group,
where each b̂i is a binary m-tuple, bk ∈ {0, 1}, and T denotes
the transpose. The output of the BLAST mapper is represented
by the symbol vector s̄ = [s0 , · · · , sNt −1 ]T , where si is a
constellation point labelled by b̂i and is transmitted through
the (i + 1)-th transmit antenna. Let Es be the average energy
of s̄. Furthermore, we require that E[|si|^2] = Es /Nt for i =
0, 1, · · · , Nt − 1. In this paper, we consider the time required
for the transmission of N code bits as one block unit. There
are q MIMO transmissions (channel uses) within one block
unit and for each MIMO transmission (channel use), K code
bits are transmitted.
B. Channel Model
In this paper, we consider the Rayleigh fading channel.
Elements of the Nr × Nt channel matrix H are independently
and identically distributed zero-mean complex Gaussian random variables with independent real and imaginary parts each
having variance of 0.5. We assume fast fading. Hence, the
channel matrices H at different time instants are independent. In addition, it is assumed that H is unknown to the
transmitter and is known perfectly to the receiver. Let n̄ be
an Nr -tuple consisting of Gaussian entries with covariance
matrix $Q = E[\bar{n}\bar{n}^*] = N_0 I_{N_r}$, where n̄∗ denotes the conjugate
transpose of n̄ and INr is the Nr × Nr identity matrix. Let
rj be the received signal of the (j + 1)-th receive antenna.
We have r̄ = H s̄ + n̄, where r̄ = [r0 , · · · , rNr −1 ]T . The
normalized signal-to-noise ratio (SNR) Eb /N0 is defined as [8]
$$\left.\frac{E_b}{N_0}\right|_{\mathrm{dB}} = \left.\frac{E_s}{N_0}\right|_{\mathrm{dB}} + 10\log_{10}\frac{N_r}{R N_t m}.$$
C. MIMO Detector: BLAST Demapper
Let c̄i be the (K − 1)-bit binary representation of i and b̄k−
= [b0 , · · · , bk−1 , bk+1 , · · · , bK−1 ]T . For each code bit bk ,
k = 0, 1, · · · , K − 1, let
$$L_{M,a}(b_k) = \ln\frac{\Pr(b_k = 1)}{\Pr(b_k = 0)}$$
be the a priori LLR and
$$\bar{L}_{M,a}(\bar{b}_{k-}) = [L_{M,a}(b_0), \cdots, L_{M,a}(b_{k-1}), L_{M,a}(b_{k+1}), \cdots, L_{M,a}(b_{K-1})]^T. \tag{1}$$
From the channel model, we have the conditional
probability density function
$$\Pr(\bar{r} \mid H, \bar{s}) = \det(2\pi Q)^{-1/2}\, e^{-\frac{1}{2}(\bar{r} - H\bar{s})^* Q^{-1} (\bar{r} - H\bar{s})}.$$
For a BLAST demapper, the a posteriori LLR for each code bit bk is
given by
$$L_{M,p}(b_k \mid \bar{r}) = L_{M,a}(b_k) + \ln\frac{y(1)}{y(0)} \tag{2}$$
where
$$y(l) = \sum_{i=0}^{2^{K-1}-1} \exp\!\left(-\frac{\left\|\bar{r} - H \cdot \mathrm{map}\big([(\bar{c}_i)_{0:k-1}\; l\; (\bar{c}_i)_{k:K-2}]\big)\right\|^2}{2N_0}\right) \exp\!\left[\bar{c}_i \cdot \bar{L}_{M,a}(\bar{b}_{k-})\right], \quad l = 0, 1,$$
map(·) denotes the BLAST
mapping of the associated mNt -tuple, and (c̄i )a:b denotes the
(b − a + 1)-tuple containing the a-th to the b-th components
of c̄i .
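The demapper rule in (2) can be illustrated with a small exhaustive-search sketch. This is only a sketch under stated assumptions, not the authors' implementation: the QPSK labelling table, the function names (`blast_map`, `demap_llr`, `log_sum_exp`) and the default Es = 1 are illustrative choices. The a priori term of every bit is folded into one metric per hypothesis, so the ratio of the two sums equals LM,a(bk) + ln y(1)/y(0).

```python
import itertools
import math

# Gray-labelled QPSK points (an assumed labelling, chosen for illustration).
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}


def blast_map(bits, nt, es=1.0):
    """Map K = 2*nt bits to nt QPSK symbols with E[|s_i|^2] = es / nt."""
    scale = math.sqrt(es / (2.0 * nt))  # each unscaled point has energy 2
    return [scale * QPSK[(bits[2 * i], bits[2 * i + 1])] for i in range(nt)]


def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def demap_llr(r, h, la, n0, es=1.0):
    """A posteriori LLRs L_{M,p}(b_k | r) of equation (2) by exhaustive search.

    r  : list of Nr received samples
    h  : Nr x Nt channel matrix as a list of rows
    la : K = 2*Nt a priori LLRs, ln Pr(b=1)/Pr(b=0)
    n0 : noise parameter N0 appearing in equation (2)
    """
    nr, nt = len(h), len(h[0])
    k_bits = 2 * nt
    metrics = {}
    for bits in itertools.product((0, 1), repeat=k_bits):
        s = blast_map(bits, nt, es)
        dist2 = sum(
            abs(r[j] - sum(h[j][i] * s[i] for i in range(nt))) ** 2
            for j in range(nr)
        )
        # Likelihood exponent plus the a priori LLR of every bit equal to 1.
        metrics[bits] = -dist2 / (2.0 * n0) + sum(b * l for b, l in zip(bits, la))
    llrs = []
    for k in range(k_bits):
        ones = log_sum_exp([m for b, m in metrics.items() if b[k] == 1])
        zeros = log_sum_exp([m for b, m in metrics.items() if b[k] == 0])
        llrs.append(ones - zeros)  # equals L_{M,a}(b_k) + ln y(1)/y(0)
    return llrs
```

The search visits all 2^K labels, which is practical only for small K such as K = 4 in the (Nt, Nr) = (2, 1) QPSK case.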
[Fig. 1 plot. Horizontal axis: IM,a((ū_ℓ, p̄_ℓ)) [ID,e((ū_ℓ, p̄_ℓ))]. Vertical axis: IM,e((ū_ℓ, p̄_ℓ)) [ID,a((ū_ℓ, p̄_ℓ))]. Curves: Detector, (Nt, Nr) = (2, 1), proposed, for Ia(v̄_{2,ℓ+1}) = 0, 0.5 and 1; Detector, original, for (Nt, Nr) = (2, 1), (1, 1) and (4, 1); Decoder.]
Fig. 1. EXIT curves of MIMO detector (BLAST demapper at Eb/N0 = 3.9
dB) and turbo decoder (NI = 20) for the proposed turbo coded BLAST
system and the original turbo coded BLAST system in [5].
D. Iterative Detection and Decoding
The receiver consists of a turbo decoder and a MIMO
detector. The MIMO detector and the turbo decoder iteratively
exchange extrinsic information by the turbo principle. The
soft-in/soft-out MIMO detector computes the a posteriori
values LM,p and then sends LM,e = LM,p − LM,a to the
turbo decoder for the code bits representing s̄, where LM,e
consists of channel information and extrinsic information.
Then, LM,e is deinterleaved and then is taken as the a priori
value LD,a of the decoder for further iterative decoding steps.
Through the soft-in/soft-out decoding with NI iterations of
the turbo decoder, we have LD,p and LD,e = LD,p − LD,a .
After interleaving, LD,e is fed back to the MIMO detector and
will be used as LM,a . With updated LM,a , the detector will
update its LLR output, LM,e .
E. EXIT Curves for BLAST Demapper and Turbo Decoder
The convergence performance of iterative detection and decoding can be analyzed by using EXIT charts based on mutual
information [5]. Let Ia [ck , La (ck )] denote the mutual information between bit ck and its a priori L-value (LLR) La (ck ).
Similarly, Ie [ck , Le (ck )] denotes the mutual information between bit ck and its extrinsic L-value Le (ck ). In addition, we
use Ia (c̄) and Ie (c̄) to denote the average mutual information
$$I_a(\bar{c}) = \frac{1}{N}\sum_{k=0}^{N-1} I_a[c_k, L_a(c_k)] \quad \text{and} \quad I_e(\bar{c}) = \frac{1}{N}\sum_{k=0}^{N-1} I_e[c_k, L_e(c_k)]$$
of bits in c̄ ≡ (c0 , c1 , · · · , cN −1 ), respectively. In the calculation of
Ia (c̄) and Ie (c̄), N = 2097152.
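One common way to estimate such an average mutual information from simulated L-values is the time-average estimator sketched below. It is offered only as an illustration and is not necessarily the estimator used in the paper: it assumes the L-values are true LLRs defined as ln Pr(c = 1)/Pr(c = 0), and the function name `mutual_information` is hypothetical.

```python
import math


def mutual_information(bits, llrs):
    """Estimate I[c; L] as 1 - mean(log2(1 + exp(-x * L))), x = +1 for c = 1.

    Assumes the LLRs are consistent, i.e. each L equals ln Pr(c=1)/Pr(c=0)
    given the observation that produced it.
    """
    total = 0.0
    for bit, llr in zip(bits, llrs):
        x = 1.0 if bit == 1 else -1.0
        z = -x * llr
        # log2(1 + exp(z)), written so that exp never overflows
        total += (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)
    return 1.0 - total / len(bits)
```

All-zero L-values give an estimate of 0, and large L-values with the correct sign give an estimate close to 1.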
Let (ū_ℓ, p̄_ℓ) denote the ℓ-th turbo codeword with N bits
for which bits in (ū_ℓ, p̄_ℓ) are used as the labeling bits of the
BLAST mapper, where ū_ℓ and p̄_ℓ represent the interleaved
message bits and the interleaved parity bits, respectively. Let
IM,a((ū_ℓ, p̄_ℓ)) and IM,e((ū_ℓ, p̄_ℓ)) denote the a priori and
extrinsic mutual information of the demapper, respectively. In
addition, we denote the demapper extrinsic transfer characteristic or EXIT curve of the turbo coded BLAST system in [5]
by TM , i.e., IM,e = TM (IM,a , Eb /N0 , Nt , Nr ). We obtain TM
by Monte Carlo simulation based on the assumption that the a
priori values LM,a of the demapper are Gaussian distributed
with variance $\sigma_{M,a}^2$ and mean value $\sigma_{M,a}^2/2$ [5]. Fig. 1 shows
the demapper EXIT curves TM with Gray-mapped QPSK for
the cases of (Nt , Nr ) = (4, 1), (2, 1), and (1, 1) respectively.
We see that these curves resemble straight lines and meet at
IM,a = 1. It can be shown that any 1 × Nr curve for Gray-mapped QPSK is a horizontal line [8][36]. For example, the
1 × 1 curve TM (Nt = 1, Nr = 1) is a horizontal line with
value of $E[J(\sqrt{8R\,(E_b/N_0)\,|h|^2})]$, where h is a zero mean, unit
variance, complex Gaussian random variable and function J(·)
is given by
$$J(\sigma) = 1 - \int_{-\infty}^{\infty} \frac{\exp[-(z - \sigma^2/2)^2/(2\sigma^2)]}{\sqrt{2\pi\sigma^2}}\, \log_2[1 + \exp(-z)]\, dz. \tag{3}$$
Note that $J(\sqrt{8E_s/N_0})$ and $E[J(\sqrt{8\,(E_s/N_0)\,|h|^2})]$ are the capacities
of BPSK signals over the 1 × 1 additive white Gaussian noise
(AWGN) channel and Rayleigh fading channel, respectively
[8]. Also included in Fig. 1 is the EXIT curve ID,e =
TD (ID,a ) of a rate-1/2 turbo decoder, where ID,a((ū_ℓ, p̄_ℓ))
and ID,e((ū_ℓ, p̄_ℓ)) denote the a priori and extrinsic mutual
information of the turbo decoder, respectively. The generator
matrix of the 4-state constituent codes used in the turbo
decoder is (1, 5/7)_8 . It is obvious that TD is not a function
of Eb /N0 . In contrast, an increase in Eb /N0 will result in
a vertical shift of the demapper EXIT curves toward higher
output values.
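As a numerical illustration of (3), the sketch below evaluates J(σ) with the trapezoidal rule and averages it over Rayleigh fading, using the fact that |h|^2 is exponentially distributed with unit mean when h is a zero-mean, unit-variance complex Gaussian variable. The function names, integration ranges and step counts are illustrative assumptions rather than anything specified in the paper.

```python
import math


def j_function(sigma, steps=2000):
    """J(sigma) of equation (3), evaluated with the trapezoidal rule."""
    if sigma < 1e-6:
        return 0.0
    var = sigma * sigma
    mean = var / 2.0
    lo, hi = mean - 10.0 * sigma, mean + 10.0 * sigma  # covers the Gaussian mass
    dz = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * dz
        pdf = math.exp(-((z - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        # log2(1 + exp(-z)), written to avoid overflow for large |z|
        log_term = (max(-z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * pdf * log_term * dz
    return 1.0 - total


def rayleigh_average(a, grid=200, g_max=20.0):
    """E[J(sqrt(a * |h|^2))] where |h|^2 is exponential with unit mean."""
    dg = g_max / grid
    total = 0.0
    for i in range(grid + 1):
        g = i * dg
        weight = 0.5 if i in (0, grid) else 1.0
        total += weight * math.exp(-g) * j_function(math.sqrt(a * g), steps=400) * dg
    return total
```

With a = 8R·Eb/N0 (in linear scale), `rayleigh_average(a)` approximates the height of the horizontal 1 × 1 demapper curve discussed above, and `j_function(math.sqrt(a))` gives the corresponding AWGN value.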
From the area property described in [35][36], the areas
$$A_M = \int_0^1 T_M(I_{M,a})\, dI_{M,a} \quad \text{and} \quad A_D = \int_0^1 T_D(I_{D,a})\, dI_{D,a}$$
provide good approximations of I(s̄, r̄)/K and (1 − R),
respectively, where I(s̄, r̄) is the mutual information between
the input symbol s̄ and output symbol r̄ of the channel, and
convergence of iterative detection and decoding is possible
for AM > (1 − AD ), i.e., R < I(s̄, r̄)/K. Since an area
gap between the inner demapper and outer decoder EXIT
curves directly relates to a rate loss, both curves should be
matched to each other to minimize this gap [36]. From the
curves respectively labelled by "Detector, (Nt , Nr ) = (2, 1),
original" and "Detector, (Nt , Nr ) = (4, 1), original" in Fig. 1,
we see that the EXIT curve of the BLAST demapper is not
flat for the case of Nt > Nr . However, the EXIT curve of
a turbo code is almost a horizontal line [31] and the code
EXIT curve is, therefore, poorly matched to the demapper
EXIT curve. Hence, the rate loss is large and the threshold is
distant from the capacity for the case of Nt > Nr [5]. Taking
(Nt , Nr ) = (2, 1) as an example, the capacity of such a MIMO
system is at Eb /N0 = 3.25 dB, while the threshold of this turbo coded
BLAST system is 4.4 dB.
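The area condition can be checked numerically once the two EXIT curves have been sampled. The following is a minimal sketch with hypothetical helper names (`area_under_curve`, `rate_supported`); the sample points in the usage note are made up for illustration and are not values from Fig. 1.

```python
def area_under_curve(xs, ys):
    """Trapezoidal approximation of the area under a sampled EXIT curve."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)
    )


def rate_supported(area_m, area_d):
    """Necessary area condition A_M > 1 - A_D for convergence."""
    return area_m > 1.0 - area_d
```

For example, a demapper curve sampled as `xs = [0, 0.5, 1]`, `ys = [0.4, 0.5, 0.6]` has area 0.5, so with a decoder area of 0.5 (rate 1/2) the condition is not met, since it requires strict inequality.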
III. A TURBO CODED BLAST SYSTEM FOR
NEAR-CAPACITY PERFORMANCE
In the following, we propose to introduce a convolutional
interleaver with block delay between the turbo encoder and
the BLAST mapper to effectively flatten the demapper EXIT
curves. The proposed BLAST system can achieve near-capacity performance for (Nt , Nr ) = (2, 1) and (2, 2) with
R = 1/2. For many other cases, our system can achieve better
decoding thresholds as compared to the original turbo coded
BLAST system in [5]. In Sections III.A to III.F, we use the
case of (Nt , Nr ) = (2, 1), R = 1/2, and Gray mapped QPSK,
i.e., m = 2, to illustrate the proposed system. The cases of
other antenna configurations such as Nt = 4 and Nr > 1 will
be discussed in Section III.G.
[Fig. 2 block diagram. Components: binary source, turbo encoder, interleavers Π1 and Π2, one-block-unit delay DB, BLAST mapper, channel, BLAST demapper with memory, turbo decoder, hard decision.]
Fig. 2. Transmitter and receiver of the proposed turbo coded BLAST system. (Nt , Nr ) = (2, 1).
A. System Description
Fig. 2 shows the schematic diagram of the transmitter and
receiver of the proposed system with (Nt , Nr ) = (2, 1), where
DB is a one-block-unit delay operator. The number of delay
elements in bits within a one-block-unit delay is N/2, where
N is the number of bits in each turbo codeword. Like the
original system in [5], our system can transmit N code bits
per block unit and transmit two message bits (or four code
bits) per channel use. With the block-unit delay operator, the
input to the BLAST mapper at the ℓ-th block unit is the block
(v̄_{1,ℓ}, v̄_{2,ℓ}) = (ū_{ℓ−1}, p̄_ℓ), where ū_{ℓ−1} represents the interleaved
message bits of the (ℓ−1)-th turbo codeword and p̄_ℓ represents
the interleaved parity bits of the ℓ-th turbo codeword. Bits in
v̄_{j,ℓ} are used to label the QPSK signals transmitted through
the j-th transmit antenna for j = 1, 2.
The BLAST demapper tries to recover (ū_{ℓ−2}, p̄_{ℓ−1}),
(ū_{ℓ−1}, p̄_ℓ), (ū_ℓ, p̄_{ℓ+1}), · · · respectively, while the turbo
decoder tries to recover (ū_{ℓ−2}, p̄_{ℓ−2}), (ū_{ℓ−1}, p̄_{ℓ−1}),
(ū_ℓ, p̄_ℓ), (ū_{ℓ+1}, p̄_{ℓ+1}), · · · respectively. For the turbo decoder, efficiently decoding (ū_ℓ, p̄_ℓ) requires the information of
(ū_{ℓ−1}, p̄_ℓ) and (ū_ℓ, p̄_{ℓ+1}) passed from the BLAST demapper.
For the BLAST demapper, efficiently demapping (ū_{ℓ−1}, p̄_ℓ)
requires the information of (ū_{ℓ−1}, p̄_{ℓ−1}) and (ū_ℓ, p̄_ℓ) passed
from the turbo decoder. With the application of the delay operation
(or equivalently the convolutional interleaver) between the turbo
encoder and the BLAST mapper, the transmitter outputs at
all block units are correlated. Similar phenomena can be
observed in MIMO systems using delay diversity [33][34] or
delay diversity in conjunction with channel coding [44][45].
For these systems, a maximum likelihood sequence estimator
(MLSE) or maximum likelihood decoding based on the Viterbi
decoder is employed to obtain optimum performance. Such
detection or decoding requires the observation of a long
MIMO sequence. For the proposed turbo coded BLAST
system, the decoding trellis for optimum decoding is not
available and the decoding delay would be too long even if
the decoding trellis were available. In the following, we provide
three suboptimum decoding methods, which do not require
the observation of the long sequence. The first is iterative
decoding within a single codeword (IDSC); the second and the
third are both called iterative decoding between adjacent codewords, denoted IDAC-I and IDAC-II respectively. IDSC only
requires the channel observation of (ū_{ℓ−1}, p̄_ℓ) and (ū_ℓ, p̄_{ℓ+1})
to decode (ū_ℓ, p̄_ℓ). IDAC-I requires the channel observation
of (ū_{ℓ−1}, p̄_ℓ), (ū_ℓ, p̄_{ℓ+1}) and (ū_{ℓ+1}, p̄_{ℓ+2}) to decode (ū_ℓ, p̄_ℓ).
The required channel observation for IDAC-II is similar to
that of IDSC.
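The effect of the block-unit delay on the mapper input can be sketched as a small generator. This is an illustration only: the function name `block_delay_stream`, the all-zero initial state of the delay line, and the extra flushing block with all-zero parity bits are assumptions made here and are not taken from the paper.

```python
def block_delay_stream(codewords, half_length):
    """Yield the mapper input (v1, v2) = (u_{l-1}, p_l) for each block unit.

    codewords   : iterable of (u, p) pairs, each a list of N/2 interleaved bits
    half_length : N/2, the number of bits held by the delay operator D_B
    The delay line is assumed to start in the all-zero state, and one extra
    block with all-zero parity bits is emitted to flush the last u (both are
    illustrative choices, not taken from the paper).
    """
    delayed_u = [0] * half_length
    for u, p in codewords:
        yield delayed_u, p  # antenna 1 carries u_{l-1}, antenna 2 carries p_l
        delayed_u = u
    yield delayed_u, [0] * half_length
```

Each turbo codeword is therefore spread over two consecutive block units, which is why decoding (ū_ℓ, p̄_ℓ) needs the channel observations of both blocks.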
B. Iterative Decoding within a Single Codeword (IDSC)
Now we present how to use IDSC to decode (ū_ℓ, p̄_ℓ), given
that LD,e(ū_{ℓ−1}) has been obtained.
Step 1 Through the BLAST demapper, we obtain the a posteriori LLR values LM,p(p̄_ℓ), computed by (2) with
LM,a(ū_{ℓ−1}) (or equivalently LD,e(ū_{ℓ−1})), which is
the LLR obtained in the decoding of the previous
turbo codeword. Note that LM,a(ū_{ℓ−1}) provides one
half of the a priori LLR values for the demapper.
The other half of the a priori LLR values, provided by
LM,a(p̄_ℓ), is zero.
Step 2 In a way similar to Step 1, we obtain the a posteriori LLR values LM,p(ū_ℓ), computed by (2) with
LM,a(ū_ℓ) = LM,a(p̄_{ℓ+1}) = 0̄, where 0̄ is the all-zero
N/2-tuple.
Step 3 The turbo decoder uses LD,a(ū_ℓ) (or equivalently
LM,e(ū_ℓ)) and LD,a(p̄_ℓ) (or equivalently LM,e(p̄_ℓ))
as input to yield the a posteriori LLR output values,
LD,p(ū_ℓ) and LD,p(p̄_ℓ), after NI iterations within
the turbo decoder.
Step 4 The turbo decoder computes the LLR values
LD,e(ū_ℓ) = LD,p(ū_ℓ) − LD,a(ū_ℓ) and LD,e(p̄_ℓ) =
LD,p(p̄_ℓ) − LD,a(p̄_ℓ), which are then
interleaved to become the LLR values in LM,a(ū_ℓ)
and LM,a(p̄_ℓ) that are fed into the BLAST demapper.
Step 5 Through the BLAST demapper with the LLR values LM,a(p̄_ℓ) and LM,a(ū_ℓ) obtained in Step
4, LM,a(ū_{ℓ−1}) (or equivalently LD,e(ū_{ℓ−1})) obtained in decoding the previous turbo codeword,
and LM,a(p̄_{ℓ+1}) = 0̄, we update the a posteriori LLR
values LM,p(p̄_ℓ) and LM,p(ū_ℓ) using (2).
Step 6 Compute LM,e(ū_ℓ) = LM,p(ū_ℓ) − LM,a(ū_ℓ) and
LM,e(p̄_ℓ) = LM,p(p̄_ℓ) − LM,a(p̄_ℓ).
Step 7 If the number of iterations between the turbo decoder
and BLAST demapper is below the maximum limit
NO , then go to Step 3. Otherwise, we estimate ū_ℓ
through decision on LD,p(ū_ℓ) and store the LLR values in LD,e(ū_ℓ), which will be used in the detection
of the next block.
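The control flow of Steps 1 to 7 can be summarized in a short sketch. It only shows the order in which L-values are exchanged: the demapper and turbo decoder are passed in as hypothetical callables (`demap`, `turbo_decode`), interleaving and deinterleaving are assumed to happen inside them, and the hard decision assumes the LLR sign convention ln Pr(b = 1)/Pr(b = 0) used in (2).

```python
def idsc(demap, turbo_decode, la_u_prev, block, n_outer=3):
    """Control flow of IDSC (Steps 1 to 7) for the codeword with index `block`.

    demap(index, la_v1, la_v2) -> (lp_v1, lp_v2): hypothetical demapper for
        channel block `index`, returning a posteriori LLRs for both antennas.
    turbo_decode(la_u, la_p) -> (lp_u, lp_p): hypothetical turbo decoder.
    la_u_prev: L_{D,e}(u_{ell-1}) stored while decoding the previous codeword.
    Requires n_outer >= 1.
    """
    zeros = [0.0] * len(la_u_prev)

    def sub(a, b):
        return [x - y for x, y in zip(a, b)]

    # Step 1: block ell carries (u_{ell-1}, p_ell); p_ell has no a priori yet.
    _, lp_p = demap(block, la_u_prev, zeros)
    le_p = sub(lp_p, zeros)
    # Step 2: block ell+1 carries (u_ell, p_{ell+1}); no a priori at all.
    lp_u, _ = demap(block + 1, zeros, zeros)
    le_u = sub(lp_u, zeros)

    for _ in range(n_outer):
        # Step 3: turbo decoding with the demapper extrinsic values as input.
        ldp_u, ldp_p = turbo_decode(le_u, le_p)
        # Step 4: decoder extrinsic values become demapper a priori values.
        lde_u, lde_p = sub(ldp_u, le_u), sub(ldp_p, le_p)
        # Step 5: update both channel blocks that carry this codeword.
        _, lp_p = demap(block, la_u_prev, lde_p)
        lp_u, _ = demap(block + 1, lde_u, zeros)
        # Step 6: new demapper extrinsic values.
        le_u, le_p = sub(lp_u, lde_u), sub(lp_p, lde_p)
    # Step 7: hard decision, and keep L_{D,e}(u_ell) for the next codeword.
    u_hat = [1 if value > 0 else 0 for value in ldp_u]
    return u_hat, lde_u
```

The returned `lde_u` plays the role of LD,e(ū_ℓ), which the next call would receive as `la_u_prev`.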
C. Demapper EXIT Curves
Now we investigate the demapper EXIT curves for the proposed system with (Nt , Nr ) = (2, 1). For our system, we must
consider bits in v̄_{1,ℓ+1} = ū_ℓ and v̄_{2,ℓ} = p̄_ℓ for the calculation
of demapper EXIT curves. In addition, the calculation is based
on Ia(v̄_{1,ℓ}) = Ia(ū_{ℓ−1}) = 1 and Ia(v̄_{2,ℓ+1}) = Ia(p̄_{ℓ+1}) = x,
0 ≤ x ≤ 1. For IDSC, Ia(v̄_{2,ℓ+1}) = 0. Here, we also examine
the effect of Ia(v̄_{2,ℓ+1}) with various values. In addition, we
use IM,e((ū_ℓ, p̄_ℓ)) to denote the average of IM,e(ū_ℓ) and
IM,e(p̄_ℓ) since the statistics of IM,e(ū_ℓ) and IM,e(p̄_ℓ) are
different.
Denote the demapper EXIT function of the proposed system
by T′M , i.e., IM,e = T′M (IM,a ). Suppose that Ia(v̄_{2,ℓ+1}) = 1.
For p̄_ℓ, we can completely cancel the interference from
ū_{ℓ−1} and obtain LLR values LM,e(p̄_ℓ) by using the assumption of Ia(ū_{ℓ−1}) = 1, i.e., ū_{ℓ−1} is known. In this
condition, the channel encountered by p̄_ℓ can be regarded
as a 1 × 1 channel. Hence, IM,e(p̄_ℓ) = TM (Nt = 1, Nr = 1)
= $E[J(\sqrt{8R\,(E_b/N_0)\,|h|^2})]$. By applying the same
argument to ū_ℓ based on Ia(v̄_{2,ℓ+1}) = 1, we have IM,e(ū_ℓ)
= $E[J(\sqrt{8R\,(E_b/N_0)\,|h|^2})]$. Taking the average of IM,e(p̄_ℓ) and
IM,e(ū_ℓ), we have IM,e((ū_ℓ, p̄_ℓ)) = $E[J(\sqrt{8R\,(E_b/N_0)\,|h|^2})]$,
which is independent of IM,a((ū_ℓ, p̄_ℓ)). Hence, the demapper
EXIT curve T′M for Ia(v̄_{2,ℓ+1}) = 1 is horizontal. This
phenomenon is very encouraging. We need to examine the
condition that Ia(v̄_{2,ℓ+1}) is less than 1. However, we cannot
obtain a closed-form representation of IM,e(ū_ℓ) and hence of
IM,e((ū_ℓ, p̄_ℓ)) for Ia(v̄_{2,ℓ+1}) other than 1. Instead, we obtain
IM,e((ū_ℓ, p̄_ℓ)) through Monte Carlo simulation [5]. Fig. 1
shows T′M for Ia(v̄_{2,ℓ+1}) = 1, 0.5 and 0 respectively. We
see that T′M is almost horizontal for any investigated value
of Ia(v̄_{2,ℓ+1}). Hence, introducing delay elements between
the turbo encoder and the BLAST mapper can effectively
flatten the EXIT curves of the BLAST demapper. From
Fig. 1, we also observe that larger Ia(v̄_{2,ℓ+1}) results in
larger IM,e((ū_ℓ, p̄_ℓ)). This result fits our intuition that more
information about v̄_{2,ℓ+1}, which is part of the adjacent
turbo codeword (ū_{ℓ+1}, p̄_{ℓ+1}), is
helpful to the decoding of the current turbo codeword (ū_ℓ, p̄_ℓ).
In the following section, we provide two practical methods for
obtaining information of v̄_{2,ℓ+1}.
D. Iterative Decoding between Adjacent Codewords (IDAC):
IDAC-I and IDAC-II
We now propose two methods of obtaining information of
v̄_{2,ℓ+1}. For the first method, information of v̄_{2,ℓ+1} is obtained
by turbo decoding all the coded bits of the adjacent turbo
codeword, i.e., (ū_{ℓ+1}, p̄_{ℓ+1}). For the second method, information of v̄_{2,ℓ+1} is obtained by convolutional decoding some
coded bits of (ū_{ℓ+1}, p̄_{ℓ+1}). The value of Ia(v̄_{2,ℓ+1}) obtained
by using the first method is in general larger than that obtained
by using the second method. For example, the values of
Ia(v̄_{2,ℓ+1}) at Eb /N0 = 3.9 dB obtained by using the first and
the second methods are 0.132 and 0.098, respectively. On the
other hand, we can obtain improved LLR values LM,e(v̄_{2,ℓ+1})
(or equivalently LD,a(v̄_{2,ℓ+1})) by cancelling the interference
from v̄_{1,ℓ+1} if information of v̄_{1,ℓ+1}, i.e., Ia(v̄_{1,ℓ+1}) > 0, is
available. Information of v̄_{1,ℓ+1} is obtained by using IDSC to
decode (ū_ℓ, p̄_ℓ). We can repeat such procedures to gradually
obtain larger values of Ia(v̄_{1,ℓ+1}) and Ia(v̄_{2,ℓ+1}) and hence
improve the error performance of (ū_ℓ, p̄_ℓ). Such a decoding
method is called iterative decoding between adjacent codewords (IDAC). IDAC using the first (second) method to obtain
the information of v̄_{2,ℓ+1} is called IDAC-I (IDAC-II). IDAC-I can provide better error performance at the cost of longer
decoding delay and higher complexity as compared to IDAC-II. Given that LD,e(ū_{ℓ−1}) has been obtained, IDAC-I is
described as follows.
Step 1 With LD,e(ū_ℓ) = 0̄, we use IDSC with NO iterations
to decode (ū_{ℓ+1}, p̄_{ℓ+1}) and obtain LD,e(p̄_{ℓ+1}). Note
that, in the demapper, the calculation of LM,p(p̄_{ℓ+1})
is based on LM,a(ū_ℓ) = LM,a(p̄_{ℓ+1}) = 0̄.
Step 2 With LD,e(ū_{ℓ−1}) and updated LD,e(p̄_{ℓ+1}), we use
a modified IDSC with NO iterations to decode (ū_ℓ,
p̄_ℓ) and obtain LD,e(ū_ℓ) and LD,e(p̄_ℓ), where IDSC
is modified such that at the beginning LM,a(p̄_{ℓ+1})
(or equivalently LD,e(p̄_{ℓ+1})) is not zero.
Step 3 With updated LD,e(ū_ℓ), we use IDSC with NO
iterations to re-decode (ū_{ℓ+1}, p̄_{ℓ+1}) and update
LD,e(p̄_{ℓ+1}).
Step 4 Repeat Step 2 with updated LD,e(p̄_{ℓ+1}) obtained in
Step 3.
Step 5 After repeating Steps 3 and 4 for NIDAC − 1 times,
we can decode (ū_ℓ, p̄_ℓ) and obtain LD,e(ū_ℓ).
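The IDAC-I schedule above alternates between the adjacent codeword and the current one. A minimal sketch of that schedule follows, with the (modified) IDSC run abstracted as a hypothetical callable `decode_codeword`; its signature and return convention are assumptions made for illustration.

```python
def idac1(decode_codeword, lde_u_prev, block, n_idac=3):
    """Schedule of IDAC-I (Steps 1 to 5) for the codeword with index `block`.

    decode_codeword(index, la_u_prev, la_p_next) -> (lde_u, lde_p) is a
    hypothetical (modified) IDSC run on codeword `index`, where
      la_u_prev : a priori LLRs of u_{index-1}, sharing its first channel block
      la_p_next : a priori LLRs of p_{index+1}, sharing its second channel block
    and the result holds L_{D,e}(u_index) and L_{D,e}(p_index).
    """
    zeros = [0.0] * len(lde_u_prev)
    lde_u = zeros  # Step 1 starts from L_{D,e}(u_ell) = 0
    for _ in range(n_idac):
        # Steps 1 and 3: decode codeword ell+1 using the latest L_{D,e}(u_ell).
        _, lde_p_next = decode_codeword(block + 1, lde_u, zeros)
        # Steps 2 and 4: decode codeword ell using L_{D,e}(p_{ell+1}).
        lde_u, _ = decode_codeword(block, lde_u_prev, lde_p_next)
    return lde_u
```

With `n_idac = 3`, the current and the adjacent codeword are each decoded three times, which reflects the longer delay and higher complexity of IDAC-I noted above.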
In order to apply algorithm IDAC-II, the encoding
must be somewhat modified. We replace the p̄_ℓ in Fig. 2 by
the first N/2 code bits of the first rate-2/3 component code,
RSC1, of the turbo code and replace the ū_ℓ in Fig. 2 by the
other N/2 code bits of the turbo code. In this way, v̄_{2,ℓ} is a
codeword of a rate-2/3 convolutional code and can be decoded
by using the trellis of RSC1. Note that v̄_{2,ℓ} contains only the
first N/2 bits of RSC1. Given that LD,e(v̄_{1,ℓ}) has been
obtained, IDAC-II is described as follows.
Step 1 With LM,a(v̄_{1,ℓ+1}) = LM,a(v̄_{2,ℓ+1}) = 0̄, we
use the BLAST demapper to obtain the a
posteriori LLR values LM,p(v̄_{2,ℓ+1}) according
to (2). The MAP (maximum a posteriori) decoder of RSC1 uses LD,a(v̄_{2,ℓ+1}) (or equivalently
LM,e(v̄_{2,ℓ+1}) = LM,p(v̄_{2,ℓ+1}) − LM,a(v̄_{2,ℓ+1})) as input
to yield LD,p(v̄_{2,ℓ+1}) and LD,e(v̄_{2,ℓ+1}). The values
in LD,e(v̄_{2,ℓ+1}) (or equivalently LM,a(v̄_{2,ℓ+1})) are
fed into the BLAST demapper. After NO iterations
between the MAP decoder of RSC1 and the BLAST
demapper, we obtain the desired LD,e(v̄_{2,ℓ+1}).
Step 2 With LD,e(v̄_{1,ℓ}) and updated LD,e(v̄_{2,ℓ+1}), we use a
modified IDSC with NO iterations to decode (v̄_{1,ℓ+1},
v̄_{2,ℓ}) and obtain LD,e(v̄_{1,ℓ+1}) and LD,e(v̄_{2,ℓ}), where
IDSC is modified such that at the beginning
LM,a(v̄_{2,ℓ+1}) (or equivalently LD,e(v̄_{2,ℓ+1})) is not
zero.
Step 3 Repeat Step 1 with updated LD,e(v̄_{1,ℓ+1}) (or equivalently LM,a(v̄_{1,ℓ+1})).
Step 4 Repeat Step 2 with updated LD,e(v̄_{2,ℓ+1}).
Step 5 After repeating Steps 3 and 4 for NIDAC − 1 times,
we can decode (v̄_{1,ℓ+1}, v̄_{2,ℓ}) and obtain LD,e(v̄_{1,ℓ+1})
and LD,p(ū_ℓ).
In summary, LM,a(v̄_{2,ℓ+1}) is obtained by turbo decoding
of (v̄_{1,ℓ+2}, v̄_{2,ℓ+1}) = (ū_{ℓ+1}, p̄_{ℓ+1}) in IDAC-I, while in IDAC-II,
LM,a(v̄_{2,ℓ+1}) is obtained by MAP decoding of the convolutional code v̄_{2,ℓ+1}. The advantage of IDAC-II over IDAC-I is that in decoding (ū_ℓ, p̄_ℓ), there is no need to refer to
the channel output block containing v̄_{1,ℓ+2}. Throughout this
paper, we use NI = 20, NO = 3, and NIDAC = 3, unless the
parameters are otherwise specified.

TABLE I
THRESHOLDS AND ACHIEVABLE BER FOR THE PROPOSED TURBO CODED
BLAST SYSTEM AND THE ORIGINAL TURBO CODED BLAST SYSTEM IN
[5]. N = 10^5 IS USED IN THE BER SIMULATION. CAPACITIES FOR 2×1
AND 2×2 MIMO SYSTEMS ARE 3.25 dB AND 1.6 dB, RESPECTIVELY.

2×1 systems
System / algorithm             Threshold (dB)   Eb/N0 (dB)   BER
Proposed, restricted IDSC      3.90             4.21         3.13 × 10^−5
Proposed, IDSC                 3.90             4.12         8.37 × 10^−7
Proposed, restricted IDAC-I    3.67             3.88         7.90 × 10^−7
Proposed, IDAC-I               3.60             3.86         7.40 × 10^−7
Proposed, restricted IDAC-II   3.73             4.05         1.24 × 10^−6
Proposed, IDAC-II              3.64             4.01         1.61 × 10^−6
Original                       4.4              5.00         1.20 × 10^−6

2×2 systems
System / algorithm             Threshold (dB)   Eb/N0 (dB)   BER
Proposed, restricted IDSC      1.99             2.3          6.97 × 10^−5
Proposed, IDSC                 1.99             2.28         3.43 × 10^−5
Proposed, restricted IDAC-I    1.88             2.11         3.59 × 10^−6
Proposed, IDAC-I               1.86             2.105        4.61 × 10^−6
Proposed, restricted IDAC-II   1.98             2.26         1.53 × 10^−5
Proposed, IDAC-II              1.94             2.25         3.37 × 10^−5
Original                       2.1              2.42         5.23 × 10^−5
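The distance of each threshold from capacity follows by subtraction from the values reported in Table I. The sketch below does this arithmetic for a few entries; the dictionary layout and names are illustrative, and the assignment of the 2×2 thresholds to algorithms assumes that the threshold column follows the same row order as the algorithm labels.

```python
# Capacities and thresholds in dB, as read from Table I.
CAPACITY_DB = {"2x1": 3.25, "2x2": 1.6}
THRESHOLD_DB = {
    ("2x1", "IDSC"): 3.90,
    ("2x1", "IDAC-I"): 3.60,
    ("2x1", "IDAC-II"): 3.64,
    ("2x1", "original"): 4.4,
    # Row order of the 2x2 thresholds is assumed to match the row labels.
    ("2x2", "IDAC-I"): 1.86,
    ("2x2", "original"): 2.1,
}


def gap_to_capacity(config, scheme):
    """Threshold minus capacity in dB, rounded to two decimals."""
    return round(THRESHOLD_DB[(config, scheme)] - CAPACITY_DB[config], 2)
```

For the 2×1 case this gives a gap of 0.35 dB for IDAC-I against 1.15 dB for the original system.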
E. Thresholds for Various Detection-Decoding Algorithms
Table I summarizes the thresholds of the proposed system
using IDSC, IDAC-I, or IDAC-II and those of the original system. For
IDSC, we consider the information exchange between the demapper and the decoder under the conditions Ia (v̄1,ℓ ) = 1
and Ia (v̄2,ℓ+1 ) = 0. This case has been discussed in Section
III.C. For IDAC-I, we need to consider the information exchange
between (ūℓ , p̄ℓ ) and (ūℓ+1 , p̄ℓ+1 ). In fact, we only consider
the information exchange between bits in v̄1,ℓ+1 = ūℓ and
v̄2,ℓ+1 = p̄ℓ+1 , since v̄1,ℓ+1 and v̄2,ℓ+1 are correlated by the
BLAST mapper. The EXIT charts of (ūℓ , p̄ℓ ) using IDAC-I are calculated based on Ia (v̄1,ℓ ) = 1, Ia (v̄2,ℓ+2 ) = 0,
and N = 2097152. We run IDSC on (ūℓ , p̄ℓ ) using 20 iterations within the turbo decoder, i.e., NI = 20, to obtain
Ie1 (v̄1,ℓ+1 ) for various Ia1 (v̄2,ℓ+1 ). Similarly, we run IDSC on
(ūℓ+1 , p̄ℓ+1 ) with NI = 20 to obtain Ie2 (v̄2,ℓ+1 ) for various
Ia2 (v̄1,ℓ+1 ). Fig. 3 shows the EXIT charts for our system
using IDAC-I with 3 iterations between the turbo decoder
and the demapper, i.e., NO = 3. Following a similar method,
we can obtain the EXIT charts for our system using IDAC-II,
which are similar to those for IDAC-I. From Table
I, we see that our system achieves better thresholds than
the original system. For example, the threshold
of our system using IDAC-I is 3.60 dB while the threshold of
the original system is 4.4 dB. Note that the capacity of the
2 × 1 MIMO system is at Eb /N0 = 3.25 dB.
Fig. 3. EXIT curves of (ūℓ , p̄ℓ ) and (ūℓ+1 , p̄ℓ+1 ) for the proposed turbo
coded BLAST system using IDAC-I. Nt = 2, Nr = 1, Eb /N0 = 3.64 dB.
Since the EXIT curves of the MIMO detectors of our MIMO
systems are close-to-horizontal, we also investigate a reduced-complexity version of IDSC in which the soft output of the
turbo decoder is not used to update the LLR output of the
MIMO demapper. This reduced-complexity version of IDSC
is called restricted IDSC. In restricted IDSC, Steps 5 and 6
described in Section III.B are skipped and we use NO = 1
only. We can employ restricted IDSC in IDAC-I to obtain a
reduced-complexity version of IDAC-I called restricted IDAC-I. Similarly, we can obtain restricted IDAC-II. Also included
in Table I are the thresholds of our proposed system using
restricted IDSC, restricted IDAC-I, and restricted IDAC-II. We
see that our system using these restricted versions can achieve
satisfactory thresholds.
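As a quick arithmetic check on the numbers quoted in this section for Nt = 2, Nr = 1, the gap between each threshold and the capacity can be tabulated as below. The dictionary keys and the helper name `gap_to_capacity_db` are illustrative choices and not notation from the paper; only the three dB values are taken from the text.

```python
# Values quoted in the text for the 2 x 1 system (Eb/N0 in dB).
CAPACITY_DB = 3.25
THRESHOLDS_DB = {
    "proposed, IDAC-I": 3.60,
    "original system": 4.4,
}


def gap_to_capacity_db(threshold_db, capacity_db=CAPACITY_DB):
    """Distance of a detection threshold from capacity, in dB."""
    return round(threshold_db - capacity_db, 2)


for name, threshold in THRESHOLDS_DB.items():
    print(f"{name}: {gap_to_capacity_db(threshold):.2f} dB from capacity")
```

Under these figures the proposed system with IDAC-I sits about 0.35 dB from capacity, against about 1.15 dB for the original system.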
F. BER Results for Various Detection-Decoding Algorithms
BER results for the proposed system and original system
with long interleavers are shown in Fig. 4. Remember that
throughout this paper, we use NI = 20, NO = 3, and NIDAC
= 3, unless the parameters are otherwise specified. The results
of achievable BER are summarized in Table I. To match
the thresholds derived by EXIT charts, we can replace the
interleavers Π1 and Π2 of our system in Fig. 2 by a single
interleaver Π which permutes all the turbo coded bits including
ūℓ and p̄ℓ . However, from BER results, we find that our system
using Π1 and Π2 achieves a slightly better error performance
as compared to our system using Π. Hence, in the following,
we use the transmitter shown in Fig. 2 for our system.
Fig. 4. BER of the proposed turbo coded BLAST system (N = 10^5 ) and the
original turbo coded BLAST system in [5] (N = 3 × 10^5 ). Nt = 2, Nr = 1.
With the restricted version, our system suffers only
slightly worse error performance than with the unrestricted
version. In contrast, the error performance of the
original system in [5] with demapper updating is significantly
better than that without demapper updating, since
TM (Nt = 2, Nr = 1) is not flat and LM,e ((ūℓ , p̄ℓ )) can be improved by LM,a ((ūℓ , p̄ℓ )), which is obtained through
iterative demapping and decoding between the decoder and the
demapper. Hence, in the following, for the proposed system,
we consider only the restricted version in the detection and
decoding. For the original system in [5], however, we still
use exactly the detection and decoding procedure
provided in Section II.
BER results for the proposed system and the original system
with short interleavers are shown in Fig. 5. The decoding
delays of our system using IDSC, IDAC-I, and IDAC-II
are 2N , 3N , and 2N code bits, respectively. From Fig. 4
and Fig. 5, we see that IDAC-II can provide better error
performance at the cost of higher complexity as compared
to IDSC, while IDAC-I can provide better error performance
at the cost of longer decoding delay and higher complexity as
compared to IDAC-II. For a fair comparison based on the same
decoding delay, we may compare the proposed system using
IDAC-I to the original system with triple interleaver size. We
see that the proposed system is superior to the original system
at low SNR, which results from the superior behavior of
convergence capability of our system as indicated in the EXIT
charts. In particular, the superiority is very significant for the
cases of long interleavers. With the triple interleaver size, the
original system will have slightly better error performance at
the error floor region as compared to the proposed system.
Compared to the LDPC codes and RA codes optimized for
multiple-antenna systems given in [8] and [9], respectively,
our turbo coded BLAST system can have similar thresholds.
In addition, our system can achieve slightly better BER
performance. Taking the case of (Nt , Nr ) = (2, 1) as an example,
we find that our system using N = 10^5 and restricted IDAC-I
can achieve a BER of 7.90 × 10^−7 at Eb /N0 = 3.88 dB while
the LDPC and RA coded BLAST systems using N = 10^5
can achieve a BER of 10^−4 at Eb /N0 = 4.0 dB and 3.95 dB,
respectively.
Fig. 5. BER of the proposed turbo coded BLAST system using restricted
IDSC, IDAC-I, and IDAC-II, and the original turbo coded BLAST system in
[5]. Nt = 2, Nr = 1,
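The decoding-delay comparison made earlier in this subsection can be written out explicitly. The sketch below takes the stated delays of 2N, 3N, and 2N code bits for IDSC, IDAC-I, and IDAC-II, and assumes (this is implied by the "triple interleaver size" comparison rather than stated outright) that the original system has a delay of one codeword. The names `DELAY_IN_CODEWORDS`, `decoding_delay_bits`, and `matched_original_length` are hypothetical.

```python
# Decoding delay of the proposed system, in units of one codeword of N code bits.
DELAY_IN_CODEWORDS = {"IDSC": 2, "IDAC-I": 3, "IDAC-II": 2}


def decoding_delay_bits(algorithm, n):
    """Decoding delay in code bits for codeword length n."""
    return DELAY_IN_CODEWORDS[algorithm] * n


def matched_original_length(algorithm, n):
    """Codeword length of the original system giving the same delay,
    assuming the original system's delay is a single codeword."""
    return decoding_delay_bits(algorithm, n)


N = 10**5
for alg in DELAY_IN_CODEWORDS:
    print(alg, decoding_delay_bits(alg, N), matched_original_length(alg, N))
```

For N = 10^5 and IDAC-I this gives 3 × 10^5 code bits, which matches the interleaver size used for the original system in Fig. 4.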
G. Extensions to Other MIMO Configurations
The thresholds and achievable BER for (Nt , Nr ) = (2, 2)
are also given in Table I. As compared to the original system,
the proposed system provides only slightly better thresholds.
The reason can be explained by the associated EXIT
curves, which are not given here. Although TM (Nt = 2, Nr =
2) is not close-to-horizontal, its slope
is lower than that of TM (Nt = 2, Nr = 1). Hence, the room
for improvement in threshold by using delay elements for the
case of (Nt , Nr ) = (2, 2) is not as large as for the case of (Nt , Nr )
= (2, 1). Compared to the IRA codes optimized for multiple-antenna systems given in [7], our system can achieve similar
BER performance and threshold. From Table I, we find that
our system using restricted IDAC-I and N = 10^5 can achieve
a BER of 3.59 × 10^−6 at Eb /N0 = 2.11 dB while the IRA
coded MIMO in [7] with N = 2 × 10^5 can achieve similar
BER at Eb /N0 = 2.2 dB. In addition, the thresholds for our
system using restricted IDAC-I and IRA coded MIMO in [7]
are 1.88 dB and 1.9 dB, respectively. Note that the capacity
of the 2 × 2 MIMO system is at Eb /N0 = 1.6 dB.
The transmitter for our system with Nt = 4 is similar
to the case of Nt = 2 except for some differences. Now,
we have K = mNt = 8 and q = N/K = N/8. In addition,
the number of delay elements in bits within one-block-unit
delay equals N/4. Equivalently, there are q = N/8 MIMO
transmissions within one block unit. The turbo coded bits are
divided into four streams for the four transmit antennas. Bits in
both the first and the second streams are delayed by N/4
code bits before being fed to the transmit antennas. The
resultant system is called Type-I system. From the analysis
Fig. 7. Transmitter and receiver of a turbo coded Alamouti STBC system
in [29].
of EXIT charts for (Nt , Nr ) = (4, 1) with R = 1/2, we see
that the threshold of the Type-I system using restricted IDSC is at
Eb /N0 = 10.4 dB while the threshold of the original system
is at Eb /N0 = 11.7 dB. Note that the capacity of such a
MIMO system is at Eb /N0 = 6.65 dB. Although the threshold of the Type-I system is better than that of the original system for the
case of (Nt , Nr ) = (4, 1), it
is still somewhat distant from the capacity. The reason can
be explained by the associated EXIT curves, which
are not given here. In the case of (Nt , Nr ) = (4, 1), although
the detector EXIT curve of the Type-I system has been flattened
as compared to that of the conventional system, it
is not as close to horizontal as
in the case of (Nt , Nr ) = (2, 1). Hence, the EXIT curve of
the detector with (Nt , Nr ) = (4, 1) cannot match the close-to-horizontal EXIT curve of the turbo decoder well. We can
further flatten the detector EXIT curve with (Nt , Nr ) = (4, 1)
by introducing additional delay elements between the turbo
encoder and the BLAST mapper as follows. The turbo coded
bits associated with the j-th transmit antenna are delayed
by (4 − j)N/4 bits before being fed to the j-th transmit
antenna for j = 1, 2, 3, 4. The resultant system is called
Type-II system. The resultant detector EXIT curves under
the conditions of Ia (v̄2,ℓ+3 ) = Ia (v̄3,ℓ+2 ) = Ia (v̄4,ℓ+1 ) = x,
Ia (v̄3,ℓ+3 ) = Ia (v̄4,ℓ+3 ) = Ia (v̄4,ℓ+2 ) = 0, and Ia (v̄1,ℓ ) =
Ia (v̄2,ℓ ) = Ia (v̄3,ℓ ) = Ia (v̄1,ℓ+1 ) = Ia (v̄2,ℓ+1 ) = Ia (v̄1,ℓ+2 ) = 1
are all close-to-horizontal for the investigated cases of x = 0,
0.5, and 1, and hence can match the decoder
EXIT curve well. The threshold of the Type-II system with
(Nt , Nr ) = (4, 1) using restricted IDSC, i.e., x = 0, is 7.4 dB,
which is closer to the capacity than that of the Type-I system.
The BER results shown in Fig. 6 verify the prediction obtained
by the analysis of EXIT charts.
Fig. 6. BER of the proposed turbo coded BLAST systems using restricted
IDSC (N = 10^5 ) and the original turbo coded BLAST system in [5] (N =
3 × 10^5 ). Nt = 4, Nr = 1.
IV. TURBO CODED ALAMOUTI STBC SYSTEMS
Since the Alamouti STBC has the advantage of low detector
complexity, the Alamouti STBC in conjunction with
channel coding is of practical significance as well. We now investigate
turbo coded multiple-antenna systems using the Alamouti
STBC [10] as the inner code. Therefore, we have Nt = 2. We
increase the constellation size and the rate of the turbo code
to compensate for the rate loss due to the inner Alamouti STBC.
A. A turbo coded Alamouti STBC system
Let xi be a complex number representing a constellation
point. The output of the Alamouti STBC encoder at antennas
1 and 2 for the input (x0 , x1 ) are (x0 , −x∗1 ) and (x1 , x∗0 ),
respectively. The channel model is the same as that described
in Section II.B except that the fading coefficients are constant
over two consecutive MIMO transmissions and change independently every two MIMO transmissions. For Nr = 1,
the received signals at time instants 0 and 1, denoted by r0 and r1
respectively, are [r0 , r1 ] = [h0 , h1 ]G + [n0 , n1 ], where
h0 and h1 are the channel coefficients from transmit antennas 1
and 2 to the receive antenna, respectively,
\[ G = \begin{bmatrix} x_0 & -x_1^* \\ x_1 & x_0^* \end{bmatrix}, \]
and n0 and n1 are the AWGN at time instants 0 and 1,
respectively. By utilizing the orthogonality of G, it is easy
to verify that the equivalent channel model is given as
\[ \begin{aligned}
z_0 &= h_0^* r_0 + h_1 r_1^* = (|h_0|^2 + |h_1|^2)\, x_0 + h_0^* n_0 + h_1 n_1^* \\
z_1 &= h_1^* r_0 - h_0 r_1^* = (|h_0|^2 + |h_1|^2)\, x_1 - h_0 n_1^* + h_1^* n_0 .
\end{aligned} \qquad (4) \]
According to (4), the symbols transmitted from the two antennas can be separated at the receiver, and hence the Alamouti
STBC plays the key role of transforming a 2 × 1 channel into
a 1 × 1 channel.
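The combining rule in (4) can be checked numerically. The following is a minimal sketch using only Python's built-in complex type; the function names (`alamouti_encode`, `channel`, `combine`, `cgauss`), the fixed random seed, and the example symbols and noise level are assumptions made for illustration.

```python
import random


def alamouti_encode(x0, x1):
    """Matrix G: rows are transmit antennas 1 and 2, columns are time instants 0 and 1."""
    return [[x0, -x1.conjugate()],
            [x1, x0.conjugate()]]


def channel(h0, h1, G, n0, n1):
    """[r0, r1] = [h0, h1] G + [n0, n1] for a single receive antenna."""
    r0 = h0 * G[0][0] + h1 * G[1][0] + n0
    r1 = h0 * G[0][1] + h1 * G[1][1] + n1
    return r0, r1


def combine(h0, h1, r0, r1):
    """Linear combining on the left-hand side of (4)."""
    z0 = h0.conjugate() * r0 + h1 * r1.conjugate()
    z1 = h1.conjugate() * r0 - h0 * r1.conjugate()
    return z0, z1


def cgauss(rng, sigma=1.0):
    """Complex Gaussian sample with standard deviation sigma per dimension."""
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(0)
x0, x1 = complex(1, 0), complex(0, 1)          # two example symbols
h0, h1 = cgauss(rng), cgauss(rng)              # fading coefficients
n0, n1 = cgauss(rng, 0.1), cgauss(rng, 0.1)    # additive noise samples

r0, r1 = channel(h0, h1, alamouti_encode(x0, x1), n0, n1)
z0, z1 = combine(h0, h1, r0, r1)

# Right-hand side of (4): each symbol scaled by |h0|^2 + |h1|^2 plus a noise term.
gain = abs(h0) ** 2 + abs(h1) ** 2
expected_z0 = gain * x0 + h0.conjugate() * n0 + h1 * n1.conjugate()
expected_z1 = gain * x1 - h0 * n1.conjugate() + h1.conjugate() * n0
print(abs(z0 - expected_z0) < 1e-12, abs(z1 - expected_z1) < 1e-12)
```

The cross terms in x1 cancel in z0, and those in x0 cancel in z1, which is the orthogonality of G at work.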
The transmitter and receiver of the turbo coded Alamouti
STBC system in [29] are shown in Fig. 7, where a rate-2/3
turbo code, an 8PSK signal mapper, and the Alamouti STBC
are employed. Through such a system, we can transmit 2 bits
per MIMO transmission. The rate-2/3 turbo code is obtained
by uniformly puncturing the parity bits of a rate-1/3 turbo
code for which the generator matrix of the constituent codes
is (1, 5/7) in octal notation. At the transmitter, we first de-multiplex the ℓ-th
interleaved turbo codeword (ūℓ , p̄ℓ ) of N = 3w/2 bits
into three sub-blocks v̄1,ℓ , v̄2,ℓ , and v̄3,ℓ of equal size w/2
bits before they are fed into the 8PSK signal mapper. At the ℓ-th
block unit, bits in v̄1,ℓ , v̄2,ℓ , and v̄3,ℓ are used as the first-level,
second-level, and third-level labeling bits of the 8PSK signals,
respectively. We can apply the turbo principle to this system
by iteratively performing decoding and detection between the
turbo decoder and the MIMO detector which consists of the
Alamouti STBC decoder and a MAP signal demapper. Due
to the orthogonality of G, the output of the STBC decoder is
Fig. 8. EXIT curves of MIMO detector (Eb /N0 = 5.0 dB) and turbo decoder
for the proposed turbo coded Alamouti STBC system and an original turbo
coded Alamouti STBC system in [29]. (Nt , Nr ) = (2, 1). Horizontal axis: IM,a(ul,pl) [ID,e(ul,pl)]; vertical axis: IM,e(ul,pl) [ID,a(ul,pl)]. Curves: detector, proposed, mixed labeling, x = 0, 0.5, and 1; detector, original, Gray labeling; detector, original, mixed labeling; decoder.
not updated after sending z0 and z1 to the signal demapper.
In other words, there is no loop between the STBC decoder
and the turbo decoder.
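The block sizes and the rate of 2 bits per MIMO transmission quoted above follow from simple bookkeeping, sketched below. The sketch assumes that w is the number of message bits per codeword, that w is even, and that trellis-termination bits are ignored. The function name `stbc_block_sizes` and the dictionary keys are hypothetical.

```python
from fractions import Fraction


def stbc_block_sizes(w):
    """Bookkeeping for one codeword carrying w message bits (w assumed even)."""
    code_rate = Fraction(2, 3)      # punctured turbo code
    bits_per_symbol = 3             # 8PSK
    n_code_bits = w / code_rate                     # N = 3w/2
    sub_block_bits = n_code_bits / bits_per_symbol  # w/2 bits per sub-block
    n_symbols = sub_block_bits      # each symbol takes one bit from each sub-block
    n_transmissions = n_symbols     # Alamouti: 2 symbols over 2 MIMO transmissions
    parity_kept = n_code_bits - w   # parity bits left after puncturing
    parity_before = 2 * w           # rate-1/3 mother code: 2 parity bits per message bit
    return {
        "N": n_code_bits,
        "sub_block_bits": sub_block_bits,
        "info_bits_per_transmission": Fraction(w) / n_transmissions,
        "parity_fraction_kept": parity_kept / parity_before,
    }


sizes = stbc_block_sizes(1000)
print(sizes)
```

Under these assumptions, reaching rate 2/3 from the rate-1/3 mother code means keeping one parity bit in four.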
Fig. 8 shows the detector EXIT curves of the STBC system
in [29] which is also illustrated in Fig. 7. Both the Gray
labeling and mixed labeling in [16] are investigated. With
mixed labeling, the eight sequential labels for 8PSK signal
points are {000, 100, 010, 110, 011, 111, 001, 101}. Also
included in Fig. 8 is the decoder EXIT curve of the rate-2/3
turbo code. We see that the detector EXIT curve with Gray
labeling is almost flat and can match the close-to-horizontal
EXIT curve of the turbo decoder. However, the detector EXIT
curve with mixed labeling is not flat and cannot match well
with the decoder EXIT curve. Hence, this STBC system with
Gray labeling can achieve a lower threshold as compared to
the same STBC system using mixed labeling. On the other
hand, we see that using mixed labeling can achieve a lower
error floor as compared to using Gray labeling from the BER
results shown in Fig. 9, where NO = 3 and NO = 5 are
respectively used in the cases of Gray labeling and mixed
labeling for near-convergence performance. The reasoning is
similar to the argument for comparing the BICM using Gray
labelling and the BICM using the mixed labelling over the
Rayleigh fading channel [40] [41] [17]. In the following, we
propose a turbo coded Alamouti STBC system with mixed
labelling which can achieve not only a lower threshold but also
a lower error floor as compared to the original turbo coded
Alamouti STBC system with Gray labelling.
B. Proposed turbo coded Alamouti STBC system
The mixed labelling for 8PSK has the characteristic that it
can be partitioned into two Gray-mapped QPSK constellations
indexed by the first labelling bit. In other words, the four signal
points of the 8PSK constellation with the same first labelling
bit can be regarded as a Gray-mapped QPSK constellation.
Based on this characteristic, the implementation of the proposed system is the same as that of the original system in Fig. 7
using mixed labelling, except that we delay v̄1,ℓ by w/2 bits
before it is fed into the 8PSK signal mapper. At the ℓ-th block
unit, bits in v̄1,ℓ−1 , v̄2,ℓ , and v̄3,ℓ are used as the first-level,
second-level, and third-level labelling bits of the 8PSK signals,
respectively. For our system, the design of the interleaver and
demultiplexer may somewhat affect the error performance. In
this paper, we consider the design such that all the bits in v̄1,ℓ
and v̄2,ℓ are message bits and all the bits in v̄3,ℓ are parity bits
of the turbo code. With the introduction of the delay elements,
the minimum symbol-wise Hamming distance of our system
is likely to be greater than that of the original system using
mixed labelling. Note that the symbol-wise Hamming distance
and squared product distance determine the diversity order and
coding gain, respectively, for fixed values of Nt and Nr [41].
The symbol-wise Hamming distance plays a more important
role in the performance at the error-floor region than
the squared product distance. Hence, our system may achieve
a lower error floor than the original system using
either mixed or Gray labelling. Using an approach similar to
that for the proposed BLAST system, we can obtain the detector EXIT
curves of our turbo coded Alamouti STBC system based on
the conditions of Ia (v̄1,ℓ−1 ) = 1 and Ia (v̄2,ℓ+1 ) = Ia (v̄3,ℓ+1 )
= x, 0 ≤ x ≤ 1. The resultant detector EXIT curves are also
shown in Fig. 8. We find that the detector EXIT curves are
close to horizontal and hence can match the decoder EXIT
curve well. In addition, we find that the close-to-horizontal
detector EXIT curve for the proposed STBC system lies
above the close-to-horizontal detector EXIT curve for the
original STBC system in Fig. 7 using Gray labelling, even for
the worst case of Ia (v̄2,ℓ+1 ) = Ia (v̄3,ℓ+1 ) = 0. This implies
that there is room for performance improvement, which can be
obtained by improving the decoding of our STBC system
through the information passed from the adjacent turbo
codeword. This kind of performance improvement cannot be
obtained by simply extending the code length of the original
STBC system. Hence, the proposed STBC system may achieve
a better threshold than the original STBC system
if the iterative decoding for the proposed system is properly
designed.
Fig. 9. BER of the proposed turbo coded Alamouti STBC system using
restricted IDSC and IDAC-I (N = 10^5 ) and an original turbo coded Alamouti
STBC system in [29] (N = 3 × 10^5 ). Nt = 2, Nr = 1.
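The partition property of the mixed labelling stated at the start of this subsection can be checked directly from the label sequence {000, 100, 010, 110, 011, 111, 001, 101}. The sketch below assumes that the eight labels are listed in angular order around the 8PSK circle and that the first labelling bit is the leftmost one. The helper names `hamming`, `is_cyclic_gray`, and `qpsk_subsets` are hypothetical.

```python
# Mixed labelling of 8PSK, assumed to be listed in angular order.
MIXED_8PSK = ["000", "100", "010", "110", "011", "111", "001", "101"]


def hamming(a, b):
    """Number of positions in which two equal-length labels differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def is_cyclic_gray(labels):
    """True if every pair of neighbours on the circle differs in exactly one bit."""
    n = len(labels)
    return all(hamming(labels[i], labels[(i + 1) % n]) == 1 for i in range(n))


def qpsk_subsets(labels):
    """Split the points by first labelling bit, keeping angular order
    and only the remaining two bits of each label."""
    subsets = {}
    for label in labels:
        subsets.setdefault(label[0], []).append(label[1:])
    return subsets


subsets = qpsk_subsets(MIXED_8PSK)
print(subsets)
print({bit: is_cyclic_gray(sub) for bit, sub in subsets.items()})
print(is_cyclic_gray(MIXED_8PSK))
```

Both four-point subsets are Gray-mapped in their remaining two bits, while the full 8PSK labelling is not Gray: for example, the neighbouring labels 100 and 010 differ in two bits.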
Like the proposed BLAST system, the proposed turbo coded Alamouti
STBC system can be decoded by IDSC, IDAC-I, or IDAC-II owing to the introduction of the delay elements. For
simplicity of presentation, only the results of using restricted
IDSC and restricted IDAC-I are presented here. Based on the
assumption of Ia (v̄1,ℓ−1 ) = 1, we can obtain the associated
thresholds of our proposed STBC system in a way similar
to that used for the proposed BLAST system. The thresholds of our STBC
system using restricted IDSC and restricted IDAC-I are 4.59
dB and 4.45 dB, respectively, which are lower than the
thresholds of the original STBC system in Fig. 7 (4.76 dB and
4.80 dB for Gray labeling and mixed labeling, respectively).
We see that, using restricted IDAC-I, our STBC system can
achieve BER = 1.67 × 10^−8 at Eb /N0 = 4.66 dB, which is
lower than the thresholds of the original STBC system using
Gray labeling and mixed labeling. This verifies the claim made in the
preceding paragraph that such a performance improvement cannot be
obtained by simply extending the code length of the original STBC
system. From the BER results
shown in Fig. 9, we see that the advantage of introducing
block delay is obvious.
V. CONCLUSIONS
We introduce a convolutional interleaver with block delay
into the turbo coded BLAST system and the turbo coded
Alamouti STBC system. The EXIT analysis shows that the
block delay helps flatten the detector EXIT curves so that
they match the EXIT curve of the turbo decoder
well. We devise several decoding algorithms with different complexities and error performances. Using turbo coded
BLAST with a proper delay design, we can obtain thresholds
very close to the capacity for Nt = 2, Nr = 1 and Nt = 2,
Nr = 2, respectively. For Nt = 4, Nr = 1, we see that using
turbo coded BLAST with various delay designs, we can lower
the thresholds by various degrees as compared to not using
the delay design. Using turbo coded Alamouti STBC with a
proper delay design, we can obtain thresholds and error floors
better than those of the original turbo coded Alamouti STBC
without the delay design for Nt = 2, Nr = 1.
ACKNOWLEDGMENT
The authors are very grateful to the reviewers, whose valuable comments and suggestions significantly
enhanced the quality of this paper.
REFERENCES
[1] G. J. Foschini, “Layered space-time architecture for wireless communications in a fading environment when using multi-element antennas,” Bell
Labs Tech. Journal, pp.41-59, Autumn 1996.
[2] M. Sellathurai and S. Haykin, “TURBO-BLAST for high speed wireless
communications,” in Proc. IEEE Wireless Comm. and Network Conf.,
2000, WCNC 2000, Sept. 2000, Chicago.
[3] A. van Zelst, R. van Nee, and G.A. Awater, “Turbo-BLAST and its
performance,” Proc. Vehicular Tech. Conf., vol 2, May 2001.
[4] A. Stefanov and T. M. Duman, “Turbo coded modulation for systems
with transmit and receive antenna diversity over block fading channels:
system models, decoding approaches, and practical considerations,” IEEE
J. Select. Areas Commun., vol. 19, pp. 958-968, May 2001.
[5] S. ten Brink and B. M. Hochwald, “Detection thresholds of iterative
MIMO processing,” in Proc. IEEE Int. Symp. Inf. Theory, Lausanne,
Switzerland, June 2002, p.22.
[6] B. M. Hochwald and S. ten Brink, “Achieving near-capacity on a
multiple-antenna channel,” IEEE Trans. Commun., vol. 51, No. 3,
pp. 389–399, March 2003.
[7] G. Yue and X. Wang, “Optimization of irregular repeat-accumulate
codes for MIMO systems with iterative receivers,” IEEE Trans. Wireless
Commun., vol. 4, No. 6, pp. 2843–2855, Nov. 2005.
[8] S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density
parity-check codes for modulation and detection,” IEEE Trans. Commun.,
vol. 52, No. 4, pp. 670–678, April 2004.
[9] S. ten Brink and G. Kramer, “Design of repeat-accumulate codes for
iterative detection and decoding,” IEEE Trans. Signal Processing, vol.
51, No. 11, pp. 2764–2772, Nov. 2003.
[10] S. M. Alamouti, “A simple transmitter diversity scheme for wireless
communications,” IEEE J. Select. Areas Commun., vol. 16, pp. 1451–1458, Oct. 1998.
[11] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block
codes from orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45,
pp. 1456-1467, July 1999.
[12] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for
high data rate wireless communication: performance criterion and code
construction,” IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, March
1998.
[13] G. Ungerboeck, “Channel coding with multilevel/phase signals,” IEEE
Trans. Inform. Theory, vol. 28, pp. 55–66, Jan. 1982.
[14] E. Zehavi, “8-PSK trellis codes for a Rayleigh channel,” IEEE Trans.
Commun., vol. 40, pp. 873-884, May 1992.
[15] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Trans. Inform. Theory, vol. 44, pp. 927–946, May 1998.
[16] X. Li and A. Ritcey, “Bit-interleaved coded modulation with iterative
decoding,” IEEE Commun. Lett., vol. 1, no. 6, pp. 77–79, Nov. 1997.
[17] X. Li and A. Ritcey, “Turbo trellis coded modulation with bit interleaving and iterative decoding,” IEEE J. Select. Areas Commun., vol. 17,
no. 4, pp. 715–724, April 1999.
[18] Y. L. Ueng, C. J. Yeh, and M. C. Lin, “On trellis codes with delay
processor and signal mapper,” IEEE Trans. Commun., vol. 50, pp. 1906–1917, Dec. 2002.
[19] S. Le Goff, A. Glavieux, and C. Berrou, “Turbo-codes and high spectral
efficient modulation,” in Proc. ICC’94, pp. 1064-1070, 1994.
[20] P. Robertson and T. Wörz, “Bandwidth-efficient turbo trellis coded
modulation using punctured component codes,” IEEE J. Select. Areas
Commun., vol. 16, pp. 206–218, Feb. 1998.
[21] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Parallel concatenated trellis coded modulation,” in IEEE Conf. on Commun., pp. 974–978,
1996.
[22] C. Fragouli and R. D. Wesel, “Turbo-encoder design for symbol-interleaved parallel concatenated trellis coded modulation,” IEEE Trans.
Commun., vol. 49, no. 3, pp. 425–435, March 2001.
[23] Li Ping, B. Bai, and X. Wang, “Low-complexity concatenated two-state
TCM schemes with near-capacity performance,” IEEE Trans. Inform.
Theory, vol. 49, no. 12, pp. 3225–3234, Dec. 2003.
[24] D. Sridhara and T. E. Fuja, “LDPC codes over rings for PSK modulation,” IEEE Trans. Inform. Theory, vol. 51, no. 9, pp. 3209–3220, Sept.
2005.
[25] S.M. Alamouti, V. Tarokh, and P. Poon, “Trellis coded modulation and
transmit diversity: design criteria and performance evaluation,” in Proc.
IEEE ICUPC’98, pp. 703-707, Oct. 1998.
[26] Y. Gong and K. B. Letaief, “Concatenated space-time block coding
with trellis coded modulation in fading channels,” IEEE Trans. Wireless
Commun., vol. 1, No. 4, pp. 580-590, Dec. 2002.
[27] Z. Hong and B. Hughes, “Bit-interleaved space-time coded modulation
with iterative decoding,” IEEE Trans. Wireless Commun., vol. 3, No. 6,
pp. 1912-1917, Nov. 2004.
[28] Y. L. Ueng, Y. L. Wu, and R. Y. Wei, “Concatenated space-time block
coding with trellis coded modulation using a delay processor,” IEEE
Trans. Wireless Commun., vol. 6, no. 12, pp. 4452-4463, Dec. 2007.
[29] G. Bauch, “Concatenation of space-time block codes and ”turbo”-TCM,”
in Proc. IEEE ICC’99, pp. 1202-1206, 1-10 June, 1999.
[30] T.H. Liew, J. Pliquett, B.L. Yeap, L-L. Yang, and L. Hanzo, ”Comparative study of space time block codes and various concatenated turbo
coding schemes,” in Proc. PIMRC, vol. 1, pp. 741-745, 2000.
[31] J. Hou, P. H. Siegel, and L. B. Milstein, “Design of multi-input multi-output systems based on low-density parity-check codes,” IEEE Trans.
Commun., vol. 53, No. 4, pp. 601–611, April 2005.
[32] S. ten Brink, “Convergence behavior of iterative decoded parallel
concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.
[33] A. Wittneben, “Base station modulation diversity for digital SIMULCAST,” in Proc. Vehicular Technology Conf., vol. 1, pp. 848-853, May
1991.
[34] J. H. Winters, “The diversity gain of transmit diversity in wireless
systems with Rayleigh fading,” IEEE Trans. Veh. Technol., vol. 47, pp.
119-123, Feb. 1998.
[35] A. Ashikhmin, G. Kramer, and S. ten Brink, “Extrinsic information
transfer functions: model and erasure channel properties,” IEEE Trans.
Inform. Theory, vol. 50, No. 11, pp. 2657–2673, Nov. 2004.
[36] S. ten Brink, “Space-time turbo coding,” a Chapter in Space-Time
Wireless Systems, H. Bölcskei, D. Gesbert, C. B. Papadias, and A.-J.
van der Veen, pp. 322–341, Cambridge Univ. Press Nov. 2006.
[37] J. L. Ramsey, “Realization of optimum interleavers,” IEEE Trans.
Inform. Theory, vol. IT16, pp. 338-345, 1970.
[38] T.J. Richardson, M.A. Shokrollahi, and R.L. Urbanke, “Design of
capacity-approaching irregular low-density parity-check codes,” IEEE
Trans. Inform. Theory, vol. 47, no. 2, pp. 619–637, Feb. 2001.
[39] H. Jin, A. Khandekar, and R. J. McEliece, “Irregular repeat-accumulate
codes,” in Proc. Int. Symp. Turbo Codes and Related Topics, Brest,
France, Sept. 2000, pp. 1–8.
[40] D. Divsalar and M.K. Simon, “The design of trellis coded MPSK for
fading channels: performance criteria,” IEEE Trans. Commun., vol. 36,
pp. 1004-1012, Sept. 1988.
[41] X.N. Zeng and A. Ghrayeb, “Performance bounds for combined channel
coding and space-time block coding with receive antenna selection,” IEEE
Trans. Veh. Tech., vol. 55, no. 4, pp. 1441–1446, 2006.
[42] J. Suh and M.M.K. Howlader, ”Design schemes of space-time block
codes concatenated with turbo codes,” in Proc. IEEE 55th Vehicular Tech.
Conf., pp. 1030-1034, vol. 2, Spring. 2002.
[43] A. Sezgin, D. Wubben, R. Böhnke and V. Kühn, ”On EXIT-charts for
space-time block codes,” in Proc. IEEE Inter. Symp. on Inf. Theory 2003,
Yokohama, Japan, June 29-July 4, 2003.
[44] T.A. Narayanan and B.S. Rajan, ”A general construction of space-time
trellis codes for PSK signal sets,” in Proc. IEEE Global Telecommun.
Conf., San Francisco, USA, pp. 1978-1983, vol. 4, Dec. 2003.
[45] M. Tao and R. S. Cheng, ”Diagonal block space-time code design for
diversity and coding advantage over flat fading channels,” in IEEE Trans.
Signal Processing, vol. 52, no. 4, pp.1012-1020, April 2004.
Yeong-Luh Ueng received the B.S. degree in electrical engineering from National Taiwan University,
Taipei, Taiwan, R.O.C., in 1997. At the same university, he received the M.S. and Ph.D. degrees
in communication engineering from the Graduate
Institute of Communication Engineering in 1999 and
2001, respectively.
From 2001 to 2005, he was with a private company in Taiwan and focused on the design and
development of various wireless chips including RF
transceiver chips and Bluetooth and PHS baseband
chips. In December 2005, he joined the faculty of National Tsing-Hua
University, Hsinchu, Taiwan, where he is currently an Assistant Professor in
the Department of Electrical Engineering and the Institute of Communications
Engineering. His research interests include coding theory, wireless communications, and communication IC.
He was elected an honorary member of the Phi Tau Phi Scholastic Honor
Society. He is a member of IEEE.
Chia-Jung Yeh received the B.S. degree in electrical engineering from National Taiwan University,
Taipei, Taiwan, R.O.C., in 1990. At the same university, he received the M.S. and Ph.D. degrees
in communication engineering from the Graduate
Institute of Communication Engineering in 2000 and
2009, respectively.
He is the chief of the Information Management
Office in National Education Radio, Taipei, Taiwan,
Republic of China. His research interests include
coding and communications theory.
Mao-Chao Lin was born in Taipei, Taiwan, Republic of China, on December 24, 1954.
He received the Bachelor and Master degree,
both in electrical engineering, from National Taiwan
University in 1977 and 1979, respectively. He also
received the Ph.D. degree in electrical engineering
from University of Hawaii in 1986.
From 1979 to 1982, he was an assistant scientist
of Chung-Shan Institute of Science and Technology
at Lung-Tan, Taiwan. He is currently a Professor at
the Department of Electrical Engineering, National
Taiwan University. His research interests are in the area of coding theory and
its applications.
Chung-Li Wang received his B.S. in the Department of Electrical Engineering from National Tsing Hua University, Hsinchu, Taiwan, in 2004, and
his M.S. in the Graduate Institute of Communication Engineering from National Taiwan University,
Taipei, Taiwan in 2006. From 2006 to 2008, he
was a Ph.D. student in the Department of Electrical
Engineering, University of Hawaii, Manoa, Honolulu, and is now in the Department of Electrical
and Computer Engineering, University of California,
Davis, U.S.A.
His research includes modern coding theory and wireless communications.