Proceedings of the 2007 IEEE International Conference on Telecommunications and Malaysia International Conference on Communications, 14-17 May 2007, Penang, Malaysia

Configurable Adaptive Viterbi Decoder for GPRS, EDGE and Wimax

Mohamed Farid Noor Batcha
Mimos Berhad, Malaysia
Email:[email protected]

Ahmad Zuri Sha’ameri
Digital Signal Processing Lab, University Technology Malaysia, Malaysia

1-4244-1094-0/07/$25.00 ©2007 IEEE.
Authorized licensed use limited to: UNIVERSITY TEKNOLOGI MALAYSIA. Downloaded on January 5, 2009 at 20:42 from IEEE Xplore. Restrictions apply.

Abstract – Error correction codes are widely used in wireless communication systems to reduce data corruption. The most widely used decoder is the Viterbi decoder, which is used with different parameters to meet the requirements of different standards. This paper analyses the different Viterbi decoders and implements a reconfigurable adaptive Viterbi decoder for GPRS, EDGE and Wimax technologies. The high performance generic soft input hard output Viterbi decoder is prototyped on an FPGA.

1.0 INTRODUCTION

Channel coding is required in digital communications over noisy channels to improve bit-error rate performance and throughput. Under multipath fading conditions, the coding scheme must be strong enough to cope with random as well as bursty errors. Various coding schemes are used in the wireless packet data networks of GPRS, EDGE and Wimax to maximize channel capacity.

GPRS uses a constraint length 5, rate 1/2 Viterbi decoder, while EDGE uses constraint length 7, rate 1/3 with both tail biting and zero tail Viterbi decoding [1][2]. The tail biting Viterbi decoder is used on the header portion, while the zero tail Viterbi decoder is used on the data portion. Wimax 802.16e currently specifies a Viterbi decoder with constraint length 7 and rate 1/2, with tail biting as mandatory and zero tail as optional [3]. The complexity of the Viterbi decoder grows exponentially with the constraint length. Because EDGE still requires backward compatibility with GPRS, an EDGE system needs both the GPRS and the EDGE Viterbi decoders, and two separate engines are expensive in terms of area. Several generic Viterbi decoders have been suggested, but they require the configuration to be changed to generate a decoder of the required constraint length and are unable to adapt to the change on the fly [4].

In wireless communication, one of the main challenges is to maximize the channel capacity. This is bounded by regulations which limit the channel capacity in terms of signal power and bandwidth. Within such limitations, channel coding allows signals to be transmitted with lower power and still be received error free. Bandwidth is wasted if, on a good channel, a system runs a rate 1/3 encoder while a rate 1/2 code would be sufficient. This paper evaluates the joint architecture of a Viterbi decoder with constraint lengths 5 and 7 and rates 1/2 and 1/3. Performance was also monitored with several channel models to show the advantages of the different Viterbi decoder configurations under different channel conditions. An efficient implementation of the joint Viterbi architecture was then prototyped on an FPGA and system verification was done.

2.0 CHANNEL MODEL

Three main categories of channel models are characterized:
i) AWGN only
ii) Fast Fading
iii) Slow Fading

The block diagram for the channel model is shown in Figure 1. The channel comprises two main components: AWGN, $G(t)$, and fading, $f(t)$. The transmitted symbols $r(t)$ are multiplied by the fading samples $f(t)$ and added to the Gaussian noise $G(t)$ to form the received symbols $R(t)$, as described in (1).

$$R(t) = f(t)\, r(t) + G(t) \tag{1}$$

The fading models assumed are flat fading with a Rayleigh distribution. A non-frequency-selective model is chosen since the work is strictly on channel coding and, in practice, an equalizer is included as part of the overall digital communication system. Fast fading is assumed such that the coherence time does not exceed a symbol period; thus, the fading is independent between symbols. For slow fading, a deep fade is assumed to be limited to roughly twenty symbols, which avoids the need for enhanced methods of interleaving. Simulations based on slow fading use the standard block interleaver. The system is also assumed to be perfectly synchronized.

3.0 CHANNEL CODING METHODS

The channel coding methods discussed are the convolutional code and its decoding algorithm, the Viterbi decoder. The block code used for burst error correction is the Fire code.
i) Convolutional Encoding
ii) Viterbi Decoder

3.1 Convolutional Encoder

A convolutional encoder introduces memory into the information bits. In convolutional codes, each block of $k$ input bits is mapped onto a block of $n$ output bits. This gives the code rate $R$ of a convolutional code as

$$R = \frac{k}{n} \tag{2}$$

The $n$ output bits are determined not only by the present $k$ information bits, but also by the previous bits, which pass through a memory structure determined by the specified generator polynomials. Convolutional codes are differentiated by their constraint length $L$ and code rate $R$. The constraint length determines the complexity of the code, and this complexity is realized in the decoder structure. Codes with constraint lengths exceeding $L = 9$ are too complex and are not realized using the Viterbi decoder. The constraint length relates to the number of states that exist in the decoder. The number of states is determined by

$$\text{number of states} = 2^{L-1} \tag{3}$$

The coding rate given in equation (2) can be changed with puncturing, where selected bits are not transmitted and, on the receiver side, the locations of the punctured bits are filled with an unbiased value [5].

3.2 Viterbi Decoder

The Viterbi decoder is the common method used to decode convolutional codes. It uses the maximum likelihood estimation concept to predict the most likely transmitted sequence:

$$P(Z \mid U^{(m')}) = \max_{\text{over all } U^{(m)}} P(Z \mid U^{(m)}) \tag{4}$$

where $Z$ is the received sequence and $U^{(m)}$ is one of the possible transmitted sequences. The decoder chooses the sequence $U^{(m')}$ that maximizes the likelihood, that is, the candidate closest to the received sequence. The algorithm builds a trellis diagram of the most probable paths, and after some depth the paths are traced back to obtain the most likely transmitted sequence.

The Viterbi decoder is capable of accepting soft bits or hard bits. Soft decision gives the decoder more than two levels of decision. Hard decision decoding provides the decoder with only two levels {0,1} and performs worse by around 2 dB compared with soft decision. The main computational blocks in the Viterbi decoder are the Branch Metric computation and the Add Compare Select operation. The branch metric calculation is based on either soft bits or hard bits.

The trace back depth depends mainly on the memory management of the algorithm. The longer the trace back depth, the larger the trellis grows and the larger the memory requirements. If the trace back depth is made too short, the performance of the code is affected drastically. An optimal trace back depth of $5L$ (five times the constraint length) is used for unpunctured codes, as described in [5].

4.0 IMPLEMENTATION OF JOINT VITERBI ARCHITECTURE

The Viterbi decoder consists of five main blocks, as depicted in Figure 2.

Figure 2: Viterbi decoder blocks

The ACS (Add Compare Select) unit is the most critical part of the design. This is the module that determines the throughput and latency, the area and the power efficiency of the design. The ACS unit requires $2^{K-1}$ ACS nodes, where $K$ is the constraint length of the code. The BMC (branch metric computation) unit generates the branch metrics that the ACS unit uses to select the most probable path.
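To make the roles of these blocks concrete, the following is a minimal software sketch of the decoding chain (branch metric computation, add-compare-select, survivor memory, traceback and LIFO). It is not the FPGA implementation of this paper. It assumes a zero-tail, rate 1/2 code with K = 5 and generator polynomials 23 and 33 (octal), chosen here only for illustration, and 4-bit soft values in which 0 is a confident '0' and 15 a confident '1'. The names `encode`, `viterbi_decode` and `branch_outputs` are hypothetical.

```python
K = 5                  # constraint length (assumed for this sketch)
POLYS = (0o23, 0o33)   # illustrative rate 1/2 generators, not taken from the paper
SOFT_MAX = 15          # largest 4-bit soft value


def branch_outputs(reg):
    """Encoder output bits for the K-bit shift register value `reg`."""
    return [bin(reg & g).count("1") & 1 for g in POLYS]


def encode(bits):
    """Zero-tail encoder: K-1 zeros are appended to return to state 0."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state   # newest bit enters at the MSB
        out.append(branch_outputs(reg))
        state = reg >> 1               # oldest bit drops out
    return out


def viterbi_decode(soft_symbols):
    """Soft input, hard output decoding of a zero-tail block."""
    n_states = 1 << (K - 1)
    mask = n_states - 1
    inf = float("inf")
    path_metric = [0] + [inf] * (n_states - 1)   # encoder starts in state 0
    survivor_memory = []
    for sym in soft_symbols:
        new_metric = [inf] * n_states
        surv_bits = [0] * n_states
        for state in range(n_states):
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                nxt = reg >> 1
                # Branch metric: Manhattan distance to the expected bits.
                bm = sum(x if e == 0 else SOFT_MAX - x
                         for x, e in zip(sym, branch_outputs(reg)))
                # Add, compare, select: keep the smaller path metric.
                cand = path_metric[state] + bm
                if cand < new_metric[nxt]:
                    new_metric[nxt] = cand
                    surv_bits[nxt] = state & 1   # one survivor bit per state
        survivor_memory.append(surv_bits)
        path_metric = new_metric
    # Traceback from state 0 (zero tail); bits come out in reverse order.
    state, lifo = 0, []
    for surv_bits in reversed(survivor_memory):
        lifo.append(state >> (K - 2))            # input bit is the state MSB
        state = ((state << 1) | surv_bits[state]) & mask
    decoded = lifo[::-1]                         # LIFO restores the order
    return decoded[:len(decoded) - (K - 1)]      # strip the tail bits
```

Feeding the encoder output back as ideal soft values (0 or 15) should return the original bits, and with this code a single inverted soft value is still corrected.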
The BMC accepts either rate 1/2 or rate 1/3 encoded data, with each received symbol represented as a fixed soft value of 4 bits. Using 4 soft bits gives a performance close to the ideal scenario. The branch metric was computed using the Manhattan distance in order to reduce the complexity of the design. Given $X_0$ and $X_1$ as the received 4-bit symbols for code rate 1/2, the branch metrics were computed as:

$$\begin{aligned}
BM_{00} &= X_0 + X_1 \\
BM_{01} &= X_0 + (15 - X_1) \\
BM_{10} &= (15 - X_0) + X_1 \\
BM_{11} &= (15 - X_0) + (15 - X_1)
\end{aligned} \tag{5}$$

The pipeline of the branch metric was implemented using a straightforward mechanism that builds all possible branches from 0 to 7, as depicted in Table 1.

Time 1    Time 2    Time 3
0         00        000
0         00        001
0         01        010
0         01        011
1         10        100
1         10        101
1         11        110
1         11        111
Table 1: Branch Metric Pipelining

At time instance 2, the branch metric for code rate 1/2 is ready to be passed to the ACS unit, while at time instance 3 the branch metric is ready for code rate 1/3. The hardware structure of the branch metric unit is shown in Figure 3.

Figure 3: Branch Metric Unit

The basic structure of an ACS node is depicted in Figure 4.

Figure 4: An ACS unit

The ACS unit needs $2^{K-1}$ ACS nodes for a code of constraint length $K$. Therefore, for a constraint length 7 Viterbi decoder, the ACS unit consists of 64 ACS nodes. Several papers have suggested optimized implementations of the ACS unit with RAM modules to handle the feedback of the updated path metrics. Such implementations add delay to the decoding process and may affect the system timing. Due to the feedback property of the path metric, the required register size grows as the trellis is built over time during decoding. To avoid overflow, various normalization schemes have been suggested [10]. For this paper, since the application requires a small packet size, normalization was not required with a path metric width of 14 bits. Normalization was compared against increasing the path metric width and, given the small packet length, simply increasing the bit width was more beneficial, since normalization requires comparator logic. If the packet size were to increase, normalization would at a certain point become more efficient in area.

The surviving bit, which is the output of the ACS unit, is collected in the survivor memory for all $N$ states, where $N = 2^{K-1}$. Full traceback length was implemented to achieve high performance, and the maximum memory size required was 64x612 bits. The assumption was that the memory would be shared with other baseband processing modules, such as the equalizer, so that the Viterbi decoder would not contribute to the overall memory growth in the design.

The traceback unit starts once the memory module has finished processing the write operations and is ready to be read. Implemented as an FSM, the traceback unit traces the most probable path and outputs the decoded data in reverse order. The final unit, the LIFO, flips the data into its correct order.

Finally, the switch between the different Viterbi decoder configurations is set by a 2-bit control input, Config_K_r. The control input is configured as listed in Table 2.

Config_K_r[1:0]    Constraint Length    Rate
00                 5                    1/2
01                 5                    1/3
10                 7                    1/2
11                 7                    1/3
Table 2: Configuration of Viterbi Decoder

5.0 RESULTS

The performance of the reconfigurable Viterbi decoder was first evaluated using Matlab under the three channel conditions discussed above. The plots for the configurable constraint lengths and rates are shown in Figures 4, 5 and 6 under the respective channel conditions.

Figure 5: Fast fading simulation results
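The three channel conditions of Section 2.0 can be reproduced in a few lines. The sketch below is written in Python rather than the Matlab used for the simulations in this paper, and it rests on several assumptions that the paper does not state: real-valued symbols, a Rayleigh amplitude with unit mean-square value, and slow fading modelled as a fade held constant over a block of 20 symbols. The names `channel`, `rayleigh_sample`, `noise_sigma` and `block_len` are hypothetical.

```python
import math
import random


def rayleigh_sample(rng):
    """Rayleigh-distributed amplitude with unit mean-square value."""
    x = rng.gauss(0.0, math.sqrt(0.5))
    y = rng.gauss(0.0, math.sqrt(0.5))
    return math.hypot(x, y)


def channel(symbols, noise_sigma, mode="awgn", block_len=20, seed=0):
    """Apply R(t) = f(t) * r(t) + G(t) to a list of real-valued symbols.

    mode "awgn": f(t) = 1 for every symbol.
    mode "fast": an independent Rayleigh fade for every symbol.
    mode "slow": one Rayleigh fade held for block_len symbols (assumed model).
    """
    if mode not in ("awgn", "fast", "slow"):
        raise ValueError("mode must be 'awgn', 'fast' or 'slow'")
    rng = random.Random(seed)
    received = []
    fade = 1.0
    for i, r in enumerate(symbols):
        if mode == "fast":
            fade = rayleigh_sample(rng)
        elif mode == "slow" and i % block_len == 0:
            fade = rayleigh_sample(rng)
        noise = rng.gauss(0.0, noise_sigma)   # G(t)
        received.append(fade * r + noise)
    return received
```

With `noise_sigma` set to 0 the AWGN mode returns the transmitted symbols unchanged, which gives a quick check before sweeping the noise level to produce bit-error-rate curves.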
It is observed that under the slow fading channel the constraint length 7, rate 1/3 configuration gives a gain of 3 dB over the constraint length 5, rate 1/2 configuration, while in Gaussian noise alone the gain was around 1 dB. In the fast fading model the gain was around 2 dB.

Figure 4: AWGN simulation results

Figure 6: Slow fading simulation results

The hardware simulation was set up by first implementing the reconfigurable Viterbi decoder on an Altera APEXII 20K200E FPGA development board. Two boards were used, one acting as the encoder and the other as the decoder. Using the UART protocol, two PCs running HyperTerminal were connected as depicted in Figure 5.

Figure 5: Hardware setup

Due to the slow communication speed of the UART, buffers were implemented to allow the handshake between the different clock domains. The UART was set to run at 115200 bps while the FPGA board was running on a 33 MHz clock. The overall gain from implementing the shared hardware architecture for the different Viterbi decoders is given in terms of Logic Elements (LE) in Table 3.

Viterbi architecture      Logic Elements
K = 5, rate 1/2           1106
K = 7, rate 1/2           3962
K = 7, rate 1/3           4187
Reconfigurable Viterbi    4676
Table 3: Gain in terms of LE of shared architecture

When compared with separate K = 7, rate 1/3 and K = 5, rate 1/2 decoders, the gain in LE of the reconfigurable Viterbi decoder is around 7%. The reconfigurable architecture was further synthesized with Synopsys Design Analyzer and achieved a speed of 150 MHz, owing to the parallel structure of the ACS units.

6.0 CONCLUSIONS

Many suggested architectures of the Viterbi decoder trade speed for area, but for upcoming high data rate technologies such as Wimax, speed is the more critical issue. With an achievable speed of 150 MHz, the reconfigurable Viterbi decoder is able to satisfy the requirements of Wimax. For the GPRS and EDGE technologies, the reconfigurable Viterbi decoder gives an area advantage of 7% over two Viterbi cores implemented independently. Systems that require channel coding adaptability to improve throughput would benefit from the reconfigurable Viterbi decoder: they can transmit at rate 1/2 when the channel is good and, when the channel degrades, switch to a different configuration mode to transmit at rate 1/3.

7.0 REFERENCES

[1] GSM 05.03: “Channel coding”, Version 8.9.0, Release 1999.
[2] GSM 03.64: “Overall description of the GPRS radio interface; Stage 2”, Version 8.12.0, Release 1999.
[3] IEEE Std 802.16e-2005: “Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems”.
[4] Abdulfattah Mohammad Obeid, A. Garcia Ortiz, “Prototyping of a High Performance Generic Viterbi Decoder”, IEEE, 2002.
[5] Bernard Sklar, Digital Communication, Fundamentals and Applications, Prentice Hall, 2002.
[6] Young Min Kim, William C. Lindsey, “Adaptive Coded-Modulation in Multipath Fading Channels”, IEEE, 1999.
[7] David M. Mandelbaum, “On Forward Error Correction with Adaptive Decoding”, IEEE Transactions on Information Theory, March 1975.
[8] Yiquan Zhu, Mohammed Benaissa, “Reconfigurable Viterbi Decoding Using a New ACS Pipelining Technique”, IEEE, 2003.
[9] S. Swaminathan, “An FPGA-based Adaptive Viterbi Decoder”, Master’s thesis, University of Massachusetts, Amherst, Department of Electrical and Computer Engineering, 2001.
[10] C.B. Shung, P.H. Siegel, G. Ungerboeck and H.K. Thapar, “VLSI architectures for metric normalization in Viterbi algorithm”, IEEE International Conference on Communications, vol. 4, pp. 1723-1728, 1990.
[11]