ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.6
4.6 Equalization and Clock Recovery for a 2.5 - 10Gb/s 2-PAM/4-PAM Backplane Transceiver Cell
J. Zerbe¹, C. Werner¹, V. Stojanovic¹, F. Chen¹, J. Wei¹, G. Tsang¹, D. Kim¹, W. Stonecypher¹, A. Ho¹, T. Thrush¹, R. Kollipara¹, G-J. Yeh¹, M. Horowitz², K. Donnelly¹
¹Rambus, Los Altos, CA
²Stanford University, Stanford, CA
The backplane environment presents a serious challenge to signaling rates above 5Gb/s. Previous 10Gb/s transceivers [1] are
not designed for this harsh environment. In the raw single bit
response of Fig. 4.6.1, a single 200ps pulse undergoes serious
loss and dispersion and initiates reflections that may be a significant percentage of an equalized eye. Figure 4.6.1 (inset)
shows a zoom-in of the reflections plotted on a scale equivalent
to a single 4-PAM equalized eye. The total usable amplitude
after equalization is slightly smaller than 3d, which is the distance between the peak sample point and the next sample point
of the raw pulse response. While the use of multiple signaling
levels and transmit equalization helps minimize the effects of dispersion [2,3], transmit-only equalization is an expensive way to
combat the effect of reflections, which are more destructive to
multi-level signaling. Decision feedback-based receive equalization (DFE) is effective when dealing with configuration-dependent reflections. The design of the transmit and receive equalizers and the clock recovery circuits is described for operation in
this type of backplane environment.
Since dispersion is a function of many properties in backplanes,
flexibility in the transmit equalizer, both in the number of taps and
in their settings, is highly desirable. One completely flexible
approach involves the use of a digital filter and DAC [4], while
the simplest approach is two-tap pre-emphasis [5]. Any technique must be evaluated for additional insertion loss as well
as for power and complexity.
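As an illustration of what a transmit equalizer computes, the following is a minimal behavioural sketch of an FIR pre-emphasis filter. The function names and the two-tap weights (0.75, -0.25) are hypothetical values chosen for the example, not settings from this design.

```python
def transmit_fir(symbols, taps):
    """Return the equalized output: out[n] = sum over k of taps[k] * symbols[n - k].

    Symbols before the start of the sequence are treated as 0.
    """
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, weight in enumerate(taps):
            if n - k >= 0:
                acc += weight * symbols[n - k]
        out.append(acc)
    return out


def peak_swing(taps):
    """Worst-case output magnitude for +/-1 symbols: the sum of |tap weights|."""
    return sum(abs(w) for w in taps)


# Hypothetical two-tap pre-emphasis: main tap 0.75, one post-cursor tap -0.25.
two_tap = [0.75, -0.25]
shaped = transmit_fir([1, 1, 1, -1, -1, 1], two_tap)
```

In this example a symbol that follows a transition is sent at the full swing of 1.0, while a repeated symbol is sent at 0.5, which is the basic pre-emphasis behaviour. Adding taps extends the same sum to more past symbols.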
The five-tap merged differential transmitter/equalizer shown in
Fig. 4.6.2 leverages the fact that the total transmitted current is
limited to less than the sum of the maximum taps to reduce pad
parasitics. It achieves this by using large segments that can be
individually allocated to any tap position. The transmit equalizer is partitioned into a shared section and a dedicated section.
The shared section consists of 7 large sub-drivers, each driving
16i current, where each shared sub-driver can select from any of
the 5 equalization tap streams A - E. The dedicated section consists of five binary-weighted drivers, one for each equalization
tap, each capable of driving up to 15i of current. This combination of shared and dedicated drivers gives each equalization
tap the same current range (127i) and resolution (1i) as a
non-equalizing 7b transmitter with only 50% additional parasitic
overhead. By comparison, a 5-tap transmitter/equalizer with the
same range and resolution implemented by replicating the primary driver has a 400% parasitic overhead. A pure digital filter
and DAC implementation requires an FIR filter running at the
symbol rate and consumes more than twice the power.
For receive equalization, the linearity and high bandwidth of the
transmission line environment were leveraged by adding and
subtracting currents directly at the input pads. The receive
equalizer is equivalent to a 1/5th scaled transmit equalizer.
High-latency reflections are effectively cancelled by a receive
equalizer in this configuration; it is preferred over a transmit
equalizer for this function as the old data is readily available in
the receive pipeline. The tap select MUX and tap weights are
separately configured for each backplane channel.
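The cancellation of a delayed reflection from already-received data can be sketched as a discrete-time behavioural model. This is not the circuit, which adds and subtracts currents at the input pads; the function name, the 6-symbol reflection delay, and the 0.3 reflection amplitude are hypothetical.

```python
def cancel_reflections(received, tap_delays, tap_weights, slicer):
    """Subtract weighted past decisions from each incoming sample.

    received    : sampled input values, one per symbol
    tap_delays  : symbol delays picked by the tap-select setting (hypothetical)
    tap_weights : amplitude to cancel at each of those delays
    slicer      : function mapping an equalized sample to a symbol decision
    """
    decisions = []
    equalized = []
    for n, sample in enumerate(received):
        correction = 0.0
        for delay, weight in zip(tap_delays, tap_weights):
            if n - delay >= 0:
                correction += weight * decisions[n - delay]
        value = sample - correction
        equalized.append(value)
        decisions.append(slicer(value))
    return equalized, decisions


# Hypothetical 2-PAM channel: unit main cursor plus a 0.3 reflection 6 symbols later.
symbols = [1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, 1]
received = [s + (0.3 * symbols[n - 6] if n >= 6 else 0.0)
            for n, s in enumerate(symbols)]
equalized, decisions = cancel_reflections(
    received, [6], [0.3], lambda v: 1 if v >= 0 else -1)
```

Because the correction uses decisions that are already in the receive pipeline, a long reflection delay costs only storage, which is the reason given above for handling such reflections at the receiver.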
One difficulty with this type of receive equalizer is the timing alignment of the equalizer outputs to the incoming receive data.
Compensation is required as the equalizer has a clock-to-Q delay.
This is accomplished by a limited-range variable delay element in
the equalizer clock path. This delay element is adjusted by a training sequence in the CDR loop with the CDR phase value held fixed.
The use of multi-level signaling to achieve higher data rates in
high-loss systems is well understood [1,2]. Any system that
has more than 10dB of loss difference between the 2-PAM and 4-PAM
Nyquist rates is a likely candidate for 4-PAM signaling. The 4-PAM eyes at this point are comparable in amplitude to the 2-PAM eyes due to the higher loss experienced by the 2-PAM signal. When using 4-PAM signaling, the effect of reflections must
be carefully considered as the size of the minimum eye relative
to the maximum transition has decreased by 2/3. In complex
backplanes some channels may have low loss and tolerate 2-PAM
signaling. Other channels may have higher loss and lower
reflections and thus will be better suited for 4-PAM operation.
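The 10dB guideline can be related to a first-order calculation: three stacked 4-PAM eyes share the swing of one 2-PAM eye, so each is one third as tall, a penalty of 20 log10(3), or about 9.5dB. The sketch below compares that penalty with the loss difference between the two Nyquist frequencies. It considers only loss at those two frequencies, and the function names and example loss values are hypothetical.

```python
import math


def pam4_eye_penalty_db():
    """Eye-height penalty of 4-PAM relative to 2-PAM at equal swing: 1/3, in dB."""
    return 20 * math.log10(3)


def relative_eye_db(loss_at_2pam_nyquist_db, loss_at_4pam_nyquist_db):
    """First-order 4-PAM eye height relative to 2-PAM at the same bit rate.

    A positive result means the 4-PAM eye is the larger of the two.
    """
    loss_difference = loss_at_2pam_nyquist_db - loss_at_4pam_nyquist_db
    return loss_difference - pam4_eye_penalty_db()


# Hypothetical 10Gb/s channel: 30dB loss at 5GHz (2-PAM Nyquist),
# 18dB loss at 2.5GHz (4-PAM Nyquist).
advantage = relative_eye_db(30.0, 18.0)
```

With a 12dB loss difference the 4-PAM eye comes out roughly 2.5dB larger in this estimate, while a 6dB difference would favor 2-PAM.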
This design supports either 2-PAM or 4-PAM operation via the
Gray-coded levels shown in Fig. 4.6.4. For 2-PAM transmission
the LSB is forced to zero and only major transitions occur. A flexible 2-PAM/4-PAM CDR is designed that uses the optimal transitions available for clock recovery in either mode. The complete set
of 4-PAM transitions is shown in Fig. 4.6.4 and consists of three
minor transitions (smallest change in voltage level possible), one
major transition (largest change possible) and two intermediate
transitions for a total of 6 transitions. If a conventional zero-crossing CDR such as [6] is used to recover the clock on raw 4-PAM data, the problem arises that the edge distribution at the
MSB threshold is not uniform. Instead, there are three distinct
crossings. Similarly, the LSB thresholds also contain three distinct crossing regions. Such distributions cause jitter and phase
offsets if the data pattern exhibits a predominance of one transition type over another. In this design the optimal transitions (circled in Fig. 4.6.4) are used for clock recovery depending on mode.
In 2-PAM mode the MSB-major transition is used. In 4-PAM mode
the minor transitions of either the MSB or LSB are also included
while the transitions with skewed crossings are ignored. Both
clock jitter and phase offset are thus minimized. The CDR logic is
shown in Fig. 4.6.5. Both MSB and LSB edge and data samplers
are used. Adequate transition density can be assured through
means of scrambling, PRBS XOR or coding.
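The transition selection can be illustrated by enumerating the six 4-PAM transitions. This sketch assumes the Gray-coded level order 00, 01, 11, 10 taken from the labels of Fig. 4.6.4, and it reads the 4-PAM selection rule as "exactly one of the MSB and LSB changes". That reading is an interpretation consistent with the statement that skewed crossings are ignored. The function names are hypothetical.

```python
from itertools import combinations

# Gray-coded levels in voltage order as (MSB, LSB); order assumed from Fig. 4.6.4.
LEVELS = [(0, 0), (0, 1), (1, 1), (1, 0)]


def classify(a, b):
    """Name a transition by how many levels apart its two endpoints are."""
    step = abs(LEVELS.index(a) - LEVELS.index(b))
    return {1: "minor", 2: "intermediate", 3: "major"}[step]


def use_for_cdr(prev, curr, four_pam):
    """Decide whether a symbol transition feeds the clock recovery loop."""
    msb_tran = prev[0] != curr[0]
    lsb_tran = prev[1] != curr[1]
    if four_pam:
        # Assumed rule: keep transitions where exactly one bit changes.
        return msb_tran != lsb_tran
    return msb_tran  # 2-PAM mode: LSB is forced to zero, only the MSB toggles


table = {(a, b): (classify(a, b), use_for_cdr(a, b, True))
         for a, b in combinations(LEVELS, 2)}
```

Under this reading the three minor transitions and the major transition are kept and the two intermediate transitions, in which both bits change, are dropped.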
Results showing equalizer effectiveness are displayed in Fig.
4.6.6. The link is configured to run at 10Gb/s over a 20” backplane
with two connectors and two 3” linecards. In this simulation the
worst-case pattern is calculated based on the single bit response
and is overlaid with a PRBS pattern. Without transmit equalization the eye is completely closed and thus not shown. In Fig.
4.6.6a transmit equalization is enabled and the link is able to
establish an eye opening for the PRBS data but the eye is completely closed due to reflections for the worst-case pattern. Figure
4.6.6b shows the effect of adding receive equalization to effect
reflection cancellation. In this case significant margins of 47mV
and 60ps are obtained even for the worst-case pattern. When
operating the link in this environment at 10Gb/s the BER was
measured to be < 10^-15 and power was measured at 450mW.
References
[1] M.M. Green, et al., “OC-192 Transmitter in Standard 0.18µm CMOS,”
ISSCC Digest of Technical Papers, pp. 248-249, 2002.
[2] J. Zerbe, et al., “A 2Gb/s/pin 4-PAM Parallel Bus Interface with
Crosstalk Cancellation, Equalization, and Integrating Receivers,” ISSCC
Digest of Technical Papers, pp. 66-67, 2001.
[3] R. Farjad-Rad, et al., IEEE J. Solid-State Circuits, vol. 35, pp. 757-764,
May 2000.
[4] C.-K. Yang, et al., “A Serial-Link Transceiver Based on 8-GSamples/s
A/D and D/A Converters in 0.25µm CMOS,” IEEE J. Solid-State Circuits,
vol. 36, pp. 1684-1692, November 2001.
[5] A. Fiedler, et al., “A 1.0625Gbps Transceiver with 2x-Oversampling
and Transmit Signal Pre-emphasis,” ISSCC Digest of Technical Papers,
pp. 238-239, 1997.
[6] K. Chang, et al., “A 0.4-4Gb/s CMOS Quad Transceiver Cell Using On-Chip Regulated Dual-Loop PLLs,” 2002 Symposium on VLSI Circuits
Digest of Technical Papers, pp. 88-91.
[7] S. Sidiropoulos, M. Horowitz, “A Semidigital Dual Delay-Locked Loop,”
IEEE J. Solid-State Circuits, vol. 32, pp. 1683-1692, November 1997.
2003 IEEE International Solid-State Circuits Conference
0-7803-7707-9/03/$17.00 ©2003 IEEE
ISSCC 2003 / February 10, 2003 / Salon 8 / 4:15 PM
Figure 4.6.1: Raw backplane single bit response.
[Plot: vertical axis 0 to 1, horizontal axis 0 to 10 ns. Annotations: Dispersion, Reflections. Inset: reflections on a -d/2 to d/2 scale, labeled d and 4PAM eyes.]
Figure 4.6.2: Folded 2-PAM/4-PAM transmitter/equalizer.
[Block diagram labels: TD[1:0], 4-PAM Encoder, 1/z, A[0] ... E[0], Allocation Logic, Shared Thermometer Coded Driver Segments (7), 16i, Dedicated Tap Drivers (5), DAC, 15i max, W A[0-3], W A[4-6], W E[0-3], W E[4-6], TP, TN.]
Figure 4.6.3: Receive equalization for reflection cancellation.
[Block diagram labels: Normal Rx Path, Rx Data, Sampler, CDR, UP/DOWN, Phase Mixer, Training Sequence 0101..., Calibrate, Variable Delay, Receive Equalizer, Tap Select, Tap Weights.]
Figure 4.6.4: Optimal 4-PAM and 2-PAM CDR transitions.
[Diagram labels: levels 00, 01, 11, 10; MSB threshold; LSB thresholds; Minor; Major.]
Figure 4.6.5: 2-PAM/4-PAM CDR.
[Block diagram labels: 2PAM/4PAM Mode, MSB TranDet, LSB TranDet, CDR transition selection, Tran, Early/Late, Majority Voter, Phase Mixer, CDR clk.]
Tran(2PAM) = MSBTran
Tran(4PAM) = (LSBTran * ¬MSBTran) + (MSBTran * ¬LSBTran)
Figure 4.6.6: Eyes with Tx EQ containing worst-case transitions (a) without and (b) with receive equalizer.
[Panels: (a) Tx EQ, (b) Tx+Rx EQ. Axes: -200 to 200 mV versus 0 to 200 ps.]
Figure 4.6.7: Cell micrograph.