Clock Tree Synthesis

Introduction to Digital VLSI Design
‫ספרתי‬VLSI ‫מבוא לתכנון‬
Clock
Lecturer: Gil Rahav
Semester B’ , EE Dept. BGU.
Freescale Semiconductors Israel
Clock
Reference Signal in Sequential Elements.
o
Most critical signal in synchronous designs
o
Reference for all the timing measurements
o
Single clock signal serves multiple flops/latches
Clock tree concepts
Clock Insertion Delay = Tdly1, Tdly2
Clock Skew = Tdly2 – Tdly1
For a successful signal launch and capture :i)
ii)
Tck-qff1 + Tdly3 <= T + Skew – Tsetupff2
Tck-qff1 + Tdly3 >= Skew + Tholdff2
Clock Tree Concepts
Rise Skew :- Skew calculated based on Rise edge at
Clock Root
Fall Skew :- Skew calculated based on Fall edge at
Clock Root
Triggering edge Skew :- Skew calculated based on
arrival times of active signal on clock pins
Transition time :- The time taken by signal to make a
transition from 20%-80% of the full value.
Clock Tree Architecture
H-Tree Mesh
H-Tree Clock Network
If the paths are perfectly balanced, clock skew is zero
Clock
Can insert clock gating at
multiple levels in clock tree
Can shut off entire subtree
if all gating conditions are
satisfied
Idle
condition
Clock
Gated
clock
Clock Tree Synthesis :i) Achieve Insertion Delay number
ii) Achieve Skew Targets
iii) Maintain Transition limits
iv) Limit Power Numbers
Compare the three Structures …….?
SoC Clock Distribution Network
IP Core
or Module
SoC
Global Clock Net
Core Internal
Clock Net
Core Internal
Clock Driver/PLL:
• Buffer
• Freq. Multiply
• Align
External clock
PLL
Example: H-Tree
Clock
Distribution
PLL
Ref clk in
Bypass
3
PLL out
2
1
Ref clk out
Feedback
Restle et al., The clock distribution of the Power4 microprocessor, ISSCC2002
PLL Block Diagram
Reference
clock
Local
clock
Up
Phase
detector
Charge
pump
Loop
filter
vcont
VCO
Down
Divide by
N
System
Clock
Power reduction in Clock trees – Clock Gating
What’s the Problem ?
Clock variation
Difference in arrival times to flops
Static reasons = skew:
- Unequal wire length
- Unequal buffer delay
- Unequal load
- IR drop
- In-die process variation
Dynamic reasons = drift and jitter:
- Switching load supply voltage variation, IR drop, coupling
- Temperature
Two Methods of
Clock Distribution Networks
Zero skew
at clock inputs to IP cores
(a)
IP 1
Zero skew
at the flip-flops
(b)
IP 1
IP 2
IP 2
IP 3
IP 3
clock
distribution
network
clock
IP 4
IP 4
IP 5
clock
distribution
network
clock
IP 6
IP 6
IP 7
IP 7
IP 8
IP 8
IP 9
IP 9
0
IP 5
clock delay
Assume perfectly balanced clock tree !
0
clock delay
Clock Tree Synthesis
when?
Clock Tree synthesis is needed for nets which have high fanout:
➢
Clocks
➢
Asynchronous resets
➢
Scan signals which feed all the Flip-flops in the design
Clock Tree Synthesis
Why?
➢
Minimal skew .
➢
Minimal insertion delay .
➢
DRC (Design Rule Constraints) – max transition, max capacitance
Clock Tree Synthesis
Why?
➢
Minimum Skew – Hold violation
B
A
C
Hold time violation when A + B < C
Clock Tree Synthesis
Why?
➢
Minimum Skew – Setup violation
B
A
C
Setup time violation when A + B > C + T
Clock Tree Synthesis
Why?
➢
Minimum Insertion Delay / Buffers Stages
Large insertion delay increase power but also results with
increased skew cause of On-Chip-Variation (OCV) .
Total
skew
Timing
Violation
Add
Buffers
On Logic
Path
Total
Power
& Area
Clock Tree Synthesis
Symmetric
➢
➢
➢
Clock tree
All flops of a symmetric clock tree , traced back from the clock
tree root are passing the same number of levels and the same
cell references at each level.
The clock tree is balanced at a specific corner which should fit
all corners .
Asymmetric tree results with increased skew variations at
different corners .
Clock Tree Synthesis
Asymmetric
Clock Tree
Asymmetric Clock Tree is used for non clock signals such as
asynchronous resets & DFT signals.
Asymmetric clock tree features:
➢
Requires max delay & max transition.
➢
Relaxed constraints for skew.
Review: Synchronous Timing Basics
R1
In
clk
D Q
tclk1
tc-q, tsu,
thold, tcdreg
R2
Combinational
logic
D Q
tclk2
tplogic, tcdlogic
Under ideal conditions (i.e., when tclk1 = tclk2)
T ≥ tc-q + tplogic + tsu
thold ≤ tcdlogic + tcdreg
Under real conditions, the clock signal can have both
spatial (clock skew) and temporal (clock jitter) variations
skew is constant from cycle to cycle (by definition); skew can be
positive (clock and data flowing in the same direction) or negative
(clock and data flowing in opposite directions)
jitter causes T to change on a cycle-by-cycle basis
Sources of Clock Skew and Jitter in Clock Network
4 power supply
3 interconnect
6 capacitive load
clock
1
generation
PLL
7 capacitive
coupling
2 clock drivers
5 temperature
Skew
manufacturing device
variations in clock drivers
interconnect variations
environmental variations
(power supply and
temperature)
Jitter
clock generation
capacitive loading and
coupling
environmental variations
(power supply and
temperature)
Positive Clock Skew
Clock and
data flow in
the same
direction
R1
In
R2
Combinational
logic
D Q
D Q
tclk1
clk
tclk2
T
1
delay
T+δ
δ>0
2
3
4
δ + thold
T:
T + δ ≥ tc-q + tplogic + tsu so T ≥ tc-q + tplogic + tsu - δ
thold :
thold + δ ≤ tcdlogic + tcdreg so thold ≤ tcdlogic + tcdreg - δ
δ > 0: Improves performance, but makes thold harder to
meet. If thold is not met (race conditions), the circuit
malfunctions independent of the clock period!
Negative Clock Skew
Clock and
data flow in
opposite
directions
R1
In
R2
D Q
Combinational
logic
tclk1
D Q
tclk2
delay
T
T+δ
1
2
δ<0
3
4
T:
T + δ ≥ tc-q + tplogic + tsu so T ≥ tc-q + tplogic + tsu - δ
thold :
thold + δ ≤ tcdlogic + tcdreg so thold ≤ tcdlogic + tcdreg - δ
δ < 0: Degrades performance, but thold is easier to meet
(eliminating race conditions)
clk
Clock Jitter
Jitter causes T to
vary on a cycle-bycycle basis
R1
Combinational
logic
In
tclk
clk
T
-tjitter
+tjitter
T : T - 2tjitter ≥ tc-q + tplogic + tsu so T ≥ tc-q + tplogic + tsu + 2tjitter
Jitter directly reduces the performance of a sequential
circuit
Combined Impact of Skew and Jitter
Constraints
on the
minimum
clock period
(δ > 0)
R1
In
R2
Combinational
logic
D Q
D Q
tclk1
tclk2
T
1
T+δ
δ>0
6
12
-tjitter
T ≥ tc-q + tplogic + tsu - δ + 2tjitter
thold ≤ tcdlogic + tcdreg – δ – 2tjitter
δ > 0 with jitter: Degrades performance, and makes thold
even harder to meet. (The acceptable skew is reduced
by jitter.)
Clock Distribution Networks
Clock skew and jitter can ultimately limit the performance
of a digital system, so designing a clock network that
minimizes both is important
In many high-speed processors, a majority of the dynamic power
is dissipated in the clock network.
To reduce dynamic power, the clock network must support clock
gating (shutting down (disabling the clock) units)
Clock distribution techniques
Balanced paths (H-tree network, matched RC trees)
- In the ideal case, can eliminate skew
- Could take multiple cycles for the clock signal to propagate to the
leaves of the tree
Clock grids
- Typically used in the final stage of the clock distribution network
- Minimizes absolute delay (not relative delay)
Dealing with Clock Skew and Jitter
To minimize skew, balance clock paths using H-tree or other
clock distribution structures.
If possible, route data and clock in opposite directions;
eliminates races at the cost of performance.
The use of gated clocks to help with dynamic power
consumption make jitter worse.
Shield clock wires (route power lines – VDD or GND – next to
clock lines) to minimize/eliminate coupling with neighboring
signal nets.
Use dummy fills to reduce skew by reducing variations in
interconnect capacitances due to interlayer dielectric
thickness variations.
Beware of temperature and supply rail variations and their
effects on skew and jitter. Power supply noise fundamentally
limits the performance of clock networks.