
3/1/2017
Constraining Input Paths
Input Signals
• Now that a clock has been specified, logic
signals may be constrained relative to that
clock.
• Signals coming from outside may not arrive
at the beginning of the clock period.
• The compiler needs to know that.
[Figure: external flip-flop FF1 driving the block to be synthesized]
Even if there is no external combinational logic following FF1, wiring delay alone may be enough to require an input signal constraint.
External Delay
Setting Input Delay Constraint
• Setup time is superfluous: it is a library parameter, and the synthesizer will choose a suitable flip-flop.
• Period is specified by create_clock.
• The other parameters are needed. Input delay = external FF clock-to-Q delay + external combinational delay.
• Internal combinational delay isn’t known. That is what the synthesizer needs to deal with.
• Example: period is 10 ns, clock to Q delay
is 0.5 ns, FF setup is 0.2 ns, external logic
path takes 2.3 ns.
• What do you need to tell the synthesizer to
properly constrain the input path?
• Is the above information enough? Too
much?
Input Delay Constraint
Input Constraint
• External circuitry burns 2.8 ns of 10 ns
clock period.
• Need to set that as a constraint.
• Constraint is relative to clock, which must
have been previously created.
• Max delay for setup, min for hold.
• set_input_delay -max 2.8 -clock CLK [get_ports IN1]
• You specify how much of the period
external circuitry needs.
• The compiler calculates how much is left
and tries to make internal logic fit.
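The worked numbers above can be written out as a short script. This is a minimal sketch, assuming a clock port named CLK and an input port named IN1 as on the slide; the variable names are illustrative only. Note that the 0.2 ns setup time is not used, since setup is a library parameter.

```tcl
# Values from the example (all in ns).
set period    10.0
set clk_to_q   0.5   ;# clock-to-Q delay of the external flip-flop
set ext_logic  2.3   ;# external combinational delay

# Input delay = FF delay + external combinational delay.
set in_delay [expr {$clk_to_q + $ext_logic}]   ;# 2.8

# The clock must exist before a delay can reference it.
create_clock -period $period -name CLK [get_ports CLK]
set_input_delay -max $in_delay -clock CLK [get_ports IN1]
```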
Input Constraint
[Figure: timing diagram marking the launch edge, the capture edge, the delay of the external logic, and the amount left for internal logic]
Multiple Ports
• Usually it’s unnecessary to individually constrain each input.
• Wildcards and all inputs may be specified.
– set_input_delay 1 [all_inputs] -clock $clk_name
– set_input_delay 1 [get_ports IN*] -clock $clk_name
Don’t Constrain Clock to Itself
• Using all_inputs will include the clock port.
• It doesn’t make sense to constrain the clock by referencing the clock.
– set_input_delay 2 -clock CLK -max \
[remove_from_collection [all_inputs] \
[get_ports CLK]]
• Alternative syntax
– set_input_delay -max 0.5 -clock CLK [all_inputs]
– remove_input_delay [get_ports CLK]
Hold Constraint
• Avoiding a hold violation can be a problem too.
• Addressed by setting a minimum input delay.
– set_input_delay -min 0.3 -clock CLK \
[get_ports * -filter {@port_direction == in}]
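The two ideas above (leave the clock port out, and give both a maximum and a minimum input delay) can be combined. The following is a sketch only, reusing the 2 ns and 0.3 ns values from the slides and assuming the clock and its port are both named CLK; the variable name data_inputs is illustrative.

```tcl
# Every input port except the clock port itself.
set data_inputs [remove_from_collection [all_inputs] [get_ports CLK]]

# Setup side: latest arrival of external data after the clock edge.
set_input_delay -max 2.0 -clock CLK $data_inputs

# Hold side: earliest arrival of external data after the clock edge.
set_input_delay -min 0.3 -clock CLK $data_inputs
```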
Minimum Delay Needed
[Figure: block to be synthesized]
To avoid a hold violation, there may need to be a minimum delay on data signals.
Constraining Output Paths
Designers in extremis may not register all outputs:
[Figure: block to be synthesized driving external logic inside the top level of the design]
Output Constraint
• You specify how much time is needed by the external logic.
• The synthesizer tries to make the internal logic fit into what is left of the period.
• Always in reference to a previously created clock.
Output Delay Parameters
[Figure: FF1 and combinational block CB1 inside the block to be synthesized, driving external combinational block CB2 and flip-flop FF2]
FF2 Setup Time: 1 ns
FF2 Hold Time: 0.5 ns
CB2 Minimum Propagation Delay (best case): 0
CB2 Maximum Propagation Delay (worst case): 3 ns
Minimum Delay
• Minimum output delay is to prevent hold violation.
• Minimum delay = combinational delay - hold time.
• This can lead to a negative minimum delay. That is OK. It tells the compiler that the circuit to be synthesized must add at least that much delay.
Minimum Delay Example
• Hold time = 0.5 ns
• External combinational delay may be as low as zero.
• Minimum delay = 0 - 0.5 = -0.5
• set_output_delay -clock CLK -min -0.5 [get_ports COUNT*]
• Output delay is delay provided by external device.
Maximum Output Delay
• Output delay is still the delay provided by the external device.
• Maximum delay includes combinational logic and register setup time.
• Use worst case (largest) values.
• set_output_delay -clock CLK -max 4 [get_ports COUNT*]
Another Example
• 10 ns clock, rising at time 0 and 10.
• Hold time is 1 ns.
• Setup time is 2 ns.
• Minimum external logic delay is 0.
• Maximum external logic delay is 3.
• Need to set output constraints.
Data Stability
• Clock frequency, setup and hold parameters define when data need to be stable.
• Setup = 2: data must be stable by time = 8.
• Hold = 1: data must remain stable until time = 11.
Output Delay, Min and Max
• Best case: combinational delay = 0. Hold time needs to be covered by the synthesized circuit.
• Worst case: maximum delay of external logic (3) + setup time (2). Leaves only half the period for the synthesized circuit.
• Need two output delay constraints:
set_output_delay -clock CLK -min -1 [get_ports OUT1]
set_output_delay -clock CLK -max 5 [get_ports OUT1]
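The arithmetic behind the two output delay values can be made explicit. This is a sketch under the example's numbers, with illustrative variable names; it assumes a clock named CLK already exists and an output port named OUT1.

```tcl
# External parameters from the example (ns).
set setup    2.0   ;# setup time of the external register
set hold     1.0   ;# hold time of the external register
set ext_min  0.0   ;# best-case external combinational delay
set ext_max  3.0   ;# worst-case external combinational delay

# Max output delay = worst-case external logic + setup time.
set out_max [expr {$ext_max + $setup}]   ;# 5.0
# Min output delay = best-case external logic - hold time.
set out_min [expr {$ext_min - $hold}]    ;# -1.0

set_output_delay -clock CLK -max $out_max [get_ports OUT1]
set_output_delay -clock CLK -min $out_min [get_ports OUT1]
```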
Timing Constraint Example
create_clock -period 10 [get_ports CLK]
set_output_delay 5 -clock CLK -max [all_outputs]
set_output_delay -1 -clock CLK -min [all_outputs]
Where Does the Delay Go?
• Previous examples of setting delays explicitly put them on the ports.
• Suppose the [get_ports] was left out.
• Would the delay be associated with the port (external) or the net (internal)?
One Name, Two Objects
[Figure: TOP_DESIGN containing a DFF, with both an internal net and a port named SUM]
> set_load 5 SUM
Does the load get put on the internal net SUM or the port? Does it matter?
It’s Ambiguous, It Does Matter
• If load is applied to the net, it will replace whatever was calculated by DC (using wire load model or other algorithm).
• If it is applied to the port, it will be added to the internal net load.
• As it happens, in this case if no object type is specified, the load will be applied to the port and added to the load of the net.
• Good technique is to always specify the object type (port, net, etc.).
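Naming the object type removes the ambiguity. A short sketch of the two explicit forms, using the SUM name from the slide; the comments restate the behavior described above.

```tcl
# Applied to the port: added to the load of the internal net.
set_load 5 [get_ports SUM]

# Applied to the net: replaces the value Design Compiler estimated for it.
set_load 5 [get_nets SUM]
```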
Output Capacitive Loading
• Output loading affects transition times.
• Design Compiler assumes zero load.
• Leads to over-optimistic design.
• Fix is easy if you happen to know the load.
– set_load [expr 30.0 / 1000] [get_ports OUT1]
That Looks Awkward
set_load [expr 30.0 / 1000] [get_ports OUT1]
It’s done that way to accommodate the library capacitance units: in this case, library units are picofarads, but the load is 30 femtofarads. The decimal point (30.0) forces a floating-point division; with two integers, expr would do integer division and truncate 30 / 1000 to 0.
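The difference between the two divisions is easy to see in plain Tcl. The helper ff_to_pf below is hypothetical (it is not a Design Compiler command) and assumes the library capacitance unit is picofarads, as in the slide.

```tcl
# Integer operands: integer division, so the result is truncated.
puts [expr {30 / 1000}]     ;# prints 0
# One floating-point operand forces floating-point division.
puts [expr {30.0 / 1000}]   ;# prints 0.03

# Hypothetical helper: convert femtofarads to library units (pF assumed).
proc ff_to_pf {ff} {
    return [expr {double($ff) / 1000.0}]
}
set_load [ff_to_pf 30] [get_ports OUT1]
```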
What If You Have No Idea?
• Previous example used a magic number, 30 fF.
• On an internal block, usually you won’t have any idea what that number will be.
• But you will know that it’s going to drive a gate or fan out to several gates.
• So tell it what gate you expect it to drive.
Load Of Example
You expect output OUT1 to be connected to input pin a of one an2 AND gate from the LSI 10K library:
set_load [load_of LSI_10k/an2/a] [get_ports OUT1]
For a fanout to three of them:
set_load [expr [load_of LSI_10k/an2/a] * 3] [get_ports OUT1]
Input Side
• Similar technique for input cell: if your inputs are not primary, they will be driven by some other library cell.
– set_driving_cell -lib_cell FD2 -pin Q [get_ports IN1]
• If you do happen to know the transition time of the driving cell (less likely), use it:
– set_input_transition 0.12 [get_ports IN1]
Objects, Attributes
• Designs contain six object types:
– Designs, cells, ports, pins, nets, clocks
• Ports belong to the current design; pins are the I/O of instances of sub-designs.
• Objects have attributes
– Ports have direction, load, many more
– Cells can have attributes such as dont_touch, is_mapped, is_sequential, many more
For More Information
list_attributes -application -class <object_type>, where object_type can be pin, port, design, cell, net, clock, lib, reference, etc.
Register to Register Paths
• Most timing arcs are between registers within a module.
• These arcs are (almost) always within a synchronous system: same clock, more or less, at both registers.
• create_clock handles most of them, but there are exceptions.
Timing Exceptions
• False paths
• Multicycle paths
• Minimum delay
• Maximum delay
False Paths
• Disables timing checking between specified points.
• Can be used on single-point paths.
• Mostly used across clock domains.
False Path
[Figure: latch with data input D and gate input G, with the data fed through a multiplexer]
Simultaneously writing new data and re-storing the old data isn’t going to happen. The path from the latch G to D through the multiplexer is a false path.
Contrived example, but there are real paths that can never be sensitized.
Design False Path
• Previous example shows a false path that is not inherently impossible.
• Knowledge of the external design allows the false path decision to be made.
• In this example, SEL1 and SEL2 are decodes of a one-hot machine: only one can ever be active at any given time.
Crossing Clock Domains
False Paths Across Domains
set_false_path -from [get_clocks CLKA] -to [get_clocks CLKB]
set_false_path -from [get_clocks CLKB] -to [get_clocks CLKA]
False Reset Path
• Tools do not differentiate between asynchronous reset and other logic signals.
• Reset release DOES need to be synchronized, but that is handled with a synchronizer (covered soon).
• Reset exists in its own timing domain.
• Path from reset is thus a false path.
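A false path from the reset input could be declared as follows. This is a one-line sketch; RESET is an assumed port name, not one taken from the slides.

```tcl
# RESET is an assumed name for the asynchronous reset input port.
set_false_path -from [get_ports RESET]
```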
Multicycle Paths
• Some processes take more than one clock cycle.
– Arithmetic operations in particular.
• Clock may be kept running throughout the operation.
• Don’t want to report timing errors before the operation is completed.
Slow Arithmetic Operator
Multicycle Clock Constraint
Multicycle Path Example
create_clock -period $CLKPERIOD [get_ports CLK]
set_multicycle_path 6 -setup -to [get_pins {c_reg[*]/D}]
# c_reg is 64 bits wide: find all instances of it and create a
# multicycle constraint that only flags timing errors on
# the D input if data are not stable within 6 clock periods.
# By default the hold check moves along with the setup check, to
# the edge just before the setup (capture) edge. If that is not
# intended, a matching -hold multiplier returns it to the launch edge.
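A commonly used pairing keeps the hold check at the launch edge by adding a hold multiplier one less than the setup multiplier. This is a sketch, not necessarily what the original design used; whether the hold exception is wanted depends on the design. The braces keep Tcl from treating [*] as a command substitution.

```tcl
# Setup is checked 6 clock periods after the launch edge.
set_multicycle_path 6 -setup -to [get_pins {c_reg[*]/D}]
# Move the hold check back to the launch edge (setup multiplier - 1).
set_multicycle_path 5 -hold  -to [get_pins {c_reg[*]/D}]
```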
Min. and Max. Delays
• Minimum and maximum delays can be set on any path.
• Used to override (make tighter) constraints set by clock period.
• Can be used for purely combinational blocks.
Maximum Delay
• A combinational block may be constrained with a virtual clock.
• Paths can also be constrained using a maximum delay specification
– set_max_delay <delay_value> -from [get_ports IN*] -to [get_ports OUT*]
• Minimum delays can be set analogously.
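Both approaches for a purely combinational block can be sketched side by side. The clock name VCLK, the 10 ns period, and the 2 ns and 3 ns external delays are assumptions chosen for illustration.

```tcl
# Virtual clock: it has a name and a period but no source port or pin.
create_clock -name VCLK -period 10

# Budget the external delays against the virtual clock (assumed values).
set_input_delay  -max 2.0 -clock VCLK [get_ports IN*]
set_output_delay -max 3.0 -clock VCLK [get_ports OUT*]
# This leaves 10 - 2.0 - 3.0 = 5.0 ns for the combinational path.

# Alternative without a clock: state the path budget directly.
set_max_delay 5.0 -from [get_ports IN*] -to [get_ports OUT*]
```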
Load Models
• So far, wire loading has been ignored.
• In 1990, that was marginally OK.
• In new processes, wire delay dominates gate delay.
• Tools are adapting to these changes.
Wire Loads and Models
• Wire load models are entirely process dependent. Completely different from one technology to another.
• Semiconductor vendors create them based on statistical data.
• Better than nothing.
• Much worse than back annotation data.
• Used for synthesis, so back annotation data are not available.
• Not adequate for the most modern nanoscale libraries. New “topographic mode” tools for that.
Model Components
• Coefficients for
– Area
– Capacitance
– Resistance per unit length
• Fanout to length table
– Used to estimate length
Types of Models (Synopsys Specific)
• Top
– No hierarchy. Nets are modeled for a flat design. Same wireload is used for all nodes in the entire design.
• Enclosed
– Uses the wire load model of the smallest design unit that entirely encloses the net under evaluation.
• Segmented
– Nets crossing hierarchical boundaries have different loading properties at different points.
Top Wireload Mode
Enclosed Wireload Mode
Segmented Wireload Mode
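The mode is chosen with a single command. A sketch, with enclosed picked arbitrarily as the example value:

```tcl
# Choose one of the three modes: top, enclosed, or segmented.
set_wire_load_mode enclosed
```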
What Wire Loads Are Available?
• Use report_lib to find out.
– report_lib lsi_10k
• Which to choose?
– What are the units?
Smallest Wire Load
Wire Loading Model:
Name        : 05x05
Location    : lsi_10k
Resistance  : 0
Capacitance : 1
Area        : 0
Slope       : 0.186
Fanout   Length   Points Average Cap Std Deviation
--------------------------------------------------------------
     1     0.39
Largest Wire Load
Name        : 90x90
Location    : lsi_10k
Resistance  : 0
Capacitance : 1
Area        : 0
Slope       : 1.64
Fanout   Length   Points Average Cap Std Deviation
--------------------------------------------------------------
     1     1.90
Selection Chart
Automatic Wire Load Selection
• Manually selecting the best wire load model is error-prone.
• Most modern libraries have automatic wire load selection.
• lsi_10k isn’t modern.
• Automatic wire load selection slows down synthesis. May be turned off.
– set auto_wire_load_selection false
• Automatic wire load selection may be overly optimistic if the design being synthesized is part of a larger block.
Effect on Design
• Same design, same timing constraints.
• Added 10x10 wire load model.
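Selecting a model by hand might look like the following. This is a sketch assuming the 10x10 model mentioned above is defined in the lsi_10k library.

```tcl
# Disable automatic selection, then name the model explicitly.
set auto_wire_load_selection false
set_wire_load_model -name 10x10 -library lsi_10k
```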
  Point                              Incr     Path
  ---------------------------------------------------
  clock CLK (rise edge)              0.00     0.00
  clock network delay (ideal)        0.00     0.00
  COUNT_reg[1]/CP (FD2)              0.00     0.00 r
  COUNT_reg[1]/Q (FD2)               1.63     1.63 r
  U4/Z (AN3)                         0.98     2.61 r
  U3/Z (EO)                          1.13     3.73 f
  COUNT_reg[3]/D (FD2)               0.00     3.73 f
  data arrival time                           3.73

  clock CLK (rise edge)              5.00     5.00
  clock network delay (ideal)        0.00     5.00
  COUNT_reg[3]/CP (FD2)              0.00     5.00 r
  library setup time                -0.85     4.15
  data required time                          4.15
  ---------------------------------------------------
  data required time                          4.15
  data arrival time                          -3.73
  ---------------------------------------------------
  slack (MET)                                 0.42
Speed Impact: 9%
  Point                              Incr     Path
  ---------------------------------------------------
  clock clk (rise edge)              0.00     0.00
  clock network delay (ideal)        0.00     0.00
  cnt_reg[1]/CP (FD2)                0.00     0.00 r
  cnt_reg[1]/Q (FD2)                 1.84     1.84 r
  U4/Z (AN3)                         1.06     2.90 r
  U3/Z (EO)                          1.16     4.06 f
  cnt_reg[3]/D (FD2)                 0.00     4.06 f
  data arrival time                           4.06

  clock clk (rise edge)              5.00     5.00
  clock network delay (ideal)        0.00     5.00
  cnt_reg[3]/CP (FD2)                0.00     5.00 r
  library setup time                -0.85     4.15
  data required time                          4.15
  ---------------------------------------------------
  data required time                          4.15
  data arrival time                          -4.06
  ---------------------------------------------------
  slack (MET)                                 0.09
Area Increased by About 6%
Number of ports:          6
Number of nets:          13
Number of cells:          9
Number of references:     5

Combinational area:       12.000000
Noncombinational area:    38.000000
Net Interconnect area:    undefined  (Wire load has zero net area)

Total cell area:          50.000000
Total area:               undefined
Operating Conditions
• Semiconductor performance is not static: it varies with conditions.
• report_lib also tells what the options are for operating conditions.
• lsi_10k:
– Best, worst, for military, industrial and commercial.
– Six options.
Report Lib
Operating Conditions:
Name          Library     Process   Temp     Volt
--------------------------------------------------
lib default   nom_90nm    1.00      25.00    1.00
WCCOM         nom_90nm    1.50      70.00    0.95
WCMIL         nom_90nm    1.50     125.00    0.90
BCCOM         nom_90nm    0.60       0.00    1.05
BCMIL         nom_90nm    0.60     -55.00    1.10
You Set the Conditions
• Library may default to something.
• You can change it to something better or worse.
• Must always check the worst case conditions for your target process
– set_operating_conditions -max "WCCOM"
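Worst-case and best-case conditions can be given together, so that setup is analyzed against the slow corner and hold against the fast one. This is a sketch assuming both WCCOM and BCCOM are defined in the same target library, as in the report above.

```tcl
# Worst case for maximum-delay (setup) analysis,
# best case for minimum-delay (hold) analysis.
set_operating_conditions -max WCCOM -min BCCOM
```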
Default
• You can determine what, if anything, the library default is:
– get_attribute my_lib default_operating_conditions
Setup File Library Selection
• Some semiconductor vendors don’t use operating conditions. Instead, they provide a separate library for each condition.
• In that case, change conditions by changing the library chosen by your setup file.
set target_library MYVENDOR_90nm_wccom.db
set link_library "* MYVENDOR_90nm_wccom.db"
Change setup file for different case analysis.
There’s no need/no way to change operating conditions within a library.
Min:Typ:Max
• Semiconductor performance is typically specified as a timing triplet: best case, typical case and worst case.
• These are combinations of process, voltage and temperature.
• The three together define the envelope in which the device can be called upon to operate.
Process
• Users have no control over process. Some days the line produces faster chips, some days slower.
• The fab guarantees that the chips they ship will fit someplace between what they do on their best days and what they do on the worst.
Voltage
• A chip will have a nominal operating voltage, such as 5 V for legacy components, less for modern ones.
• Five does not mean 5.000000000.
• It’s 5 +/- something, usually 0.5 or 0.25 V. You need some slack on the power supply, because it droops a bit under load.
Performance and Supply Voltage
• Performance tends to go up (delays go down) with increasing voltage, to a point.
• Thus a circuit will be faster when it has a best-case power supply (e.g., 5.5 V) than when it has a worst-case one (4.5 V).
Temperature
• Performance also changes with temperature.
• CMOS delays tend to be linear with temperature for both rise and fall times.
• TTL is not: fall times decrease with rising temperature while rise times increase.
Temperature Ranges
• Each component will have a specified temperature operating range.
• Military temperature range is -55 C to +125 C.
• Commercial is 0 C to 70 C.
• Some processes also have “industrial,” between commercial and military.
• Delay characteristics are specified as a Min:Typ:Max triplet.
Operating conditions
[Figure: delay plotted against temperature, process, and voltage, each with best, typical, and worst cases]
Operating Cases
• Traditionally (traditions are short in this business) semiconductor variations could just be lumped as best case/worst case.
• As geometries shrink, it has become evident that many factors contribute to operating case.
– effective channel length, gate width, gate oxide thickness, dopant concentration, interlayer dielectric thickness
[Figure: PVT (process, voltage and temperature) curve of % yield, marked min, typical, and max]
Ref: http://homepages.cae.wisc.edu/~lxie/SSTA.shtml
Operating Conditions: LSI 10K
Operating Condition Name : WCMIL
Library : lsi_10k
Process : 1.50
Temperature : 125.00
Voltage : 4.50
Interconnect Model : worst_case_tree
Worst Case Tree
Load pins (signal destination) are at the extreme end of wires. Each load incurs the full wire capacitance and wire resistance.
[Figure: driver connected to the loads through the full wire resistance R and capacitance C]
Best Case Tree
Each destination is adjacent to the driver. All the net capacitance is incurred but no wire resistance.
[Figure: driver and loads sharing the wire capacitance C with no series resistance]
Operating Conditions: Best Case
Operating Condition Name : BCMIL
Library : lsi_10k
Process : 0.60
Temperature : -55.00
Voltage : 5.50
Interconnect Model : best_case_tree
BCMIL Timing
  Point                              Incr     Path
  ---------------------------------------------------
  clock clk (rise edge)              0.00     0.00
  clock network delay (ideal)        0.00     0.00
  cnt_reg[1]/CP (FD2P)               0.00     0.00 r
  cnt_reg[1]/Q (FD2P)                0.55     0.55 f
  U18/Z (ND3P)                       0.27     0.83 r
  U17/Z (ENP)                        0.41     1.24 f
  cnt_reg[3]/D (FD2)                 0.00     1.24 f
  data arrival time                           1.24

  clock clk (rise edge)              5.00     5.00
  clock network delay (ideal)        0.00     5.00
  cnt_reg[3]/CP (FD2)                0.00     5.00 r
  library setup time                -0.85     4.15
  data required time                          4.15
  ---------------------------------------------------
  data required time                          4.15
  data arrival time                          -1.24
  ---------------------------------------------------
  slack (MET)                                 2.91
WCMIL Timing
  Point                              Incr     Path
  ---------------------------------------------------
  clock clk (rise edge)              0.00     0.00
  clock network delay (ideal)        0.00     0.00
  cnt_reg[1]/CP (FD2P)               0.00     0.00 r
  cnt_reg[1]/Q (FD2P)                3.50     3.50 f
  U18/Z (ND3P)                       1.72     5.23 r
  U17/Z (ENP)                        2.58     7.81 f
  cnt_reg[3]/D (FD2)                 0.00     7.81 f
  data arrival time                           7.81

  clock clk (rise edge)              5.00     5.00
  clock network delay (ideal)        0.00     5.00
  cnt_reg[3]/CP (FD2)                0.00     5.00 r
  library setup time                -0.85     4.15
  data required time                          4.15
  ---------------------------------------------------
  data required time                          4.15
  data arrival time                          -7.81
  ---------------------------------------------------
  slack (VIOLATED)                           -3.66
Balanced Tree
Third and last type of tree: usually used in the typical case loading model. All load pins are on separate and equal branches of the interconnect wire. Each load incurs an equal percentage of the total wire capacitance and wire resistance.
Balanced tree is the default, if not otherwise specified.
[Figure: two equal branches, each with resistance R/2 and capacitance C/2]
Operating Conditions Set Type
• Tree type (best, worst, balanced) is set when operating conditions (BCMIL, WCCOM, etc.) are set.
• You do not have independent control over this.
• Tree type is a vendor-set parameter of each operating condition.
Report Constraints
• Once all the constraints are set, check that
they are what you expected:
– report_design