A fast general slew constrained minimum cost buffering algorithm

Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo
Shiyan Hu a,*, Jiang Hu b
a Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI 49931, USA
b Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA
Article info
Article history:
Received 22 December 2008
Accepted 10 August 2009
Abstract
As VLSI technology moves to the nanoscale regime, ultra-fast slew buffering techniques considering
buffer cost minimization are highly desirable. The existing technique proposed in [S. Hu, C. Alpert, J. Hu,
S. Karandikar, Z. Li, W. Shi, C.-N. Sze, Fast algorithms for slew constrained minimum cost buffering, IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems 26 (11) (2007) 2009–2022]
is able to efficiently perform buffer insertion with a simplified assumption on buffer input slew.
However, when handling more general cases without input slew assumptions, it becomes slow despite
the significant buffer savings. In this paper, a fast buffering technique is proposed to handle the general
slew buffering problem. Instead of building solutions from scratch, the new technique efficiently
optimizes buffering solutions obtained with the fixed input slew assumption. Experiments on industrial
nets demonstrate that our algorithm is highly efficient. Compared to the commonly used van Ginneken
style buffering, up to 49× speedup is obtained and often 10% buffer area is saved. Compared to the
algorithm without input slew assumption proposed in [S. Hu, C. Alpert, J. Hu, S. Karandikar, Z. Li, W. Shi,
C.-N. Sze, Fast algorithms for slew constrained minimum cost buffering, IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems 26 (11) (2007) 2009–2022], up to 37× speedup can be
obtained with a slight sacrifice in solution quality.
© 2009 Elsevier Ltd. All rights reserved.
Keywords:
Buffer insertion
Slew constraint
Non-fixed input slew
Interconnect optimization
Physical design
1. Introduction
As VLSI technology moves to the nanoscale regime, devices
scale much faster than interconnects. As a highly effective
interconnect optimization technique, buffer insertion has been extensively studied [3–7]. It has been
widely deployed in industry, as demonstrated in [8], which shows that about 25% of gates are buffers
in recent IBM ASIC designs. On the other hand, interconnect
resistivity may lead to significant degradation of signal
integrity. This issue worsens severely with advancing technologies. As such, buffers also need to be
inserted to meet slew constraints [2].
In practice, slew constraints are significantly more prevalent than
timing constraints [2]. Once nets are buffered to satisfy slew
constraints, most of them will automatically satisfy timing
constraints. In fact, it is documented in [8] that in IBM ASIC
designs, for about 95% of nets, buffering based on slew is sufficient to
meet their timing constraints, while only about 5% of nets need to be
re-buffered for timing optimization.
A preliminary version of the paper appeared in [1].
* Corresponding author. Tel.: +1 906 487 2941; fax: +1 906 487 2949.
This suggests a better way of using buffer insertion techniques
in the physical synthesis flow [8]. Suppose that we are to buffer
millions of nets. They are first buffered using slew driven
buffering techniques. After performing timing analysis on the
resulting nets, we find that about 5% of nets violate timing
constraints. Only these timing critical nets need to be ripped up
and re-buffered by timing driven buffering techniques [2].
The main benefit of this new physical synthesis methodology is the huge gain in efficiency, since
slew buffering can be performed very efficiently. It is demonstrated in [2] that a slew
driven buffering algorithm can run up to 88× faster than the
timing driven buffering algorithm. For example, one can buffer
1000 industrial nets in only 6.2 s by the minimum cost slew
buffering algorithm, while the minimum cost timing buffering
algorithm needs 548.9 s. Note that minimum cost buffering is
important since excessive buffers may cause many design
issues such as high power consumption. To keep the overall
design quality, it is crucial to minimize the usage of buffering
resources [2].
0026-2692/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.mejo.2009.08.003
The fast slew buffering algorithm proposed in [2] relies on an
important assumption, namely, that the input slew to each buffer is
fixed at a conservative upper bound. With this
input slew assumption, slew buffering can be efficiently performed under the dynamic programming
framework. Certainly, a reduction in buffer area can be expected if this assumption is
eliminated. As such, the authors of [2] also propose a slew buffering
algorithm without the fixed input slew assumption. The idea is to
first discretize every possible input slew into slew bins and carry
out the fixed input slew buffering algorithm with each bin. Since
the solutions associated with a slew bin may be switched to other
slew bins, numerous solutions can be generated. Experimental
results in [2] show that although about 20% area saving can be
obtained, the algorithm is not efficient: it is even slower than
timing driven buffering in many cases. Thus, it is not very
attractive, since a major reason for using slew buffering is its high
efficiency. In order to make the approach practical, it is crucial to
design a fast slew buffering algorithm for handling the non-fixed
input slew case.
This work proposes a new fast slew buffering algorithm
without input slew assumptions. In contrast to [2] which builds
slew buffering solutions from scratch, we perform optimizations
to buffering solutions obtained with the fixed input slew
assumption. For this purpose, a heuristic is proposed to improve
buffer usage under the slew constraint and it runs very fast.
Together with the fact that slew buffering with fixed slew
assumption can be efficiently computed, the whole approach
runs very fast.
Our experimental results demonstrate the effectiveness and
the efficiency of the new algorithm. Our algorithm runs up to 49×
faster than the timing buffering algorithm with about 10% buffer
area saving. Compared to the slew buffering algorithm without
input assumptions proposed in [2], up to 37× speedup is obtained.
Thus, our work makes the general slew buffering technique
practical. It is expected that the new algorithm would be widely
used in practice due to its high efficiency in both runtime and
buffer usage.
Note that there is another recent work in [9] which proposes a
low-power buffering algorithm handling both timing constraint
and slew constraint for timing critical nets. In contrast, the
purpose of this paper is to address slew buffering on non-timing
critical nets.
The rest of the paper is organized as follows: Section 2
formulates the slew buffering problem. Section 3 overviews the
slew buffering algorithm proposed in [2]. Section 4 describes the
new fast slew buffering algorithm without fixed input slew
assumption. Section 5 presents the experimental results with
analysis. A summary of work is given in Section 6.
2. Preliminaries
For completeness, we first introduce the slew buffering problem as
formulated in [2]. In the slew buffering problem, we are given a
routing tree $T = (V, E)$, where $V$ consists of the source vertex, sinks and
internal vertices. Each sink has sink capacitance $C_s$. Each edge has
lumped resistance $R_e$ and lumped capacitance $C_e$. We are also
given a buffer library $B$. Each type of buffer $b$ has a cost $W_b$. At
each internal vertex, some types of buffers can be inserted. A
buffering solution is defined as a buffer assignment where buffers
are inserted at some internal locations. The cost of a buffering
solution $\gamma$ is defined as $W(\gamma) = \sum_{b \in \gamma} W_b$ [2].
We are to compute a minimum cost buffering solution such
that the slew constraint is satisfied. The signal slew is the measure
of the rising or falling time of switching. As in [2], the 10/90 slew is used,
which refers to the difference between the time the signal waveform
crosses the 90% point and the time the signal waveform crosses the
10% point. The slew model can be described by the following
generic example in [2]. Consider a path $p$ from an upstream vertex
$u$ to a downstream vertex $v$. Assume that a buffer $b$ is inserted at $u$
and no buffer is inserted between $u$ and $v$. Denote the output slew
of $b$ by $S_{b,out}(u)$ and the slew degradation along path $p$ by $S_w(p)$.
The slew $S(v)$ at $v$ is computed as [10,2]
$$S(v) = \sqrt{S_{b,out}(u)^2 + S_w(p)^2}. \qquad (1)$$
Analogous to the Elmore model for delay, the slew degradation along the wire,
$S_w(p)$, can be computed by Bakoglu's metric [11] as
$$S_w(p) = \ln 9 \cdot D(p), \qquad (2)$$
where $D(p)$ is the Elmore delay along $p$. The output slew of a
buffer, such as $S_{b,out}(u)$, depends on the input slew at this buffer
and its load capacitance. As in [2], the dependence is described by
a lookup table.
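As a minimal sketch of Eqs. (1) and (2), the slew at a downstream vertex can be computed from the buffer output slew and the Elmore delay of the connecting path. The function names and the numbers in the usage line are illustrative assumptions and do not come from [2].

```python
import math

LN9 = math.log(9.0)  # the ln 9 factor in Bakoglu's metric, Eq. (2)

def wire_slew(elmore_delay):
    """Eq. (2): slew degradation along a wire path from its Elmore delay."""
    return LN9 * elmore_delay

def slew_at(buffer_output_slew, elmore_delay):
    """Eq. (1): root-sum-square of buffer output slew and wire slew degradation."""
    return math.hypot(buffer_output_slew, wire_slew(elmore_delay))

# Hypothetical values in ns: output slew 0.3 and a wire whose degradation is 0.4.
print(round(slew_at(0.3, 0.4 / LN9), 3))
```

With these assumed values the wire contributes 0.4 ns of degradation, so the two terms combine like the legs of a 3-4-5 right triangle.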
In [2], a fast algorithm is proposed to handle a simplified slew
buffering formulation where the input slew to each buffer is
assumed to be fixed at a conservative upper bound. This
assumption allows us to process a large number of nets very
efficiently while the slew constraint is satisfied. With the assumption,
the output slew of buffer $b$ at vertex $v$ is then given by [2]
$$S_{b,out}(v) = R_b \cdot C(v) + K_b, \qquad (3)$$
where $C(v)$ is the downstream capacitance at $v$, and $R_b$ and $K_b$ are
empirical fitting parameters. As in [2], we call $R_b$ the slew
resistance and $K_b$ the intrinsic slew of buffer $b$.
To illustrate the above concepts, we use the example in Fig. 1
where a neighboring pair of buffers $b_1, b_2$ are connected by a path
$p = (v_j, v_k)$. The slew rate at $v_k$ is [2]
$$S(v_k) = \sqrt{S_{b_1,out}(v_j)^2 + S_w(p)^2}, \qquad (4)$$
where $S_w(p)$ refers to the slew degradation along the wire $p$, and
$S_{b_1,out}(v_j)$ is obtained through a 2-D look-up table when handling
non-fixed slew buffering, while it is computed by Eq. (3) when
handling fixed input slew buffering.
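The two ways of obtaining the buffer output slew can be contrasted in a short Python sketch. The first function is Eq. (3). The second is only an assumed realization of a 2-D look-up table: the text does not specify the table layout or how values between grid points are obtained, so the bilinear interpolation, the clamping at the grid boundary, and all numbers below are hypothetical.

```python
from bisect import bisect_right

def fixed_slew_output(R_b, K_b, C_load):
    """Eq. (3): output slew under the fixed input slew assumption."""
    return R_b * C_load + K_b

def table_output_slew(in_axis, cap_axis, table, s_in, c_load):
    """Assumed 2-D look-up: table[i][j] is the output slew at input slew
    in_axis[i] and load capacitance cap_axis[j]. Values between grid points
    are bilinearly interpolated; queries outside the grid are clamped."""
    def locate(axis, x):
        x = min(max(x, axis[0]), axis[-1])               # clamp to the grid
        i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
        t = (x - axis[i]) / (axis[i + 1] - axis[i])      # position inside the cell
        return i, t
    i, ti = locate(in_axis, s_in)
    j, tj = locate(cap_axis, c_load)
    low = table[i][j] * (1 - tj) + table[i][j + 1] * tj
    high = table[i + 1][j] * (1 - tj) + table[i + 1][j + 1] * tj
    return low * (1 - ti) + high * ti

# Hypothetical data: slews in ns, capacitances in pF, slew resistance in ns/pF.
in_axis, cap_axis = [0.1, 0.5], [0.0, 0.1]
table = [[0.05, 0.25], [0.10, 0.30]]
print(fixed_slew_output(R_b=2.0, K_b=0.05, C_load=0.05))
print(table_output_slew(in_axis, cap_axis, table, s_in=0.3, c_load=0.05))
```

In this made-up table a smaller input slew yields a smaller output slew at the same load, which is the effect the non-fixed formulation exploits.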
The slew buffering problem is formulated in [2] as follows.
Slew constrained minimum cost buffer insertion problem: Given a
routing tree $T = (V, E)$, possible buffer positions, and a buffer
library $B$, compute a buffering solution $\gamma$ such that the total cost
$W(\gamma)$ is minimized and the slew constraint $\alpha$ is satisfied.
3. Overview of [2]’s minimum cost slew buffering algorithm
assuming fixed input slew
The algorithm proposed in [2] works under the same dynamic
programming framework as timing buffering [12,3]. For completeness, we include the algorithm
in [2] as follows.
In the algorithm, a set of candidate solutions are propagated
from the sinks toward the source. Each buffering solution $\gamma$ is
characterized by $(C, W, S)$, where $C$ denotes the downstream
capacitance at the current node, $W$ denotes the cost of the
solution and $S$ is the cumulative slew degradation along the wire, i.e.,
$S_w$ as defined in Eq. (2). The solution at a sink node has $C$ equal to the sink
capacitance, $W = 0$ and $S = 0$. During solution propagation, we
perform "add wire", "add buffer" and "merge branch".
"Add wire": To propagate a solution $\gamma_v$ from a node $v$ to its
parent node $u$, a solution $\gamma_u$ is generated at $u$ as follows:
$C(\gamma_u) = C(\gamma_v) + C_e$, $W(\gamma_u) = W(\gamma_v)$ and
$S(\gamma_u) = S(\gamma_v) + \ln 9 \cdot D_e$, where
$D_e = R_e (C_e/2 + C(\gamma_v))$.
Fig. 1. Slew rate computation.
"Add buffer": To insert a buffer $b_i$ at $u$ to $\gamma_u$, a new solution $\gamma_{u,buf}$
is generated as follows: $C(\gamma_{u,buf}) = C_{b_i}$, $W(\gamma_{u,buf}) = W(\gamma_u) + W_{b_i}$ and
$S(\gamma_{u,buf}) = 0$.
"Merge branch": At a branching node, two sets of solutions
are merged. Denote by $\Gamma_l$ the left-branch solution set and by $\Gamma_r$
the right-branch solution set, respectively. For each solution $\gamma_l \in \Gamma_l$
and each solution $\gamma_r \in \Gamma_r$, the merged solution $\gamma'$ is generated
as follows: $C(\gamma') = C(\gamma_l) + C(\gamma_r)$, $W(\gamma') = W(\gamma_l) + W(\gamma_r)$ and
$S(\gamma') = \max\{S(\gamma_l), S(\gamma_r)\}$.
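The three propagation operations can be sketched in Python as follows. This is an illustrative rendering of the update rules stated above, not the implementation of [2]; the `Solution` class, the function names and the numbers in the usage lines are assumptions.

```python
import math
from dataclasses import dataclass

LN9 = math.log(9.0)

@dataclass(frozen=True)
class Solution:
    C: float  # downstream capacitance seen at the current node
    W: float  # total cost of the buffers inserted so far
    S: float  # cumulative wire slew degradation since the last buffer

def sink(C_s):
    """Initial solution at a sink: C is the sink capacitance, W = 0, S = 0."""
    return Solution(C=C_s, W=0.0, S=0.0)

def add_wire(sol, R_e, C_e):
    """Propagate a solution across an edge with lumped R_e and C_e."""
    D_e = R_e * (C_e / 2.0 + sol.C)          # Elmore delay of the edge
    return Solution(sol.C + C_e, sol.W, sol.S + LN9 * D_e)

def add_buffer(sol, C_b, W_b):
    """Insert a buffer with input capacitance C_b and cost W_b."""
    return Solution(C_b, sol.W + W_b, 0.0)

def merge(left, right):
    """Combine every left-branch solution with every right-branch solution."""
    return [Solution(l.C + r.C, l.W + r.W, max(l.S, r.S))
            for l in left for r in right]

# Hypothetical values (kOhm, pF, ns).
unbuffered = add_wire(sink(0.01), R_e=1.0, C_e=0.02)
buffered = add_buffer(unbuffered, C_b=0.005, W_b=3.0)
print(merge([unbuffered], [buffered]))
```

Making `Solution` immutable keeps each propagated candidate independent of the one it was derived from, which suits a procedure that keeps many candidates per node.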
The following pruning technique is performed to accelerate the
approach in [2]. For any two solutions $\gamma_1, \gamma_2$ at the same node, $\gamma_1$
dominates $\gamma_2$ if $C(\gamma_1) \le C(\gamma_2)$, $W(\gamma_1) \le W(\gamma_2)$ and
$S(\gamma_1) \le S(\gamma_2)$. A
solution is pruned if it is dominated by other solutions, or its
cumulative slew degradation along the wire $S(\gamma)$ is greater than the
slew constraint $\alpha$, or the slew of any downstream buffer in $\gamma$ is
greater than $\alpha$.
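The dominance test and two of the three pruning conditions can be illustrated with a small Python sketch. It assumes solutions are plain `(C, W, S)` tuples and uses a simple quadratic scan rather than whatever data structure [2] uses. The third condition (slew at downstream buffers) is omitted because it needs the buffer slew model.

```python
def dominates(g1, g2):
    """g1 dominates g2 if it is no worse in capacitance, cost and slew."""
    return g1[0] <= g2[0] and g1[1] <= g2[1] and g1[2] <= g2[2]

def prune(solutions, alpha):
    """Drop solutions whose wire slew degradation exceeds alpha, then drop
    dominated ones. Exact duplicates are kept only once."""
    kept = []
    for g in solutions:
        if g[2] > alpha:
            continue                                   # violates the slew constraint
        if any(dominates(h, g) for h in kept):
            continue                                   # an earlier solution is at least as good
        kept = [h for h in kept if not dominates(g, h)]
        kept.append(g)
    return kept

# Hypothetical (C, W, S) candidates at one node, with alpha = 0.5 ns.
candidates = [(1.0, 1.0, 0.2), (2.0, 2.0, 0.3), (0.5, 3.0, 0.1), (1.0, 1.0, 0.9)]
print(prune(candidates, alpha=0.5))
```

Here the second candidate is dominated by the first and the fourth violates the constraint, while the first and third survive because each is better in at least one coordinate.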
4. A new fast algorithm without fixed input slew assumption
4.1. Problem of the approach in [2]
In the above slew buffering algorithm, the buffer output slew is
independent of its input slew since the input slew for each buffer
is assumed to be fixed at a conservative upper bound. In this way,
the feasible solutions returned by the above algorithm guarantee
satisfying slew constraints. On the other hand, it is clear that if we
eliminate the fixed-input slew assumption, there may be
improvement in buffer area. This is demonstrated by experiments
in [2] where 20% buffer savings can be obtained.
To handle slew buffering without fixed input slew assumption,
[2] uses a much more complicated dynamic programming
algorithm. Their idea is to first discretize every possible input
slew into slew bins and carry out the fixed input slew buffering
algorithm with each bin. Since the solutions associated with a
slew bin may switch to other slew bins, a large number of
solutions can be generated. This significantly hurts the
algorithm's efficiency in spite of the attempt made in [2] to
speed up the approach (by reducing part of the problem
to a maximum bipartite matching problem).
Their experiments show that the algorithm is even slower
than timing driven buffering in many cases, as is clear from
Table 1 [2].
4.2. The new algorithm
To handle the non-fixed input slew case, instead of building
buffering solutions from scratch, one might wonder whether we
can start from a suboptimal yet easy-to-compute slew buffering
solution and improve it through some heuristics. This motivates
this work. Such a solution can be easily obtained: it
is just the one returned by the slew buffering algorithm with the fixed
input slew assumption as described in Section 3.
For convenience, we call such a solution an initial solution.
Using the slew lookup table, a top-down slew rate evaluation is first
performed on the initial solution. It is possible that the input
slews to some inserted buffers are much smaller than the slew
constraint. Thus, we can try to replace some buffers with cheaper
(smaller-cost) buffers. Buffer area saving is achieved if the slew
constraint is still satisfied after the buffer replacement.
This idea can be further generalized as follows. We do not need
to restrict buffer replacement to be applied only at initially
buffered positions, i.e., those positions with buffers inserted in the
initial solution. More positions can be tried. Since our new
algorithm is guided by the fixed input slew buffering algorithm,
such positions should be those around the initially buffered
positions.
Assume that a buffer is inserted at node $v$ in the initial
solution. Then the $P$ possible buffer positions immediately upstream of $v$ and the $P$
immediately downstream of $v$ are investigated, together with $v$ itself. We first remove
the buffer at $v$ from the initial solution and tentatively insert a
cheaper buffer into each of these $2P + 1$ positions. If the resulting
solution is feasible (i.e., still satisfying the slew constraint), it is
recorded. The recorded solution with minimum cost is
selected as the buffer replacement solution for these $2P + 1$
positions. If there is no feasible buffer replacement, the initial
solution is restored. Since the upstream knowledge is
available to us, the slew lookup table can be used for evaluating
the slew rate. Note that after processing any initially buffered
position, only a single solution is maintained.
Our algorithm is now clear. After computing the initial
solution, a local improvement heuristic is performed. For this, we
compute a post-order traversal on the initially buffered positions.
For each initially buffered position, perform the above buffer
replacement process on those $2P + 1$ positions. After selecting the
best buffer replacement solution for them, we proceed to the next
initially buffered position along the post-order traversal. The
process terminates after the last buffered position is processed.
Since for positions around each initially buffered position, only
the solution with the minimum cost is selected (which certainly
has no greater cost than the initial solution), the final solution is
guaranteed to have no greater cost than the initial solution. In
other words, this local improvement heuristic can never degrade
the solution quality.
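The local improvement heuristic described above can be sketched in Python under several loudly stated assumptions, none of which come from the paper: the tree is a dictionary `tree[v] = (parent, R_e, C_e, sink_cap)` with node 0 as the source and parents numbered before children; the slew lookup table is replaced by a hypothetical linear stand-in `out_slew`; the source is driven by a library gate with zero input slew; "same branch" is interpreted as staying on unbranched internal nodes; and processing nodes in decreasing id order stands in for the post-order traversal.

```python
import math

LN9 = math.log(9.0)

def out_slew(buf, s_in, c_load):
    # Hypothetical stand-in for the 2-D slew lookup table (linear in both inputs).
    return buf["K"] + buf["R"] * c_load + buf["k_in"] * s_in

def children(tree):
    kids = {v: [] for v in tree}
    for v, node in tree.items():
        if node[0] is not None:
            kids[node[0]].append(v)
    return kids

def feasible(tree, assign, lib, alpha, driver, s_src=0.0):
    """True if the slew at every buffer input and every sink is at most alpha."""
    kids, order = children(tree), sorted(tree)
    load, seen = {}, {}
    for v in reversed(order):                      # bottom-up capacitances
        load[v] = tree[v][3] + sum(tree[c][2] + seen[c] for c in kids[v])
        seen[v] = lib[assign[v]]["cin"] if v in assign else load[v]
    # state[v] = (output slew of the gate driving v, Elmore delay from that gate)
    state = {0: (out_slew(lib[driver], s_src, load[0]), 0.0)}
    for v in order[1:]:                            # top-down slew evaluation
        p, r_e, c_e, _ = tree[v]
        s_out, d = state[p]
        d += r_e * (c_e / 2.0 + seen[v])
        s_v = math.hypot(s_out, LN9 * d)           # Eq. (1) with Eq. (2)
        if (v in assign or not kids[v]) and s_v > alpha:
            return False
        state[v] = (out_slew(lib[assign[v]], s_v, load[v]), 0.0) if v in assign else (s_out, d)
    return True

def window(tree, kids, v, P):
    """v plus up to P unbranched internal positions on each side of it."""
    up, u = [], tree[v][0]
    while len(up) < P and u not in (None, 0) and len(kids[u]) == 1:
        up.append(u)
        u = tree[u][0]
    down, d = [], v
    while len(down) < P and len(kids[d]) == 1 and len(kids[kids[d][0]]) == 1:
        d = kids[d][0]
        down.append(d)
    return up + [v] + down

def improve(tree, initial, lib, alpha, driver, P=1):
    assign, kids = dict(initial), children(tree)
    for v in sorted(initial, reverse=True):        # children before parents
        old = assign.pop(v)                        # remove the initial buffer
        best = None
        for pos in window(tree, kids, v, P):
            if pos in assign:
                continue                           # position already holds a buffer
            for name, buf in lib.items():
                if buf["cost"] >= lib[old]["cost"]:
                    continue                       # only cheaper buffers are tried
                assign[pos] = name
                if feasible(tree, assign, lib, alpha, driver) and (
                        best is None or buf["cost"] < lib[best[1]]["cost"]):
                    best = (pos, name)
                del assign[pos]
        if best is None:
            assign[v] = old                        # restore the initial buffer
        else:
            assign[best[0]] = best[1]
    return assign

# Hypothetical 4-segment path: source 0, candidates 1..3, sink 4 (kOhm, pF, ns).
tree = {0: (None, 0.0, 0.0, 0.0), 1: (0, 0.1, 0.05, 0.0), 2: (1, 0.1, 0.05, 0.0),
        3: (2, 0.1, 0.05, 0.0), 4: (3, 0.1, 0.05, 0.02)}
lib = {"big": {"cost": 10, "cin": 0.02, "R": 1.0, "K": 0.05, "k_in": 0.1},
       "small": {"cost": 4, "cin": 0.01, "R": 3.0, "K": 0.05, "k_in": 0.1}}
print(improve(tree, {2: "big"}, lib, alpha=0.5, driver="big"))
```

Because a replacement is accepted only when it is feasible and strictly cheaper than the buffer it replaces, the returned assignment never costs more than the initial one, mirroring the guarantee stated in the text.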
Let us look at a simple example illustrating the above
algorithm. Refer to Fig. 2. In this tree, the initially buffered
positions are a, b and c, and they are processed in this order.
Suppose that $P$ is set to 1. When a is processed, the two adjacent
possible buffer positions and a itself are investigated. For each
of the three positions, all buffers with cost smaller than the buffer
cost in the initial solution (at a) are tried. The minimum cost
feasible solution is selected after investigating all
three positions. b and c are then treated similarly. Note that in
our algorithm, a further restriction is imposed: we require that all
positions to be investigated be along the same branch as
the initially buffered position. For example, when $P = 2$, the
position beyond the branching point is a position within the $2P + 1$ positions
around a. It is not investigated, since inserting a buffer there would
significantly change the initial solution (i.e., the buffer at a is
removed while a buffer is inserted above the branching point). This
should be avoided as our approach is guided by the initial
solution. This strategy simplifies our task while still allowing us to
obtain high quality solutions, as indicated by our experiments.

Fig. 2. An illustration of the local improvement heuristic. Square: candidate buffer
position; triangle: buffer. The initially buffered positions are labeled a, b and c.

Regarding the time complexity of the local improvement
heuristic, at most $(2P + 1)|B|$ slew validations (i.e., checks of
whether there is a slew violation) are needed around each initially
buffered position, where $|B|$ is the number of buffer types in the
buffer library. Each slew validation can be performed in $O(n)$ time,
where $n$ denotes the number of candidate buffer positions in the
tree. This is because we need to perform slew evaluation at the
node and all of its downstream part to check whether there are slew
violations. Therefore, $(2P + 1)|B|$ slew validations need $O(nP|B|)$
time. Denoting by $m$ ($m \le n$) the number of initially buffered
positions, the runtime of the whole algorithm is bounded by
$O(mnP|B|)$. As $P$ is a constant, the runtime becomes $O(mn|B|)$.
Note that in practice, the number of inserted buffers $m$ is
much smaller than $n$. Thus, we can practically treat the above
bound as $O(n|B|)$. That is, our algorithm is in practice a linear time algorithm
in terms of the tree size. The pseudo-code for the proposed slew
buffering algorithm is shown in Fig. 3.

Fig. 3. Slew constrained minimum cost buffering algorithm with non-fixed buffer
input slew.

Table 1
Results in [2].

        Fixed input slew buffering (SB)   Non-fixed slew buffering (SB + NI)   Slew constrained timing buffering (VGL + S)
Slew    Area     #Buf    CPU (s)          Area     #Buf    CPU (s)             Area     #Buf    CPU (s)
0.3     44,980   7794    19.1             35,148   7114    992.1               46,551   9605    346.5
0.4     30,963   6069    15.0             25,018   5666    931.7               32,133   7600    351.8
0.5     22,960   5108    11.7             19,797   4326    762.8               24,235   6858    408.1
0.6     18,380   4114    9.5              16,528   3772    569.3               19,438   5504    417.8
0.7     15,531   3551    8.3              13,995   3463    473.6               16,445   4565    463.2
0.8     13,340   3216    7.5              12,129   3145    397.4               14,218   4300    487.1
0.9     11,578   2972    6.9              10,667   2854    365.2               12,243   3749    532.9
1.0     10,316   2712    6.2              9629     2488    337.3               10,897   3340    548.9

Slew refers to slew constraint in ns.

5. Experimental results
5.1. Experiment setup
To demonstrate the effectiveness and the efficiency of the new
algorithm, we compare it with the approaches presented in [2].
For convenience, all algorithms in the comparison are listed below
together with their abbreviations.
• SB: Slew buffering algorithm with fixed input slew in [2].
• SB + NI: Slew buffering with non-fixed input slew in [2].
• NEW: New slew buffering with non-fixed input slew.
• VGL + S: Slew constrained van Ginneken/Lillis' minimum cost
  timing buffering algorithm in [2].
For a fair comparison, our algorithm is tested on the same set of
1000 industrial nets as in [2], which are extracted from an
industrial ASIC chip. The sink capacitances range from 2.5 to
200 fF. The wire resistance is 0.56 Ω/μm and the wire capacitance
is 0.48 fF/μm. The buffer library consists of 48 buffers, of which 23
are non-inverting buffers and 25 are inverting buffers. Normalized
buffer areas range from 5 to 34, slew resistances range from 0.18
to 29.3 ns/pF, and input capacitances range from 2.1 to 76.0 fF.
5.2. Comparisons
For convenience, we first collect the results from [2] in Table 1.
Comparisons of NEW with SB, SB + NI and VGL + S are summarized
in Table 2. Denote by "area saving" the percentage reduction in
area relative to SB, and by "speedup" the ratio of the CPU time
(in seconds) of the compared algorithm to that of NEW. The slew constraint is given in
nanoseconds. In the
experiment, $P$ is set to 1. Thus, three positions around each initially
buffered position are investigated. As indicated in Section 4.2,
these three positions must be along the same branch, and any
position crossing the branching point will not be investigated
since inserting a buffer there would significantly impact the initial
solution. Since in our test cases there are often no more than
three candidate buffer positions along the same branch, $P = 1$ is a
reasonable choice. We make the following observations:
• Comparing NEW with SB, up to 10.8% buffer area saving is
  obtained by NEW with a small amount of additional runtime.
  The computation overhead due to the local improvement heuristic
  can be obtained by subtracting the runtime of SB from that of
  NEW. NEW and SB use the same number of buffers,
  since NEW only performs buffer replacement
  on the solutions obtained by SB and no buffers are further
  inserted or removed.
• Compared to SB + NI, NEW can be 37× faster. This is due to the
  fact that the local improvement heuristic can be efficiently
  performed and after it, only a single solution is selected, while
  in SB + NI, the slew bin structure is used and solutions
  constantly switch bins so that a huge number of solutions
  have to be propagated. As NEW runs much faster than SB + NI,
  the slight sacrifice in buffer usage becomes acceptable. NEW
  overcomes the hurdle of SB + NI for non-fixed slew buffering
  because it runs very fast. This matches our purpose for using
  slew buffering: slew buffering is for non-timing-critical nets,
  and thus runtime is the main issue.
• Comparing NEW with VGL + S, NEW can be 49× faster. Note
  that VGL + S runs slower with a larger slew constraint. This is
  because VGL + S is a timing buffering algorithm and thus solution
  domination/pruning is based on slack rather than on slew, and
  with a larger slew constraint, fewer solutions can be eliminated
  from the solution set due to slew violation [2]. It is also clear
  that NEW often saves >10% area over VGL + S.
Since both the local improvement heuristic and SB (see [2]) run in
linear time in practice, NEW virtually runs in time linear in the
number of buffer positions. This is demonstrated in Fig. 4, where a
log-log plot of the number of buffer positions versus CPU time is
shown. The best linear fit has a slope of 1.02.

Table 2
Results of new slew buffering algorithm with non-fixed input slew.

        NEW                         Ratios
Slew    Area     #Buf    CPU (s)    Area saving (%) vs. SB    Speedup vs. SB + NI    Speedup vs. VGL + S
0.3     40,128   7794    31.0       10.8                      32                     11
0.4     28,327   6069    25.2       8.5                       37                     14
0.5     21,761   5108    22.1       5.2                       35                     18
0.6     17,790   4114    17.7       3.2                       32                     24
0.7     15,017   3551    15.2       3.3                       31                     30
0.8     12,951   3216    15.1       2.9                       26                     32
0.9     11,285   2972    11.8       2.5                       31                     45
1.0     10,087   2712    11.2       2.2                       30                     49

Slew refers to slew constraint in ns.

[Figure 4: log-log plot. x-axis: log (# of buffer positions); y-axis: log (CPU Time).]
Fig. 4. "+": log2 of buffer positions vs. log2 CPU time for the NEW algorithm with the
slew constraint set to 0.5 ns. The best linear fit is shown.

6. Conclusion
Slew buffering can be used to efficiently buffer non-timing-critical nets and thus accelerate the
whole physical synthesis flow.
However, the existing slew buffering algorithm proposed in [2] for
handling non-fixed input slew runs slower than timing buffering,
which limits its use in practice. This paper proposes a new
fast slew buffering algorithm. Experimental results demonstrate
that the new algorithm can run up to 37× faster than [2] and up to
49× faster than the widely used timing buffering algorithm with
about 10% area saving.
References
[1] S. Hu, J. Hu, A new fast slew buffering algorithm without input slew
assumption, in: IEEE Dallas Circuits and Systems Workshop, 2007.
[2] S. Hu, C. Alpert, J. Hu, S. Karandikar, Z. Li, W. Shi, C.-N. Sze, Fast algorithms for
slew constrained minimum cost buffering, IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems 26 (11) (2007) 2009–2022.
[3] J. Lillis, C.-K. Cheng, T.-T.Y. Lin, Optimal wire sizing and buffer insertion for
low power and a generalized delay model, IEEE Journal of Solid State Circuits
31 (3) (1996) 437–447.
[4] V. Adler, E. Friedman, Repeater design to reduce delay and power in resistive
interconnect, IEEE Transactions on Circuits and Systems II: Analog and Digital
Signal Processing 45 (5) (1998) 607–616.
[5] Y. Ismail, E. Friedman, Effects of inductance on the propagation delay and
repeater insertion in VLSI circuits, IEEE Transactions on Very Large Scale
Integration (VLSI) Systems 8 (2) (2000) 195–206.
[6] V. Adler, E. Friedman, Uniform repeater insertion in RC trees, IEEE
Transactions on Circuits and Systems I: Fundamental Theory and Applications
47 (10) (2000) 1515–1523.
[7] Y. Ismail, E. Friedman, J. Neves, Repeater insertion in tree structured inductive
interconnect, IEEE Transactions on Circuits and Systems II: Analog and Digital
Signal Processing 48 (5) (2001) 471–481.
[8] P.J. Osler, Placement driven synthesis case studies on two sets of two chips:
hierarchical and flat, in: Proceedings of the ACM International Symposium on
Physical Design, 2004, pp. 190–197.
[9] Y. Peng, X. Liu, Low-power repeater insertion with both delay and slew rate
constraints, in: Proceedings of ACM/IEEE Design Automation Conference,
2006, pp. 302–307.
[10] C.V. Kashyap, C.J. Alpert, F. Liu, A. Devgan, Closed form expressions for
extending step delay and slew metrics to ramp inputs, in: Proceedings of the
International Symposium on Physical Design, 2003, pp. 24–31.
[11] H.B. Bakoglu, Circuits, Interconnects, and Packaging for VLSI, Addison-Wesley
Publishing Company, Reading, MA, 1990.
[12] L.P.P.P. van Ginneken, Buffer placement in distributed RC-tree networks for
minimal Elmore delay, in: Proceedings of the IEEE International Symposium
on Circuits and Systems, 1990, pp. 865–868.