ICCAD2000 3A.3S

Provably Good Global Buffering Using an
Available Buffer Block Plan
F.F. Dragan (Kent State)
A.B. Kahng (UCSD)
I. Mandoiu (Georgia Tech/UCLA)
S. Muddu (Silicon Graphics)
A. Zelikovsky (Georgia State)
Global Buffering via Buffer Blocks
• VDSM ⇒ buffer / inverter insertion for all global nets
– 50nm technology ⇒ 10^6 buffers
• Buffer Block (BB) methodology
– isolate buffer insertion from block implementations
– improve routing area resources (RAR) utilization
RAR(2k-buffer block) =   RAR(k-buffer block)
For high-end designs   1.6
• Buffer block planning [Cong+99] [TangW00]
– given block placement + nets
– find shape and location of BBs
• Global buffering via BBs
– given nets + BB locations and capacities
– find buffered routing for each net
Global Buffering Problem
Given:
• Pin & BB locations, BB capacities
• list of 2-pin nets, each net has
– upper-bound on #buffers (new)
– parity requirement on #buffers (new)
– [non-negative weight (criticality coefficient)] (new)
• L/U bounds on wirelength b/w consecutive buffers/pins
Find: buffered routing of a maximum [weighted] number of nets subject to the given constraints
Previous work: 1 buffer per connection, no weights
Outline of Results
• Provably good algorithm for the Global Buffering Problem
– integer node-capacitated multi-commodity flow (MCF) formulation
– approximation algorithm for solving fractional relaxation
– provably good randomized rounding based on [RaghavanT87]
– allows tradeoff between run-time and solution quality
• Fast heuristic based on ideas from the approximation algorithm
– superior to simpler greedy approaches
– almost matches the provably good algorithm for loosely constrained instances
Integer Program Formulation
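Graph $G = (V, E)$:
$V = \text{pins} \cup \text{Buffer Blocks}$
$E = \{ (u,v) : \mathrm{dist}(u,v) \in [L,U] \}$
$P$ = set of legal routing paths

$$\max \sum_{p \in P} f(p)$$
$$\text{s.t.} \quad \sum_{p \in P,\ u \in p} f(p) \le \mathrm{cap}(u) \quad \forall u \in V$$
$$f(p) \in \{0,1\} \quad \forall p \in P$$

For any pin $u$, $\mathrm{cap}(u)$ is set to 1 ⇒ each net routed at most once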
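A minimal sketch of how the graph and the capacity constraint above could be set up. The function names, the input layout, and the use of Manhattan distance for dist(u,v) are assumptions made for illustration; the slides do not fix any of them.

```python
from collections import Counter
from itertools import combinations


def manhattan(a, b):
    # Assumed distance metric; the slides only write dist(u, v).
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_graph(pins, bbs, L, U, dist=manhattan):
    """pins: {name: (x, y)}; bbs: {name: ((x, y), capacity)} (hypothetical layout).

    Returns (adj, cap): an undirected adjacency dict containing an edge
    (u, v) whenever L <= dist(u, v) <= U, and the node capacities, with
    every pin given capacity 1.
    """
    loc = dict(pins)
    loc.update({name: xy for name, (xy, _) in bbs.items()})
    cap = {name: 1 for name in pins}
    cap.update({name: c for name, (_, c) in bbs.items()})
    adj = {v: set() for v in loc}
    for u, v in combinations(loc, 2):
        if L <= dist(loc[u], loc[v]) <= U:
            adj[u].add(v)
            adj[v].add(u)
    return adj, cap


def is_feasible(paths, cap):
    """Check the capacity constraint for a set of routed paths (f(p) = 1 each)."""
    use = Counter(v for p in paths for v in p)
    return all(n <= cap[v] for v, n in use.items())
```

With two pins 10 units apart, one buffer block halfway between them, and [L, U] = [2, 6], the pins are connected only through the buffer block, so the single legal route uses one buffer.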
High-Level Approach
• Solve fractional relaxation + rounding
– first introduced for global routing [RaghavanT87]
– fractional relaxation = node-capacitated multi-commodity flow (MCF)
• can be solved exactly using Linear Programming (LP) techniques
• exact LP algorithms are not practical for large instances
• Key idea: approximate solution to the relaxation
– we generalize edge-capacitated MCF approximation of [GargK98, F99]
– [GargK98] successfully applied to global routing by [Albrecht00]
Approximating the Fractional MCF
ε-MCF algorithm
w(v) = δ, f = 0
For i = 1 to N do
  For k = 1, …, #nets do
    Find a shortest path p ∈ P for net k
    While w(p) < min{ 1, δ(1+2ε)^i } do
      f(p) = f(p) + 1
      For every v ∈ p do
        w(v) ← ( 1 + ε/c(v) ) * w(v)
      End For
    End While
  End For
End For
Output f/N
Run time for -approximation = O( 
2
(# nets  # BBs ) 2 )
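The loop structure of the ε-MCF algorithm can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a path's weight w(p) is taken to be the sum of its node weights, the shortest path is recomputed after every weight update, and the per-net limits on buffer count and parity that define the legal path set P are ignored.

```python
import heapq


def shortest_path(adj, w, s, t):
    """Dijkstra with node weights: path cost = sum of w(v) over its nodes."""
    dist = {s: w[s]}
    prev = {}
    heap = [(w[s], s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + w[v]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return None, float("inf")
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1], dist[t]


def eps_mcf(adj, cap, nets, eps, delta, N):
    """Return fractional flows {(net index, path): flow / N}."""
    w = {v: delta for v in adj}  # w(v) = delta
    flow = {}                    # f = 0
    for i in range(1, N + 1):
        bound = min(1.0, delta * (1 + 2 * eps) ** i)
        for k, (s, t) in enumerate(nets):
            path, length = shortest_path(adj, w, s, t)
            while path is not None and length < bound:
                key = (k, tuple(path))
                flow[key] = flow.get(key, 0) + 1
                for v in path:
                    w[v] *= 1 + eps / cap[v]
                # Assumption: re-run the shortest-path search after the update.
                path, length = shortest_path(adj, w, s, t)
    return {key: f / N for key, f in flow.items()}
```

Every pass through the inner loop strictly increases the weight of each node on the chosen path, so the loop ends once the net's shortest path reaches the phase bound.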
Rounding to an Integer Solution
• Random walk algorithm [RaghavanT87]
– probability of routing a net proportional to net’s flow
– probability of choosing an arc proportional to fractional
flow along arc
– run time = O( #inserted buffers )
– To avoid BB overuse, scale down fractional flow by 1-ε before rounding
• Modifications
– approximate MCF underestimates optimum ⇒ few violations + unused BB capacity for large ε
– resolve capacity violations by greedily deleting paths
– greedily route remaining nets using unused BB capacity
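The rounding and deletion steps above can be sketched as follows. The names and the data layout (one dict of arc flows per net) are assumptions for illustration. The sketch also assumes each net's fractional flow is acyclic and conserved at intermediate nodes, so the walk always reaches the sink. The slides do not specify the deletion order; here paths are examined in list order.

```python
import random
from collections import Counter


def round_net(arc_flow, s, t, eps, rng):
    """Random walk for one net.

    arc_flow: {(u, v): fractional flow} for this net (hypothetical layout).
    Routes the net with probability (1 - eps) * (total flow leaving s),
    then picks each next arc with probability proportional to its flow.
    Returns the chosen path as a list of nodes, or None if unrouted.
    """
    out = {}
    for (u, v), f in arc_flow.items():
        if f > 0:
            out.setdefault(u, []).append((v, f))
    total = sum(f for _, f in out.get(s, []))
    if rng.random() >= (1 - eps) * total:
        return None
    path = [s]
    while path[-1] != t:
        succ, weights = zip(*out[path[-1]])
        path.append(rng.choices(succ, weights=weights)[0])
    return path


def delete_violations(paths, cap):
    """Drop paths, in list order, while any node they use is over capacity."""
    use = Counter(v for p in paths for v in p)
    kept = []
    for p in paths:
        if any(use[v] > cap[v] for v in p):
            for v in p:
                use[v] -= 1
        else:
            kept.append(p)
    return kept
```

A kept path only ever sees node usage fall after it has been checked, so the surviving set respects every capacity.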
Implemented Heuristics
• ε-MCF w/ greedy enhancement
– solve fractional MCF with ε approximation
– round fractional solution via random walks
– apply greedy deletion/addition to get feasible solution
• Greedy
– sequentially route nets along shortest available paths
• 1-shot integer MCF
– assign weight w=1 to each BB
– repeat until total overused capacity does not decrease
• for each net find shortest path
• for each BB r increase weight by factor (1 + ε usage(r) / cap(r))
– apply greedy deletion/addition to get feasible solution
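For comparison, the Greedy baseline listed above could look like the sketch below. The names are hypothetical, "shortest" is taken to mean fewest hops (fewest buffers), pins other than a net's own endpoints are never used as intermediate nodes, and the per-net buffer-count and parity limits are ignored; all of these are assumptions, not details given in the slides.

```python
from collections import deque


def bfs_path(adj, residual, pins, s, t):
    """Fewest-hop s-t path through nodes that still have residual capacity."""
    if residual[s] <= 0 or residual[t] <= 0:
        return None
    prev = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            usable = v == t or v not in pins  # only BBs as intermediate nodes
            if v not in prev and residual[v] > 0 and usable:
                prev[v] = u
                queue.append(v)
    return None


def greedy_route(adj, cap, pins, nets):
    """Route nets one by one; return {net index: path} for the routed nets."""
    residual = dict(cap)
    routed = {}
    for k, (s, t) in enumerate(nets):
        path = bfs_path(adj, residual, pins, s, t)
        if path is not None:
            for v in path:
                residual[v] -= 1  # consume one unit of capacity per node
            routed[k] = path
    return routed
```

Because earlier nets consume capacity without regard to later ones, the result depends on the net order, which is one reason such a baseline can fall behind the MCF-based approach on tightly constrained instances.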
Experimental Setup
• Test instances extracted from next-generation SGI microprocessor
• ~4,000 nets
• U=4,000 μm, L=500-2,000 μm
• 50 buffer blocks
• BB capacity
– 400 (fully routable instances)
– 50 (hard instances, 50-60% routable)
Fully Routable Instance (4212 nets)
[Figure: % routed nets (axis ticks 96 to 100) for Greedy, 1-Shot, and .16-, .08-, .04-, .02-, and .01-MCF, each MCF variant shown w/o and w/ greedy addition]
Running Time vs. Solution Quality
[Figure: % routed nets (axis ticks 97 to 100) vs. CPU seconds (axis ticks 0 to 120); legend: Greedy, 1-Shot, .16-MCF]
~57% Routable Instance (4212 nets)
[Figure: % routed nets (axis ticks 52 to 56) for Greedy, 1-Shot, and .16-, .08-, .04-, .02-, and .01-MCF, each MCF variant shown w/o and w/ greedy addition]
Combining with compaction
• Sum-capacity constraints: cap(BB1) + cap(BB2) ≤ const.
Conclusions and Ongoing Work
• Provably good algorithm based on node-capacitated MCF approximation
• Extensions:
– combine global buffering with BB planning
• combine with compaction
– enforce channel capacity constraints
– multi-terminal nets (ASPDAC-01)