Effort Delay - UC Davis ECE

Logical Effort Basics
from Bart Zeydel
Logical Effort Components
[Figure: gate template (PMOS Wp/Lp and NMOS Wn/Ln between vcc and vss, input In, output Out) next to the same gate with its template widths scaled by α]
• Input capacitance increases to α · Ctemplate
• Resistance decreases to Rtemplate / α
Fall 2004
Prof. V. G. Oklobdzija: High-Performance System Design
Logical Effort Input Capacitance
Cox = εox / tox (unit: F/m²)
Leff = L – 2xd
[Figure: polysilicon gate over the gate oxide]
Cgate = CoxWLeff = CoxWL(1 – 2xd / L)
k1 = Cox(1 – 2xd / L)
Ctemplate(inv) = k1WnLn + k1WpLp
[Figure: gate cross-section labeling xd on each side of the channel and Leff between them]
Cin(inv) = α · Ctemplate(inv)
• Input capacitance is the sum of the gate capacitances
• In LE, α denotes the gate size in multiples of the template
Logical Effort Resistance
R = (ρ / t) (L / W) ohms
R = Rsheet (L / W) ohms
[Figure: resistive slab of thickness t, width W, and length L; the channel is modeled the same way with width W and length L]
Rchannel = Rsheet (L / W) ohms
Rsheet = 1 / (μ Cox (Vgs – Vt)) ohms
k2 = Cox (Vgs – Vt)
μ = surface mobility
Rchannel = L / (k2 μ W) ohms
Rup(inv) = Rup-template(inv) / α
Rdown(inv) = Rdown-template(inv) / α
• Resistance depends on the channel width and on the process
• Larger values of α result in lower resistance.
Logical Effort Parasitics
Cja   = junction capacitance per m²
Cjp   = periphery capacitance per m
W     = width of diffusion region (m)
Ldiff = length of diffusion region (m)
Cj = Cja · W · Ldiff + Cjp · (2 · W + 2 · Ldiff)
[Figure: diffusion region of width W and length Ldiff, labeled with Cja, Cjp, and Xc]
• Larger values of α increase the parasitic capacitance, but R decreases at the same rate, so the RC delay stays constant.
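The scaling argument on the preceding slides can be checked with a few lines of Python. This is a minimal sketch, not part of the original lecture: the `Gate` class, the `scale` helper, and the template values are hypothetical. Like the slide, it treats the parasitic capacitance as proportional to α, which ignores the 2 · Ldiff periphery term that does not grow with width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    c_in: float   # input (gate) capacitance
    r: float      # channel resistance
    c_par: float  # parasitic (diffusion) capacitance on the output


def scale(template: Gate, alpha: float) -> Gate:
    """Scale all transistor widths of the template by alpha."""
    return Gate(
        c_in=alpha * template.c_in,    # capacitance grows with width
        r=template.r / alpha,          # resistance shrinks with width
        c_par=alpha * template.c_par,  # assumed proportional to width
    )


# Hypothetical template values in arbitrary units (not from the slides).
TEMPLATE = Gate(c_in=3.0, r=1.0, c_par=3.0)

for alpha in (1, 2, 4, 8):
    gate = scale(TEMPLATE, alpha)
    print(f"alpha={alpha}: Cin={gate.c_in}, R={gate.r}, R*Cp={gate.r * gate.c_par}")
```

The product R · Cp printed in the last column is the same for every α, which is the constant parasitic RC delay the slide refers to.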
RC Model for CMOS logic gate
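[Figure: RC model of a CMOS gate: input capacitance Cin at In, pull-up resistance Ru and pull-down resistance Rd driving Out, loaded by the parasitic capacitance Cp and the load capacitance Cout]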
LE Delay derivation for step input
$$C \frac{dV_{out}}{dt} = -I_d$$
$$I_{d(sat)} = \mu_n C_{ox} \frac{W}{L} \cdot \frac{(V_{GS} - V_t)^2}{2}$$
$$I_{d(lin)} = \mu_n C_{ox} \frac{W}{L} \left[ (V_{GS} - V_t) V_D - \frac{V_D^2}{2} \right]$$
$$t_f = -\int_{V_{DD}}^{V_{DD} - V_t} \frac{C}{I_{d(sat)}} \, dV - \int_{V_{DD} - V_t}^{V_{DD}/2} \frac{C}{I_{d(lin)}} \, dV$$
[Figure: gate with step input In and output Out; the Vout waveform marks the fall time tf]
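LE Model derivation (cont.)
With $V_{GS} = V_{DD}$ for a step input:
$$t_{f(sat)} = -\int_{V_{DD}}^{V_{DD} - V_t} \frac{C}{\mu_n C_{ox} \frac{W}{L} \cdot \frac{(V_{DD} - V_t)^2}{2}} \, dV_{OUT} = \frac{2 V_t C}{\mu_n C_{ox} \frac{W}{L} (V_{DD} - V_t)^2}$$
$$t_{f(lin)} = -\int_{V_{DD} - V_t}^{V_{DD}/2} \frac{C}{\mu_n C_{ox} \frac{W}{L} \left[ (V_{DD} - V_t) V_{OUT} - \frac{V_{OUT}^2}{2} \right]} \, dV_{OUT} = \frac{C}{\mu_n C_{ox} \frac{W}{L} (V_{DD} - V_t)} \ln\left( \frac{4 (V_{DD} - V_t) - V_{DD}}{V_{DD}} \right)$$
$$t_f = \frac{C}{\mu_n C_{ox} \frac{W}{L} (V_{DD} - V_t)} \left[ \frac{2 V_t}{V_{DD} - V_t} + \ln\left( \frac{4 (V_{DD} - V_t) - V_{DD}}{V_{DD}} \right) \right]$$
Substituting in R and C, with $R = L / (\mu_n C_{ox} W (V_{DD} - V_t))$, to obtain
$$t_f = R \cdot C \cdot \underbrace{\left[ \frac{2 V_t}{V_{DD} - V_t} + \ln\left( 3 - 4 \frac{V_t}{V_{DD}} \right) \right]}_{k}$$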
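As a sanity check on the closed form for k, the following sketch compares it against a direct numerical integration of dt = −C · dV / Id. The function names and the supply and threshold voltages are hypothetical, chosen only for illustration; the sketch assumes Vt < Vdd/2 so that the linear-region interval is non-empty.

```python
import math


def k_closed_form(vt: float, vdd: float) -> float:
    """k = 2*Vt/(Vdd - Vt) + ln(3 - 4*Vt/Vdd), assuming Vt < Vdd/2."""
    return 2 * vt / (vdd - vt) + math.log(3 - 4 * vt / vdd)


def k_numeric(vt: float, vdd: float, steps: int = 20_000) -> float:
    """Integrate dt = -C*dV/Id directly and normalize by R*C.

    With Id = beta * (...) and R = 1/(beta * (Vdd - Vt)), beta and C
    cancel, leaving k = (Vdd - Vt) * integral(dV / (...)).
    """
    a = vdd - vt
    # Saturation: Vout falls from Vdd to Vdd - Vt at constant current a^2/2.
    k_sat = a * vt / (a * a / 2)
    # Linear region: Vout falls from Vdd - Vt to Vdd/2 (midpoint rule).
    lo, hi = vdd / 2, a
    dv = (hi - lo) / steps
    integral = 0.0
    for i in range(steps):
        v = lo + (i + 0.5) * dv
        integral += dv / (a * v - v * v / 2)
    return k_sat + a * integral


# Hypothetical supply and threshold voltages, chosen only for illustration.
VDD, VT = 1.2, 0.3
print(k_closed_form(VT, VDD), k_numeric(VT, VDD))
```

The two printed values should agree to several decimal places, which confirms that the logarithmic term follows from the linear-region integral.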
Logical Effort Gate Delay Model
[Figure: gate template (widths Wp and Wn, input capacitance Ct, resistance Rt) next to the α-scaled gate]
For the scaled gate: Cin = α · Ct and R = Rt / α
$$t_f = k \cdot R \, (C_{out} + C_p)$$
$$= k \cdot \frac{R_t}{\alpha} \cdot \alpha C_t \cdot \frac{C_{out}}{\alpha C_t} + k \cdot \frac{R_t}{\alpha} \cdot \alpha C_p$$
$$= (k \cdot R_t \cdot C_t) \frac{C_{out}}{C_{in}} + k \cdot R_t \cdot C_p$$
From the second line on, Cp denotes the parasitic capacitance of the template; the scaled gate has α · Cp.
Logical Effort Gate Delay Model
Simplify Analysis by Normalizing to Inverter Template
$$t_f = k \cdot R_{inv} C_{inv} \left[ \left( \frac{R_t C_t}{R_{inv} C_{inv}} \right) \left( \frac{C_{out}}{C_{in}} \right) + \left( \frac{R_t C_p}{R_{inv} C_{inv}} \right) \right]$$
$$\tau = k \cdot R_{inv} \cdot C_{inv}$$
$$g = \frac{R_t C_t}{R_{inv} C_{inv}}, \qquad h = \frac{C_{out}}{C_{in}}, \qquad p = \frac{R_t C_p}{R_{inv} C_{inv}}$$
g and p don't change with α.
$$t_f = \tau \cdot (g \cdot h + p)$$
Same derivation can be performed for tr
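The normalization can be verified numerically: the RC form of the delay and the τ · (g·h + p) form should give the same number for any scale factor α. The sketch below uses hypothetical template values in arbitrary units; the function names are illustrative and not from the lecture.

```python
def delay_rc(k, r_t, c_p, alpha, c_out):
    """t_f = k * R * (Cout + Cp) for the template scaled by alpha."""
    r = r_t / alpha
    c_par = alpha * c_p
    return k * r * (c_out + c_par)


def delay_le(k, r_t, c_t, c_p, alpha, c_out, r_inv, c_inv):
    """The same delay written as tau * (g*h + p)."""
    tau = k * r_inv * c_inv
    g = (r_t * c_t) / (r_inv * c_inv)
    p = (r_t * c_p) / (r_inv * c_inv)
    h = c_out / (alpha * c_t)  # Cin of the scaled gate is alpha * Ct
    return tau * (g * h + p)


# Hypothetical values (arbitrary units): an inverter template and a
# NAND2-like gate template driving a load of 60.
K, R_INV, C_INV = 1.0, 1.0, 3.0
R_T, C_T, C_P = 1.0, 4.0, 6.0

for alpha in (1.0, 2.5, 5.0):
    print(alpha,
          delay_rc(K, R_T, C_P, alpha, 60.0),
          delay_le(K, R_T, C_T, C_P, alpha, 60.0, R_INV, C_INV))
```

Only h depends on α in the second function, which is why g and p can be tabulated once per gate type.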
Estimating g and p of Gates
[Figure: transistor-level schematics of the inverter, NAND2, and NOR2 with the widths listed below; inputs a and b, output q]
Gate       PMOS widths   NMOS widths   g                      p
Inverter   2             1             (2+1)/(2+1) = 1        Cp-inv/Cinv = pinv
NAND2      2, 2          2, 2          (2+2)/(2+1) = 4/3      [(2+2+2)/(2+1)] pinv = 2pinv
NOR2       4, 4          1, 1          (4+1)/(2+1) = 5/3      [(4+1+1)/(2+1)] pinv = 2pinv
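The estimates on this slide are simple width ratios, which the following sketch reproduces with exact fractions. The helper names are hypothetical; the widths are the ones shown on the slide, and the output-node lists count the transistors whose diffusion touches the output.

```python
from fractions import Fraction

# Inverter template widths from the slide: PMOS = 2, NMOS = 1.
INV_WIDTH = 2 + 1


def logical_effort(pmos_width, nmos_width):
    """Width seen by one input, relative to the inverter's input width."""
    return Fraction(pmos_width + nmos_width, INV_WIDTH)


def parasitic_delay(widths_on_output):
    """Total transistor width on the output node, in multiples of p_inv."""
    return Fraction(sum(widths_on_output), INV_WIDTH)


gates = {
    # name: (PMOS width per input, NMOS width per input, widths on output node)
    "Inverter": (2, 1, [2, 1]),
    "NAND2": (2, 2, [2, 2, 2]),
    "NOR2": (4, 1, [4, 1, 1]),
}
for name, (wp, wn, out) in gates.items():
    print(name, "g =", logical_effort(wp, wn), "p =", parasitic_delay(out), "* p_inv")
```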
130nm Delay of Gates vs. h
[Figure: delay (ps, 0 to 60) versus electrical effort h (0 to 6); slope τ = 7.3 ps]
Normalized Delay of Gates vs. h
[Figure: normalized delay (in units of τ, with τ = 7.3 ps) versus electrical effort h (0 to 6), split into effort delay and parasitic delay]
Normalized Delay Estimate of Gates vs. h
Note: to simplify analysis, assume Cp-inv = Cinv
[Figure: normalized delay d versus electrical effort h (0 to 6), split into effort delay and parasitic delay]
LE Path Delay
[Figure: three-gate path, Gate 1 → Gate 2 → Gate 3, driving Cout]
                     Gate 1   Gate 2   Gate 3
Input Capacitance    Cin      C2       C3
Logical Effort       g1       g2       g3
Parasitic Delay      p1       p2       p3
Stage Effort         f1       f2       f3
$$D = [(g_1 h_1 + p_1) + (g_2 h_2 + p_2) + (g_3 h_3 + p_3)] \, \tau$$
• Any n-stage path can be described using Logical Effort
LE Path Delay Optimization
$$D = [(g_1 h_1 + p_1) + (g_2 h_2 + p_2) + (g_3 h_3 + p_3)] \, \tau$$
$$h_1 = \frac{C_2}{C_{in}}, \qquad h_2 = \frac{C_3}{C_2}, \qquad h_3 = \frac{C_{out}}{C_3}$$
$$D = \left[ \left( g_1 \frac{C_2}{C_{in}} + p_1 \right) + \left( g_2 \frac{C_3}{C_2} + p_2 \right) + \left( g_3 \frac{C_{out}}{C_3} + p_3 \right) \right] \tau$$
LE Path Delay Optimization (cont.)
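By definition, Cin and Cout are fixed.
$$D = \left[ \left( g_1 \frac{C_2}{C_{in}} + p_1 \right) + \left( g_2 \frac{C_3}{C_2} + p_2 \right) + \left( g_3 \frac{C_{out}}{C_3} + p_3 \right) \right] \tau$$
Solve for C2 and C3:
$$\frac{\partial D}{\partial C_2} = 0: \quad \frac{g_1}{C_{in}} - g_2 \frac{C_3}{C_2^2} = 0 \;\Rightarrow\; g_1 \frac{C_2}{C_{in}} = g_2 \frac{C_3}{C_2} \;\Rightarrow\; f_1 = f_2$$
$$\frac{\partial D}{\partial C_3} = 0: \quad \frac{g_2}{C_2} - g_3 \frac{C_{out}}{C_3^2} = 0 \;\Rightarrow\; g_2 \frac{C_3}{C_2} = g_3 \frac{C_{out}}{C_3} \;\Rightarrow\; f_2 = f_3$$
Minimum delay occurs when stage efforts are equal
$$f_1 = f_2 = f_3 = f_{opt}$$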
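The two stationarity conditions can also be solved numerically to confirm that they lead to equal stage efforts. The sketch below alternates between the two conditions until C2 and C3 settle. This fixed-point iteration is an illustrative technique chosen here, not the method used in the lecture, and the path and capacitance values are made up.

```python
def optimize_three_stages(g, c_in, c_out, iterations=60):
    """Solve dD/dC2 = 0 and dD/dC3 = 0 by alternating updates."""
    g1, g2, g3 = g
    c2 = c3 = c_in  # arbitrary positive starting guess
    for _ in range(iterations):
        c2 = (g2 * c3 * c_in / g1) ** 0.5   # from g1/Cin = g2*C3/C2^2
        c3 = (g3 * c_out * c2 / g2) ** 0.5  # from g2/C2 = g3*Cout/C3^2
    efforts = (g1 * c2 / c_in, g2 * c3 / c2, g3 * c_out / c3)
    return c2, c3, efforts


# Hypothetical path: inverter, NAND2, NOR2 (g = 1, 4/3, 5/3),
# with made-up capacitances Cin = 1 and Cout = 45.
c2, c3, efforts = optimize_three_stages((1, 4 / 3, 5 / 3), c_in=1.0, c_out=45.0)
print(c2, c3, efforts)
```

For these inputs the product of the logical efforts times Cout/Cin is 100, so all three printed efforts should be close to the cube root of 100.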
Simplified Path Optimization
We want the effort of each stage to be equal.
$$F = f_1 \cdot f_2 \cdot f_3 \cdots f_{n-1} \cdot f_n \quad \text{(Path Effort)}$$
$$f_{opt} = F^{1/n} \quad \text{(Stage Effort)}$$
To quickly solve for F:
$$F = g_1 h_1 \cdot g_2 h_2 \cdot g_3 h_3 \cdots g_{n-1} h_{n-1} \cdot g_n h_n$$
$$H = h_1 \cdot h_2 \cdot h_3 \cdots h_{n-1} \cdot h_n = \frac{C_{out}}{C_{in}}$$
$$G = g_1 \cdot g_2 \cdot g_3 \cdots g_{n-1} \cdot g_n \quad \text{(Logical Effort of the path)}$$
$$F = G \cdot H$$
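In code, the shortcut amounts to two products and one root. The following is a minimal sketch with a hypothetical function name and made-up capacitances; `math.prod` requires Python 3.8 or later.

```python
import math


def path_effort(g, c_in, c_out):
    """Return (G, H, F, f_opt) for a path without branching."""
    G = math.prod(g)          # logical effort of the path
    H = c_out / c_in          # electrical effort of the path
    F = G * H                 # path effort
    f_opt = F ** (1 / len(g))  # equal effort per stage
    return G, H, F, f_opt


# Hypothetical four-stage path: inverter, NAND2, NOR2, inverter,
# with made-up capacitances Cin = 2 and Cout = 90.
G, H, F, f_opt = path_effort([1, 4 / 3, 5 / 3, 1], c_in=2.0, c_out=90.0)
print(G, H, F, f_opt)
```

Note that F never requires knowing the intermediate capacitances, because they cancel in the product of the h terms.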
Delay Optimization Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        1        1        1
f = gh     1        1        1        81
Ci         1        1        1        1
Total Delay = (84 + 4pinv) τ
Delay Optimization Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        1        1        1
f = gh     1        1        40.5     2
Ci         1        1        1        40.5
Total Delay = (44.5 + 4pinv) τ
Delay Optimization Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        1        1        1
f = gh     1        10.1     4        2
Ci         1        1        10.1     40.5
Total Delay = (17.1 + 4pinv) τ
Delay Optimization Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        1        1        1
f = gh     10.1     1        4        2
Ci         1        10.1     10.1     40.5
Total Delay = (17.1 + 4pinv) τ
Optimal Sizing for Delay
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        1        1        1
f = gh     3        3        3        3
Ci         1        3        9        27
Optimal Delay = (12 + 4pinv) τ
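The effort delays quoted for the trial sizings on the preceding slides and for the optimal sizing can be reproduced by summing f = g · C(next) / C(this) along the chain. The sketch below is illustrative; the function name is hypothetical, and the capacitance lists are the Ci values from the example slides.

```python
def effort_delay(g, c, c_out):
    """Sum of stage efforts for a chain; c[i] is the input capacitance of gate i."""
    loads = list(c[1:]) + [c_out]  # each gate drives the next gate's input
    return sum(gi * load / ci for gi, ci, load in zip(g, c, loads))


g = [1, 1, 1, 1]  # four inverters
trial_sizings = [
    [1, 1, 1, 1],
    [1, 1, 1, 40.5],
    [1, 1, 10.1, 40.5],
    [1, 10.1, 10.1, 40.5],
    [1, 3, 9, 27],  # equal stage efforts
]
for c in trial_sizings:
    print(c, round(effort_delay(g, c, 81), 1))
```

Rounded to one decimal, the sums are 84.0, 44.5, 17.1, 17.1, and 12.0, so the equal-effort sizing gives the smallest effort delay of the five.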
Delay Optimization and Sizing Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load Cout = 81]
$$D = \left( \sum_{i=1}^{n} f_i + \sum_{i=1}^{n} p_i \right) \tau$$
$$D = \left[ g_{inv} \left( \frac{C_2}{C_{in}} + \frac{C_3}{C_2} + \frac{C_4}{C_3} + \frac{C_{out}}{C_4} \right) + \sum_{i=1}^{4} p_i \right] \tau$$
Use Logical Effort to optimize sizes for Delay
$$F = f_1 \cdot f_2 \cdot f_3 \cdot f_4 = \prod_{i=1}^{4} g_i h_i = 81$$
$$f_{opt} = \sqrt[4]{81} = 3$$
Delay Optimization and Sizing Example
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
Size from output to input using fopt
$$f = \frac{g \cdot C_{out}}{C_{in}} \;\Rightarrow\; C_{in} = \frac{g \cdot C_{out}}{f}$$
C4 = (81 · ginv) / 3 = 27
C3 = (27 · ginv) / 3 = 9
C2 = (9 · ginv) / 3 = 3
C1 = (3 · ginv) / 3 = 1
Delay Estimate
$$D = \left( \sum_{i=1}^{4} 3 + \sum_{i=1}^{4} p_{inv} \right) \tau = (12 + 4 p_{inv}) \, \tau$$
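The output-to-input sizing procedure can be sketched as a short loop. The function name is hypothetical; the inputs are the four-inverter example with Cin = 1 and a load of 81.

```python
def size_path(g, c_in, c_out):
    """Return (f_opt, input capacitances), sizing from the output backwards."""
    n = len(g)
    G = 1.0
    for gi in g:
        G *= gi
    f_opt = (G * c_out / c_in) ** (1.0 / n)
    caps = [0.0] * n
    load = c_out
    for i in reversed(range(n)):
        caps[i] = g[i] * load / f_opt  # Cin = g * Cout / f
        load = caps[i]                 # this gate is the load of the previous one
    return f_opt, caps


f_opt, caps = size_path([1, 1, 1, 1], c_in=1.0, c_out=81.0)
print(f_opt, caps)
```

Up to floating-point rounding, this yields a stage effort of 3 and input capacitances 1, 3, 9, 27. The first entry coming out equal to Cin is a useful consistency check on the arithmetic.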
Example 2: Path Optimization
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2          Gate 3          Gate 4
g          1        5/3             4/3             1
f = gh     1        5/3             4/3             81
Ci         1        5/3 · 3/5 = 1   4/3 · 3/4 = 1   1
Size       1        3/5             3/4             1
Total Delay = (85 + 6pinv) τ
Example 2: Path Optimization
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
           Gate 1   Gate 2   Gate 3   Gate 4
g          1        5/3      4/3      1
f = gh     3.66     3.66     3.66     3.66
Ci         1        3.66     8.06     22.1
Size       1        2.2      6.04     22.1
Optimal Delay = (14.64 + 6pinv) τ
Example 2: LE Solution
[Figure: four-gate chain with Cin = 1, internal nodes C2, C3, C4, and load 81]
$$F = \prod_{i=1}^{4} g_i h_i = 180$$
$$f_{opt} = \sqrt[4]{180} = 3.66$$
Size from output to input using fopt
C4 = (81 · ginv) / 3.66 = 22.1
C3 = (22.1 · gnand2) / 3.66 = 8.06
C2 = (8.06 · gnor2) / 3.66 = 3.66
C1 = (3.66 · ginv) / 3.66 = 1
Size = Ci / gi
S1 = 1, S2 = 2.2, S3 = 6.04, S4 = 22.1
Delay Estimate
$$D = \left( \sum_{i=1}^{4} 3.66 + \sum_{i=1}^{4} p_i \right) \tau = (14.64 + 6 p_{inv}) \, \tau$$
Logical Effort for Multi-path
[Figure: two paths sharing the input Cin. Path a: Gate 0 → Gate 1 → Gate 2 driving Cout1, with stage efforts f0, f1, f2. Path b: Gate 3 → Gate 4 → Gate 5 driving Cout2, with stage efforts f3, f4, f5]
From LE, f0 = f1 = f2 and f3 = f4 = f5
Minimum delay occurs when Da = Db or Fa = Fb (ignoring parasitics):
f0 = f1 = f2 = f3 = f4 = f5
Logical Effort for Multi-path (cont.)
$$D_a = [(g_0 h_0 + p_0) + (g_1 h_1 + p_1) + (g_2 h_2 + p_2)] \, \tau$$
$$D_b = [(g_3 h_3 + p_3) + (g_4 h_4 + p_4) + (g_5 h_5 + p_5)] \, \tau$$
$$F_a = g_0 g_1 g_2 \frac{C_{out1}}{C_0}, \qquad F_b = g_3 g_4 g_5 \frac{C_{out2}}{C_3}$$
$$C_0 = \frac{g_0 g_1 g_2 \cdot C_{out1}}{F_a}, \qquad C_3 = \frac{g_3 g_4 g_5 \cdot C_{out2}}{F_b}$$
Branching: ratio of total capacitance to on-path capacitance
$$b_a = \frac{C_0 + C_3}{C_0}, \qquad b_b = \frac{C_3 + C_0}{C_3}$$
Logical Effort for Multi-path (cont)
Substituting C0 and C3
$$b_a = \frac{\dfrac{g_0 g_1 g_2 \cdot C_{out1}}{F_a} + \dfrac{g_3 g_4 g_5 \cdot C_{out2}}{F_b}}{\dfrac{g_0 g_1 g_2 \cdot C_{out1}}{F_a}}$$
Since minimum delay occurs when Da = Db or Fa = Fb
$$b_a = \frac{g_0 g_1 g_2 \cdot C_{out1} + g_3 g_4 g_5 \cdot C_{out2}}{g_0 g_1 g_2 \cdot C_{out1}}$$
Similarly
$$b_b = \frac{g_3 g_4 g_5 \cdot C_{out2} + g_0 g_1 g_2 \cdot C_{out1}}{g_3 g_4 g_5 \cdot C_{out2}}$$
• Branching = 2 when Ga = Gb and Cout1 = Cout2
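The branching expressions reduce to a weighted share of G · Cout per path, which is easy to evaluate. This is a minimal sketch with a hypothetical function name; the second call uses made-up gate types and loads. `math.prod` requires Python 3.8 or later.

```python
import math


def branching_efforts(g_a, c_out1, g_b, c_out2):
    """Return (b_a, b_b) for two paths that share one input, assuming Fa = Fb."""
    weight_a = math.prod(g_a) * c_out1  # g0*g1*g2*Cout1
    weight_b = math.prod(g_b) * c_out2  # g3*g4*g5*Cout2
    total = weight_a + weight_b
    return total / weight_a, total / weight_b


# Identical paths and loads: branching is 2 on both paths.
print(branching_efforts([1, 1, 1], 81.0, [1, 1, 1], 81.0))

# Hypothetical unequal paths: three inverters driving 30 versus
# inverter, NAND2, NOR2 driving 27.
b_a, b_b = branching_efforts([1, 1, 1], 30.0, [1, 4 / 3, 5 / 3], 27.0)
print(b_a, b_b)
```

Because both values share the same numerator, 1/b_a + 1/b_b is always 1: the path with the larger G · Cout takes the larger share of the input capacitance and therefore sees the smaller branching effort.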
Example 3: Uniform Branching
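[Figure: Gate 1 (Cin = 1) drives Gate 2 (input C2), whose output branches to Gate 3 (input C3) and Gate 5 (input C5); Gate 3 drives Gate 4 (input C4) and Gate 5 drives Gate 6 (input C6); Gates 4 and 6 each drive a load of 81]
Cout2 = C5 + C3
           Gate 1   Gate 2   Gate 3,5   Gate 4,6
g          1        5/3      4/3        1
f = gh     1        10/3     4/3        81
Ci         1        1        1          1
b          1        2        1          1
Total Delay = (86.67 + 6pinv) τ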
Example 3: Uniform Branching
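[Figure: Gate 1 (Cin = 1) drives Gate 2 (input C2), whose output branches to Gate 3 (input C3) and Gate 5 (input C5); Gate 3 drives Gate 4 (input C4) and Gate 5 drives Gate 6 (input C6); Gates 4 and 6 each drive a load of 81]
Cout2 = C5 + C3
           Gate 1   Gate 2   Gate 3,5   Gate 4,6
g          1        5/3      4/3        1
f = gh     4.36     4.36     4.36       4.36
Ci         1        4.36     5.69       18.6
b          1        2        1          1
Optimal Delay = (17.44 + 6pinv) τ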
Example 3: LE Solution
[Figure: Gate 1 (Cin = 1) drives Gate 2 (input C2), whose output branches to Gate 3 (input C3) and Gate 5 (input C5); Gate 3 drives Gate 4 (input C4) and Gate 5 drives Gate 6 (input C6); Gates 4 and 6 each drive a load of 81]
Cout2 = C5 + C3
$$F = G B H = 360$$
$$f_{opt} = \sqrt[4]{360} = 4.36$$
Size from output to input using fopt
C6 = C4 = (81 · ginv) / 4.36 = 18.6
C5 = C3 = (18.6 · gnand2) / 4.36 = 5.69
C2 = (2 · 5.69 · gnor2) / 4.36 = 4.36
C1 = (4.36 · ginv) / 4.36 = 1
Delay Estimate
$$D = \left( \sum_{i=1}^{4} 4.36 + \sum_{i=1}^{4} p_i \right) \tau = (17.44 + 6 p_{inv}) \, \tau$$
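The same backward sizing loop extends to paths with branching by multiplying each stage's load by its branching factor. The sketch below uses a hypothetical function name and the g and b values of Example 3. `math.prod` requires Python 3.8 or later.

```python
import math


def size_branching_path(g, b, c_in, c_out):
    """Return (F, f_opt, input capacitances) for a path with per-stage branching b."""
    n = len(g)
    F = math.prod(g) * math.prod(b) * c_out / c_in  # F = G * B * H
    f_opt = F ** (1.0 / n)
    caps = [0.0] * n
    load = c_out
    for i in reversed(range(n)):
        # Stage i drives b[i] copies of its on-path load: Cin = g * b * Cout / f
        caps[i] = g[i] * b[i] * load / f_opt
        load = caps[i]
    return F, f_opt, caps


# Example 3: inverter, NOR2 (branching 2), NAND2, inverter, load 81.
F, f_opt, caps = size_branching_path(
    g=[1, 5 / 3, 4 / 3, 1], b=[1, 2, 1, 1], c_in=1.0, c_out=81.0
)
print(F, f_opt, caps)
```

The results should match the slide values to the precision shown there: F = 360, fopt ≈ 4.36, and input capacitances of about 1, 4.36, 5.69, and 18.6.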
Complex Multi-path Optimization
If each path has internal branching, ba and bb are as follows
$$b_a = \frac{g_0 b_0 g_1 b_1 g_2 b_2 \cdot C_{out1} + g_3 b_3 g_4 b_4 g_5 b_5 \cdot C_{out2}}{g_0 b_0 g_1 b_1 g_2 b_2 \cdot C_{out1}}$$
$$b_b = \frac{g_3 b_3 g_4 b_4 g_5 b_5 \cdot C_{out2} + g_0 b_0 g_1 b_1 g_2 b_2 \cdot C_{out1}}{g_3 b_3 g_4 b_4 g_5 b_5 \cdot C_{out2}}$$
• Note: This solution and the previous solution differ from the one described in the LE book (which is incorrect)