An Efficient Wake-Up Strategy Considering Spurious Glitches Phenomenon for Power-Gating Designs
Da-Cheng Juan, Yu-Ting Chen, Ming-Chao Lee, and Shih-Chieh Chang, Member, IEEE
Abstract— During the power mode transition, simultaneously
turning on all sleep transistors produces a large surge
current, which may cause a large IR drop in the power
networks. The IR drop in turn causes errors in the retention
sequential elements of the sleep modules or errors in the
non-sleep modules. One efficient way to control the surge
current is to schedule the turn-on sequence of the sleep
transistors. In this paper, we introduce several important
properties of the surge current during the power mode
transition for the Distributed Sleep Transistor Network
(DSTN) design, which is a popular power-gating design style.
Based on these properties, we propose an accurate estimation
of the surge current and provide efficient schedules for the
DSTN structure. Our methods achieve significantly better
results than previous works: on average, a 261-fold reduction
in wake-up time and 30% less energy loss during the power
mode transition.
Index Terms – low power, leakage, multi-threshold CMOS
(MTCMOS), power gating, power mode transition, schedule.
I. INTRODUCTION
As CMOS process technology advances to the nanometer scale,
leakage increases exponentially and becomes a significant
part of total power consumption [6][18][19]. Among the
leakage reduction techniques, power gating has become one of
the most effective methods. Figure 1 shows a power-gating
structure in which a high Vth sleep transistor is placed in
series with the low Vth devices [16]. Recently, many
industrial power-gating designs have adopted the
Distributed Sleep Transistor Network (DSTN) structure
[14][22]. Figure 2 shows a DSTN structure, in which the
sleep transistors are connected by the virtual ground line
(VGND).
Figure 1. A power gating structure.
Figure 2. A DSTN power-gating structure.
Under the power-gating structure, a circuit operates in
two different modes. In the active mode, the sleep
transistors are turned on and can be treated as functionally
redundant resistances. In the sleep mode, the sleep
transistors are turned off to reduce the leakage power.
During the sleep period, all internal devices and the virtual
ground are gradually charged because of the leakage in the
low Vth devices. With enough sleep time, we assume that all
internal capacitances of the logic devices are charged to a
level close to VDD.¹ Therefore, when the sleep transistors
are turned on, a sudden discharge of the charge accumulated
in the internal devices leads to a large current, called a
surge current, flowing through the sleep transistors to
ground. An excessive surge current causes L·di/dt noise, IR
drop, and electromigration, which greatly affect the
reliability and performance of a circuit [13][21]. In the
worst case, the large IR drop may cause a short-term VDD
collapse, corrupting the states saved in retention
sequential elements or memories [11]. As a result,
constraining the surge current during the power mode
transition is vital for power-gating designs. Since the
current flowing through the sleep transistors is
proportional to the total size of the turned-on sleep
transistors, both the number of sleep transistors that are
turned on and the time at which they are turned on should be
restricted to avoid an excessive surge current. Hence,
determining the turn-on sequence of the sleep transistors,
called the wake-up schedule, has become a major challenge in
constraining the surge current in a power-gating design [7].
¹ In certain power-gating structures, such as [3], the virtual
ground voltages are controlled to a specific value during the sleep
mode to solve the trade-off between the wake-up penalty and
leakage savings. Hence, it is not always the case that the internal
capacitances are charged to the level close to VDD.
Manuscript received Jun. 2, 2008; revised Sep. 24, 2008.
Da-Cheng Juan, Yu-Ting Chen, Ming-Chao Lee, and Shih-Chieh
Chang are with the Department of Computer Science, National Tsing
Hua University, HsinChu 30013, Taiwan, R.O.C.
transistors one by one in a one-nanosecond time interval.
Here, a dilemma arises. On one hand, the maximum surge
current of the All-On schedule is 53.2 times larger than
that of the One-by-One schedule. On the other hand, the
wake-up time of the All-On schedule is 57.1% shorter than
that of the One-by-One schedule. For a fast wake-up, the
surge current should be as large as possible so that the
internal energy is discharged in a short duration, as in the
All-On schedule. Nevertheless, a large surge current
seriously affects the reliability of a circuit, so for a
reliable wake-up the surge current should be limited at the
cost of a longer wake-up time, as in the One-by-One
schedule. Conventionally, this trade-off between the
wake-up time and the surge current has been modeled as a
scheduling problem under a designer-specified
surge-current constraint. If the upper bound of the surge
current can be obtained, designers can decide how many
sleep transistors can be turned on simultaneously while the
surge current still meets the constraint. Thus, a practical
methodology to estimate the upper bound of the surge current
is required for an effective schedule.
Figure 3. Two methods described in [12]: (a) a weak wake-up signal; (b) One-by-One turn-on sequence per clock.
Several previous works provide wake-up schedules for the
DSTN structure. In [12], the authors proposed two methods to
reduce the maximum surge current. The first method turns on
all sleep transistors stepwise using a weak wake-up signal,
as shown in Figure 3(a). The second method turns on one sleep
transistor per clock period, from the smallest size to the
largest size, as shown in Figure 3(b). In [1], the authors
proposed an algorithm that partitions logic devices to form
a cluster-based power-gating design [4] for eliminating
short-circuit problems. They also provided a wake-up
schedule for the cluster-based structure [1][4]. The work in
[17] is based on the fine-grained power-gating structure,
i.e., each gate is connected to its own sleep transistor. The
authors formulated the wake-up scheduling problem as a Mixed
Integer Linear Program (MILP) and proposed an algorithm that
successively relaxes the MILP to a computationally
efficient LP. In [20], the authors proposed a two-pass
turn-on mechanism to handle surge currents and IR drops in
a power-gating SoC design.
Figure 5. Spurious glitches of an industrial design during the power mode
transition (waveforms: VA and VVGND).
We now describe two important observations during the
power mode transition. All experiments are implemented in
TSMC 90nm CMOS technology. We illustrate the first
observation in Figure 4. Let us consider the waveforms of
the surge currents under two different wake-up schedules.
Before the 4000th nanosecond, the circuit is in the sleep
mode. The dotted line stands for the surge current under an
All-On schedule that turns on all transistors simultaneously
at the 4000th nanosecond. The bold line presents the surge
current under a One-by-One schedule that turns on sleep
The other important observation is that spurious glitches
are the major factor in the total energy loss during the
power mode transition. Figure 5 shows the voltage waveforms of
an industrial design. The bold line stands for the voltage of
an internal node A (VA) whereas the dotted line presents the
voltage of VGND (VVGND). Initially, node A and VGND are
charged close to VDD in the sleep mode. During the power
mode transition, VGND needs sufficient time to stabilize to
ground. Meanwhile, several spurious glitches occur before
node A reaches its final value. In this case, it takes 250ns
for VGND to become stable, so many spurious glitches can
occur during the power mode transition. According to our
experiments, on average, 75.71% of the total discharged
energy comes from spurious glitches, which also greatly
increase the wake-up time.
Figure 4. Two surge-current distributions under two wake-up
schedules: All-On (peak 247 mA) and One-by-One (peak 4.64 mA).
In this paper, we analyze the behaviors of the surge
current in DSTN designs. Based on the analyses, we devise
connected to the corresponding STi on VGND. We would
like to point out that V(STi) may vary from one to another,
especially in a large module, due to the resistance and
capacitance of VGND. To simplify the model, in the DC
analysis, a power-gating module can be transformed into a
resistance network. By solving the resistance network, we
can estimate the current flowing through turned-on STi in
the active mode but not during the mode transition [8][15].
a wake-up schedule that significantly reduces the wake-up
time. To deal with spurious glitches, we also propose an
improved schedule that reduces the total energy loss and
further improves the wake-up time. In comparison to [12],
on average, our method achieves 261 times reduction in the
wake-up time and reduces energy loss from spurious
glitches by 30%.
The remainder of this paper is organized as follows.
Section II presents the preliminaries of the DSTN structure
and of the surge current. In Section III, we perform a surge
current analysis and introduce our problem formulation. In
Section IV, we propose two effective wake-up schedules for
the DSTN designs. Section V and Section VI present our
implementation flow and experimental results respectively.
Finally, Section VII concludes the paper.
II. PRELIMINARIES
From a system point of view, the magnitude of a surge
current indicates how severely the ground rail (GND) is
disturbed when one or more of the modules are in the power
mode transition. In a DSTN module, the surge current that
occurs during the power mode transition is the sum of the
currents flowing through all turned-on sleep transistors to
GND. For simplicity and without loss of generality, the surge
current, which is a function of time, is expressed as Eq(1)
with the parameter t hidden:
$$\mathrm{surge\_current} = \sum_{\forall i} I_{turnon}(ST_i) \qquad \text{Eq(1)}$$
where Iturnon(STi) stands for the current flowing through the
turned-on STi during the power mode transition. Normally,
once a sleep transistor is turned on, it will not be turned off
again during the power mode transition.
Figure 6(a). The overview of a module-based System on Chip (modules A, D, and F active; modules C and E in sleep; module B in the sleep-to-active transition).
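As a small illustration of Eq(1), the sketch below sums the currents of the turned-on sleep transistors at one time instant. The function name, the dictionary layout, and the sample values in mA are assumptions made for this illustration; they are not part of the tool flow described in the paper.

```python
from typing import Iterable, Mapping


def surge_current(i_turnon_ma: Mapping[str, float], turned_on: Iterable[str]) -> float:
    """Eq(1) at a single time instant: sum Iturnon(STi) over the turned-on STi.

    i_turnon_ma maps a sleep-transistor name to its current in mA
    (hypothetical data layout, chosen for this sketch only).
    """
    return sum(i_turnon_ma[name] for name in turned_on)


# Hypothetical snapshot: three sleep transistors, two of them turned on.
snapshot_ma = {"ST1": 35.0, "ST2": 55.0, "ST3": 40.0}
total_ma = surge_current(snapshot_ma, ["ST1", "ST2"])
print(total_ma)
```

Transistors that are still off contribute nothing, so only the names passed in `turned_on` are summed.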
Figure 6(b). Current distribution of a DSTN module.
In this section, we introduce the features of the DSTN
structure and the calculation of the surge current. We
assume that a design contains several modules, as shown in
Figure 6(a), and that one or several modules may go to sleep
and wake up at the same time. Instead of the whole design,
we focus on one DSTN module that is in the power mode
transition, as in Figure 6(b). Each cluster is connected to
its corresponding sleep transistor STi and to the other sleep
transistors through the virtual ground (VGND). Traditionally,
logic clusters have been modeled as current sources,
whereas both the sleep transistors and each segment of VGND
have been modeled as resistors in the active mode. In
Figure 6(b), V(STi) stands for the voltage of the node
III. SURGE CURRENT ANALYSIS
A. Surge Current versus Wake-up Time
The wake-up time is the time required until (1) all sleep
transistors are turned on and (2) all V(STi)s are stable
within ±5% of nominal. The wake-up time is counted from
Tbegin, the time at which the first sleep transistor is
turned on. Following [1][12], the wake-up time is formally
defined as:
$$\text{wake-up time} = \max_{\forall i}\{T_{stable}(ST_i),\, T_{turnon}(ST_i)\} - T_{begin} \qquad \text{Eq(2)}$$
where Tstable(STi) is the time when V(STi) is stable within
±5% of nominal, and Tturnon(STi) stands for the time when STi is
turned on.
Now we discuss how a surge current influences the
wake-up time. The magnitude of a surge current determines
how fast the accumulated energy is discharged. Hence, a
short wake-up time can be achieved with a large surge
current. Nevertheless, since an excessive surge current is
the major source of noise on the power distribution network,
the surge current should be kept under a designer-specified
constraint to maintain the reliability of a circuit. Take
Figure 7 as an example. The bold line presents the surge
current waveform under schedule A while the dotted line
stands for schedule B. We assume that the surge-current
constraint is 79mA, marked as the horizontal dotted line.
Both schedule A and schedule B satisfy the constraint.
However, compared to schedule A, schedule B is relatively
pessimistic: the wake-up time of schedule B is 10.8 times
longer than that of schedule A. As a result, the upper bound
of a surge current is needed to develop an efficient wake-up
schedule that operates under a given constraint.
Figure 7. Wake-up time comparison under two different schedules.
current is the summation of Iturnon(STi) at a given time.
Hence, let us focus on the calculation of Iturnon(STi) in
the saturation region and in the linear region. The equations
are shown as Eq(3) and Eq(4), respectively.
$$I_{turnon}(ST_i) = k_n \frac{W(ST_i)}{L} (V_{GS} - V_{TH})^2 \,(1 + \lambda V(ST_i)) \qquad \text{Eq(3)}$$
$$I_{turnon}(ST_i) = k_n \frac{W(ST_i)}{L} \left[ (V_{GS} - V_{TH})\,V(ST_i) - \frac{1}{2} V(ST_i)^2 \right] \qquad \text{Eq(4)}$$
where W(STi) is the width of STi, kn is the process
transconductance, L is the channel length, VGS is the
gate-source voltage, VTH is the threshold voltage, and λ is
the channel-length modulation parameter. For a given set
of W(STi), the magnitude of Iturnon(STi) depends on V(STi),
which equals the potential difference across STi.
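To make the dependence on V(STi) concrete, the following sketch transcribes Eq(3) and Eq(4) as they are printed above. Two things are assumptions of this sketch rather than statements from the text: the region boundary (saturation when V(STi) ≥ VGS − VTH, the usual long-channel MOSFET condition) and every numeric parameter value, which is hypothetical.

```python
def i_turnon(width, v_st, *, k_n, length, v_gs, v_th, lam):
    """Current through a turned-on sleep transistor, following Eq(3) and Eq(4).

    v_st is V(STi), the potential difference across the sleep transistor.
    The saturation/linear boundary used here is an assumption of this sketch.
    """
    v_ov = v_gs - v_th                      # overdrive voltage VGS - VTH
    if v_ov <= 0.0 or v_st <= 0.0:
        return 0.0                          # assumed: no conduction, no current
    ratio = width / length                  # W(STi) / L
    if v_st >= v_ov:                        # saturation region, Eq(3)
        return k_n * ratio * v_ov ** 2 * (1.0 + lam * v_st)
    # linear region, Eq(4)
    return k_n * ratio * (v_ov * v_st - 0.5 * v_st ** 2)


# Hypothetical device parameters, chosen only to exercise the formulas.
PARAMS = dict(k_n=2.0e-4, length=0.1, v_gs=1.0, v_th=0.3, lam=0.1)

# Sweep V(STi) from 0 V to 1 V in 0.1 V steps for one transistor width.
samples = [i_turnon(10.0, v / 10.0, **PARAMS) for v in range(0, 11)]
print(samples)
```

With these formulas the current grows with V(STi) over the whole sweep, so for a fixed width a larger voltage across the sleep transistor means a larger Iturnon(STi).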
Figure 9. Two segments of the virtual ground of an industrial design.
We now describe two key characteristics of V(STi). First,
V(STi) is time-varying and thus difficult to calculate
analytically. The main difficulty is that each time spurious
glitches occur, additional charge flows into some of the
internal nodes while the accumulated energy is still being
discharged through the sleep transistors. The second
characteristic is that, empirically, V(STi) is strictly
decreasing in DSTN designs. Figure 9 shows this
characteristic with two waveforms, V(ST1) and V(ST2): no
bounce occurs on either V(ST1) or V(ST2) during the
power mode transition. Moreover, we have simulated a
large number of benchmarks under different wake-up
schedules and found that this empirical property holds in
all the DSTN designs we simulated. The reason for this
consistency, according to our experiments, is that the
discharge rate through the sleep transistors is always
larger than the charge rate from spurious glitches. However,
we would like to mention that, due to the RLC parasitics of
the power networks and of the package, the virtual grounds
might not decrease monotonically. In other words, when the
RLC effects of the package are taken into consideration, a
large surge current may lead to a longer wake-up time. Still,
in the case when only one or a few small modules are in the
power mode transition and the surge current is limited to
some value, we
B. Problem Formulation
Inputs:
  (1) W(STi) for all i
  (2) SC_CONSTRAINT
  (3) A sleep vector
Initial condition:
  STi is turned off for all i
Decision variable:
  Tturnon(STi) for all i
Objective function:
  Minimize max(Tstable(STi), Tturnon(STi)), for all i
Subject to:
  SC_CONSTRAINT ≥ surge_current,
  where surge_current = ΣIturnon(STi) for all i
Figure 8. Wake-up schedule problem formulation.
Our problem formulation is shown in Figure 8. First, the
following three inputs are given: (1) a set of practical sizes
for STi, W(STi) [8][9][14], (2) the surge-current constraint
(SC_CONSTRAINT), and (3) a sleep vector [2][21]. Note
that the sleep vector is an input vector applied to the
primary inputs of a circuit before entering the sleep mode.
The values of sleep vector are known and will remain
unchanged during both the sleep period and the wake-up
process [1].
Next, the initial condition is that all STis are turned off.
The decision variable is Tturnon(STi). Again, Tturnon(STi)
stands for the time when STi is turned on. The objective
function is to minimize the wake-up time defined in Eq(2).
Finally, the surge current at all times must satisfy
SC_CONSTRAINT.
C. Surge Current versus Virtual Ground
In this section, we present the key factors affecting the
estimation of a surge current. According to Eq(1), the surge
of VST1(t≥30). We emphasize that the strictly decreasing
characteristic of V(STi) on the DSTN structure is important
because it can be applied to estimate the upper bound of a
surge current. As a result, Iturnon(ST1, t=30) is also the
largest value of Iturnon(ST1, t≥30) according to Eq(3) and
Eq(4). Hence, SC_CONSTRAINT remains satisfied until we turn
on the next STis. In our technique, if any STi remains
turned off, we iteratively check whether additional STis can
be turned on after a certain time interval. The time
interval is decided empirically; in our experiments, we set
it to 30ps. Hence, the next time interval is between t
= 30 and t = 60. Since ST2 remains turned off, we
simulate the design with only ST1 turned on between t = 30
and t = 60. All V(STi)s are obtained through simulations
because, as mentioned in Section III.C, V(STi)s are difficult
to calculate analytically. Figure 10(b) shows the second
iteration of our technique. After the SPICE-like simulator
updates the waveforms of V(STi), we find that Iturnon(ST1,
t=60) + Iturnon(ST2, t=60) = 90mA, which does not exceed
SC_CONSTRAINT. Therefore, we can turn on ST2 at t=60
and have Tturnon(ST2) = 60. However, VST1(t=60) and
VST2(t=60) are not yet stable in this time interval. At the end
of the second iteration, we continue the simulation with
both ST1 and ST2 turned on. Figure 10(c) shows that
VST1(t=90) and VST2(t=90) are stable within ±5% of nominal.
Therefore, Tstable(ST1) and Tstable(ST2), defined in Section
III.A, are set to 90. As a result, since Tbegin = 30, the
wake-up time is 90ps − 30ps = 60ps according to Eq(2).
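The arithmetic of this example can be checked with a few lines that evaluate Eq(2) on the numbers from Figure 10. The function name and the dictionary layout are assumptions of this sketch; the times are the ones quoted in the example above.

```python
def wake_up_time(t_stable, t_turnon, t_begin):
    """Eq(2): latest stabilization or turn-on time, measured from t_begin."""
    latest = max(max(t_stable[st], t_turnon[st]) for st in t_turnon)
    return latest - t_begin


# Values from the Figure 10 example (times in ps).
t_turnon_ps = {"ST1": 30, "ST2": 60}
t_stable_ps = {"ST1": 90, "ST2": 90}

result_ps = wake_up_time(t_stable_ps, t_turnon_ps, t_begin=30)
print(result_ps)  # 60
```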
think the inductance effect can be very marginal. In Section
VI.C, we further discuss the package RLC parasitics.
IV. WAKE-UP SCHEDULES FOR WAKE-UP TIME MINIMIZATION CONSIDERING SPURIOUS GLITCHES
A. An Efficient Wake-up Schedule for Wake-Up Time Minimization
Figure 10. Wake-up schedule example (voltage versus time in ps, SC_CONSTRAINT = 100 mA): (a) the first iteration, ST1 and ST2 turned off, I(ST1, 30) = 90 mA and I(ST2, 30) = 60 mA; (b) the second iteration, ST1 on and ST2 off, I(ST1, 60) = 35 mA and I(ST2, 60) = 55 mA; (c) the third iteration, ST1 and ST2 turned on, V(ST1) and V(ST2) stable.
Figure 11 presents the details of our schedule algorithm
for Wake-up Time Minimization (WTM). Initially, STis are
Without loss of generality, we illustrate our method using
the example in Figure 10. To begin with, since V(STi) varies
over time, we expand V(STi) into VSTi(t=t0), which
stands for the value of V(STi) at t = t0. Similarly, we expand
Iturnon(STi) into Iturnon(STi, t=t0). We assume that the
surge-current constraint, SC_CONSTRAINT, is set to
100mA and that the W(STi)s are given. Initially, before
t = 30, both ST1 and ST2 are turned off. Again, we assume
that VST1(t=30) and VST2(t=30) have been charged to VDD. Let
the wake-up process begin at t = 30, i.e., Tbegin = 30.
According to Eq(3) and Eq(4), we can calculate both
Iturnon(ST1, t=30) and Iturnon(ST2, t=30). Then we can
determine which STis can be turned on while SC_CONSTRAINT is
still satisfied. In Figure 10(a), we have Iturnon(ST1, t=30)
= 90mA and Iturnon(ST2, t=30) = 60mA. To satisfy
SC_CONSTRAINT, we can turn on either ST1 or ST2, but not
both, at t = 30. Let us choose to turn on ST1 because
Iturnon(ST1, t=30) is larger than Iturnon(ST2, t=30) and may
lead to a shorter wake-up time. Therefore, we have
Tturnon(ST1) = 30. Moreover, since V(ST1) is strictly
decreasing, VST1(t=30) is the largest value
Algorithm: Wake-Up Schedule (W(STi), SC_CONSTRAINT)
 1: Output: A set of decision variables Tturnon(STi)
    /* step 1: initialization */
 2: for i ← 1 to NUM_ST do
 3:     STi ← OFF;
 4: end for
 5: t ← Tbegin; surge_current ← 0;
    /* step 2: scheduling */
 6: repeat
 7:     update VSTi(t) for all i;
 8:     update Iturnon(STi, t) for all i;
 9:     Apply the dynamic programming technique on OFF STis
10:     For each OFF STi which can be turned on do
11:         STi ← ON;
12:         Tturnon(STi) ← t;
13:         update surge_current ← ∑ Iturnon(STi, t) for all ON STi;
14:     end for
        /* Simulate for another interval */
15:     do simulation; t ← t + time_interval;
16: until STi is ON for all i
17: return Tturnon(STi) for all i;
Figure 11. Schedule for Wake-up Time Minimization (WTM) algorithm.
set to OFF, surge_current is set to 0, and t is set to Tbegin.
At the beginning of each iteration, we update VSTi(t)
according to the simulation results. With Eq(3) and Eq(4),
Iturnon(STi, t) can be calculated for the given set of
W(STi). Among the STis that remain turned off, we aim to turn
on additional STis so that surge_current is as large as
possible without exceeding SC_CONSTRAINT. This selection can
be cast as the well-known knapsack problem, which can be
solved efficiently by the dynamic programming technique
[10]. Hence, we apply dynamic programming to decide which
STis should be turned on. According to the obtained result,
we turn on each selected STi and set Tturnon(STi) to t.
Meanwhile, we also update surge_current. At the end of the
iteration, we perform the simulation with all turned-on STis
and update t to t + time_interval. Again, time_interval is
determined empirically. After all STis are turned on, our
algorithm returns the wake-up schedule, Tturnon(STi)s.
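The scheduling loop of Figure 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: currents are rounded to integer mA so that a subset-sum style dynamic program applies, the SPICE-like simulation is replaced by a hypothetical callback `currents_at(t, on)`, and the names `select_turn_on`, `wtm_schedule`, and `max_steps` are invented for this sketch. The test data are the currents of the Figure 10 example.

```python
from typing import Dict, List, Mapping


def select_turn_on(candidates_ma: Mapping[str, int], budget_ma: int) -> List[str]:
    """Choose OFF transistors whose summed current is as large as possible
    without exceeding budget_ma (0/1 knapsack with value equal to weight)."""
    reachable: Dict[int, List[str]] = {0: []}   # total current -> chosen names
    for name, current in candidates_ma.items():
        # Iterate over a snapshot so each transistor is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + current
            if new_total <= budget_ma and new_total not in reachable:
                reachable[new_total] = chosen + [name]
    return reachable[max(reachable)]


def wtm_schedule(names, sc_constraint_ma, t_begin, time_interval,
                 currents_at, max_steps=1000):
    """Return {name: Tturnon} following the repeat/until loop of Figure 11.

    currents_at(t, on) stands in for the simulator: it must return the
    current in mA of every transistor at time t, given the set that is on.
    """
    on = set()
    t_turnon = {}
    t = t_begin
    for _ in range(max_steps):                  # guard against a stuck schedule
        currents = currents_at(t, frozenset(on))
        budget = sc_constraint_ma - sum(currents[n] for n in on)
        off = {n: currents[n] for n in names if n not in on}
        for n in select_turn_on(off, budget):
            on.add(n)
            t_turnon[n] = t
        if len(on) == len(names):
            return t_turnon
        t += time_interval
    raise RuntimeError("schedule did not complete within max_steps")


# Currents of the Figure 10 example, indexed by time in ps.
FIG10_MA = {30: {"ST1": 90, "ST2": 60}, 60: {"ST1": 35, "ST2": 55}}
schedule = wtm_schedule(["ST1", "ST2"], 100, 30, 30, lambda t, on: FIG10_MA[t])
print(schedule)
```

The table-driven callback only covers the two time points of the example; a real flow would query the simulator at every interval.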
B. An Improved Wake-up Schedule Considering Physical Implementation Issues and Spurious Glitches
i. Physical Implementation Issues
Figure 13. Our wake-up schedule example: (a) three cascaded inverters with outputs N1, N2, and N3 (topological order 1, 2, 3) connected to ST1, ST2, and ST3; (b)-(d) voltage waveforms of VN1 and V(ST1), VN2 and V(ST2), and VN3 and V(ST3) under schedule A: Tturnon(ST1)=2, Tturnon(ST2)=0, Tturnon(ST3)=4; (e)-(g) the same waveforms under schedule B: Tturnon(ST1)=0, Tturnon(ST2)=2, Tturnon(ST3)=4.
transistors can be turned on at the same time. We assume
that a wake-up sequence first turns on the sleep transistors
between STa and STb, where 1 ≤ a ≤ b ≤ k. After that, only
consecutive STis whose positions are next to STa or STb can
be turned on. Figure 12 shows a turn-on sequence {(ST4),
(ST5, ST3), (ST2, ST1)} that follows this formulation: ST4
is turned on first, followed by ST5 and ST3 together, and
finally ST2 and ST1 are turned on at the same time.
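One way to read this restriction is that the set of turned-on sleep transistors must always form a contiguous block of positions. The checker below is a sketch of that reading; the function name and the representation of a sequence as groups of position numbers are assumptions made here for illustration.

```python
from typing import Iterable, Sequence


def conforms_to_physical_order(groups: Sequence[Iterable[int]]) -> bool:
    """Check a turn-on sequence given as groups of position numbers.

    Each group is turned on simultaneously. The sequence conforms if, after
    every group, the turned-on positions form one contiguous block, so every
    newly turned-on block is adjacent to the transistors that are already on.
    """
    on = set()
    for group in groups:
        new = set(group)
        if not new or new & on:          # empty group or a repeated transistor
            return False
        on |= new
        if max(on) - min(on) + 1 != len(on):
            return False                 # a gap appeared in the turned-on block
    return True


# The sequence {(ST4), (ST5, ST3), (ST2, ST1)} of Figure 12.
print(conforms_to_physical_order([(4,), (5, 3), (2, 1)]))
# Turning on ST1, ST3, ST4, ST2, ST5 one at a time skips over ST2.
print(conforms_to_physical_order([(1,), (3,), (4,), (2,), (5,)]))
```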
Figure 12. A turn-on sequence considering the physical placement of sleep
transistors.
Physically, sleep transistors are normally placed in order
[15][22] and, in most cases, are deployed at the ends of
rows. Therefore, if there are five rows, there will be five
sleep transistors aligned in order as the example in Figure
12. Typically, a sleep/wake-up signal is provided from a
power management unit. Turning on sleep transistors
without conforming to their order of physical placement
may lead to a large routing area due to the complicated
power management units. The penalty may make the
wake-up schedule impractical. For example, the turn-on
sequence {ST1, ST3, ST4, ST2, ST5} does not conform to the
physical order of {ST1, ST2, ST3, ST4, ST5}.
ii. Spurious Glitches Phenomenon
Spurious glitches waste energy during the mode
transition. Another objective of this paper is to minimize
spurious glitches by controlling the wake-up schedule. In
the following paragraphs, two useful properties are
described to reduce spurious glitches.
We now illustrate how different schedules affect the
severity of spurious glitches. Figure 13(a) shows three
cascaded inverters located in three different logic clusters in
a DSTN design. For simplicity of explanation, the
environment is set up as follows. The supply voltage is 1V,
As a result, we assume sleep/wake-up signals are
arranged as a daisy-chain-like implementation [11]. In our
wake-up schedule, we restrict the turn-on sequence of sleep
transistors as follows. First, in our formulation, sleep
transistors are numbered from ST1 to STk. Several sleep
the delay of an inverter is 1 time unit, and the input of the
first inverter is logic 1 during the wake-up process. We
assume that schedule A is {Tturnon(ST1) = 2, Tturnon(ST2) = 0,
Tturnon(ST3) = 4} and schedule B is {Tturnon(ST1) = 0,
Tturnon(ST2) = 2, Tturnon(ST3) = 4}. Figure 13(b)(c)(d) show
the voltage waveforms of the V(STi)s and of the three
inverters' outputs, VNi, under schedule A. Figure 13(e)(f)(g)
show the waveforms under schedule B. These figures show that
schedule B is superior to schedule A in reducing spurious
glitches. First, schedule A has a spurious glitch on VN3 in
Figure 13(d) while schedule B does not in Figure 13(g).
Moreover, in Figure 13(c)(f), schedule A has a larger glitch
on VN2 than schedule B.
iii. An Improved Wake-up Schedule Considering Spurious
Glitches and Physical Implementation Issues
Let us now explain why schedule B outperforms schedule
A on spurious glitch reduction. The output of a logic gate
remains unstable if its inputs are unstable. As a result, the
fan-in cones of a gate should be stabilized earlier than the
gate itself. From the above statement, if a gate is
topologically close to the primary inputs, it should be
stabilized earlier to avoid the propagation of spurious
glitches. In Figure 13(a), N1 is closer to the primary inputs
than N2 and N3. The spurious glitches may be reduced if the
voltage of N1, VN1, is stabilized earlier than VN2 and VN3.
Moreover, since each logic gate is connected to the
corresponding STi, the output value of the gate, if equal to
logic 0, is bounded by V(STi). For example, since N1 is
connected to ST1, VN1 will be stabilized after V(ST1)
becomes stable as shown in Figure 13(b)(e). Furthermore,
V(STi) can be stabilized faster by turning on its
corresponding sleep transistor, STi, due to the current
discharging balance on the DSTN structure [8][14]. In this
example, since schedule B turns on ST1 at t=0 whereas
schedule A turns on ST1 at t=2, V(ST1) is stabilized earlier in
schedule B than in schedule A. Therefore, VN1 is stabilized
earlier in schedule B. Since VN1 is stable, VN2, which is
driven by VN1, is also stabilized earlier in schedule B than
in schedule A. From the above example, the first property to
reduce spurious glitches is that the logic gates should be
stabilized in their topological order.
Let us explain the differences between IWTM and the WTM
presented in Section IV.A. IWTM calculates
composit_weight(STi) as the turn-on priority of the sleep
transistors. composit_weight(STi) is calculated as:
composit_weight(STi) = topological_weight(STi) +
position_weight(STi) + width_weight(STi)
where topological_weight(STi), position_weight(STi) and
width_weight(STi) represent three major factors that affect
the wake-up time and the energy loss of a schedule. First,
let the level of a logic gate represent its topological order
from the primary inputs to the gate, similar to the
definition in [5]. After the level of each gate is calculated,
topological_weight(STi) can be expressed as:
An improved schedule for wake-up time minimization
has to consider both spurious glitches and physical
implementation issues. From Section IV.B.ii, we observe
that the sleep transistor with the following two properties
should have higher turn-on priority – (1) connected to more
logic gates close to primary inputs, and (2) connected to
more gates whose outputs are logic 0. Nevertheless, the
gates with the same topological order may not be placed in
the same cluster physically in practice. In this section, we
propose an Improved schedule for Wake-up Time
Minimization (IWTM).
topological_weight(STi) = Σ(1 / level of gate j)
where gate j ranges over all gates connected to STi whose
final outputs are logic 0. topological_weight(STi) favors a
turn-on sequence that potentially stabilizes the majority of
gates in topological order.
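A minimal sketch of this weight is given below. The input format, a list of (level, final output) pairs for the gates connected to one STi, is an assumption of the sketch, as is the convention that levels start at 1 so that the reciprocal is defined.

```python
def topological_weight(gates):
    """Sum 1/level over the gates of one cluster whose final output is logic 0.

    gates: iterable of (level, final_output) pairs for the gates connected to
    one STi; levels are assumed to start at 1 (hypothetical convention).
    """
    return sum(1.0 / level for level, final_output in gates if final_output == 0)


# Hypothetical cluster: only the gates at levels 1 and 4 settle to logic 0.
cluster = [(1, 0), (2, 1), (4, 0)]
weight = topological_weight(cluster)
print(weight)  # 1.25
```

Gates close to the primary inputs have small levels and therefore contribute most to the weight.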
Second, position_weight(STi) stands for the impact factor
arising from the physical position of a sleep transistor.
Under the physical limitation described in Section IV.B.i,
the starting point for turning on sleep transistors is
important because it limits the choices of sleep transistors
to be turned on afterward. Hence, the middle position is
preferred, since the set of turn-on choices may be larger and
the V(STi)s may decrease more evenly because of the current
discharging balance [14]. We assume that the positions of
sleep transistors are numbered from 1 to k.
position_weight(STi) is calculated as:
The second property is that an STi that is connected to
more gates whose final output values are logic 0 should
have a higher priority to be turned on. We illustrate this
concept using the same example in Figure 13. Let us first
focus on VN2, whose final value is 1. Under the assumption
that VN1 settles down to logic 0, the stabilization of VN2 is
independent of V(ST2) since VN2 is charged by VDD, not
discharged through ST2.
position_weight(STi) = middle_pos − |pos(STi) − middle_pos|
where middle_pos = (1 + k) / 2 and pos(STi) is the position
number of STi.
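For instance, with k = 5 the formula gives the largest weight to the middle transistor and the smallest to the two ends. The function below is a direct transcription; its name and signature are chosen for this sketch.

```python
def position_weight(pos, k):
    """position_weight(STi) = middle_pos - |pos(STi) - middle_pos|."""
    middle_pos = (1 + k) / 2
    return middle_pos - abs(pos - middle_pos)


# Five sleep transistors at positions 1..5.
weights = [position_weight(p, 5) for p in range(1, 6)]
print(weights)  # [1.0, 2.0, 3.0, 2.0, 1.0]
```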
Finally, width_weight(STi) is the third index to decide the
turn-on priority. Since the dynamic programming technique
the gates in the same row are grouped into a cluster. Each
sleep transistor is placed at the end of the corresponding
cluster, as shown in Figure 12. For simplicity of discussion,
we use a one-footer power-gating structure to demonstrate our
idea. In practice, two footers are normally placed at both
ends of a row. We use [21] to produce the sleep vector file
and [9] to produce the sleep transistor size file. Note that
the TSMC 90nm CMOS technology process is used throughout all
experiments. According to [23], the current flowing through
a sleep transistor cannot be accurately calculated with
Eq(3) and Eq(4) at the 90nm process node or below. Hence,
instead of Eq(3) and Eq(4), we use a “footer sleep transistor
behavior file” for the current estimation. This file is
generated via HSPICE™ simulation under different W(STi)s and
V(STi)s. Next, to construct the “virtual ground behavior
file”, which contains all V(STi)s at a specified time point,
the design is simulated with NanoSim™ with the accuracy set
to 6. With the updated V(STi)s and the given W(STi)s of all
sleep transistors, we can obtain the Iturnon(STi)s using the
“footer sleep transistor behavior file”. Finally, we
iteratively simulate the design via NanoSim™ and feed the
“virtual ground behavior file” back to our IWTM Engine until
all sleep transistors are turned on.
is no longer suitable because of the limited
turn-on choices and the need to reduce spurious
glitches, width_weight(STi) aims to turn on a sleep
transistor with a larger width first, which leads to a larger
Iturnon(STi). width_weight(STi) is expressed as:
width_weight(STi) = W(STi) / width_avg
where W(STi) is the width of STi and width_avg is the
average width of all sleep transistors. According to
composit_weight(STi), we prioritize the STis in decreasing
order. In the first iteration, under the surge-current
constraint, we try to turn on the highest-priority sleep
transistor. Next, after updating surge_current and the
V(STi)s, we try to turn on several high-priority STis if the
sum of their Iturnon(STi, t) and surge_current is smaller
than or equal to SC_CONSTRAINT. Our iterative technique then
proceeds in this way until the turn-on schedule is complete.
Note that a prioritized STi can be turned on only when it
has a turned-on neighbor and satisfies SC_CONSTRAINT at
the same time.
V.
IMPLEMENTATION FLOW
RTL Netlist
: Existing tools
VI.
: Our tools
Synthesis
A.
The Comparisons among All-On, Kim’s Method,
WTM and IWTM
Here, we re-implemented the second method of [12] and
compared that with our WTM and IWTM on the DSTN
structure. Again, WTM presented in Section IV.A does not
consider spurious glitches and physical implementation
issues while IWTM presented in Section IV.B.iii takes both
into consideration. In this section, the package parasitics are
not added for both our methods and [12].
Gate-level Netlist
Capacitance file
Placement
DEF file
Surge-current
constraint file
Sleep
Vector
Sleep
Transistors
Size file
Footer ST
Behavior file
IWTM
Engine
DSTN Design
Generator
Wake-up
Schedule file
Virtual Ground
Behavior file
NanoSimTM
EXPERIMENTAL RESULTS
Next, the parameters of our experiments are set as
follows. First, the time interval is set to 30ps in WTM and
IWTM for simulation. In our experimental experience, the
time interval of 30ps is a good balance between program
runtime and wake-up time improvement. Secondly, since a
well-designed circuit normally can tolerate the maximum
instantaneous current (MIC) arising from gate switching in
the active mode, the MIC is used as the surge-current
constraint to guarantee the reliability of the circuit. We use
PrimePowerTM to generate the MIC as the surge-current
constraint. Note that practically designers need to
guardband the surge-current constraint to prevent the
worst-case surge current from exceeding the constraint,
which may damage the reliability of a circuit. Finally, to
re-implement the method of [12], the clock period of each
Design file
Wake-up
time
Figure 14. Implementation flow.
Here, we present the implementation flow of our sleep
transistor wake-up schedule in Figure 14. First, an RTL
netlist is synthesized by Synopsys DesignVisionTM to
generate a gate-level netlist. After that, the gate-level netlist
is placed and routed by Cadence SOC EncounterTM. After
placement and routing, we can extract the parasitic
capacitances by SOC EncounterTM. We also obtain the DEF
file containing the physical location of each gate. Note that
8
TABLE I: THE COMPARISONS AMONG WAKE-UP TIME, ENERGY LOSS AND SURGE CURRENT.
File        Gate    Longest | Wake-up Time (ns)          | Energy Loss (pico-coulomb)      | Max Surge Current (mA)
Name       Count delay (ns) | All-On     [12]   WTM  IWTM |  All-On     [12]     WTM    IWTM |   All-On  [12]     WTM    IWTM Constraint
AES        46821      14.22 |   0.07  2388.96  4.08  3.71 |  206.35  1301.33  355.77  314.84 |  9812.51  6.00  635.84  605.56     640.58
C6288       4061       8.48 |   0.05   398.56  0.90  0.78 |   15.07    42.49   27.39   24.29 |   611.39  2.73   35.89   36.76      37.62
des         3631       5.33 |   0.01   239.85  0.45  0.21 |   16.22    31.33   23.91   21.32 |  2594.76  8.02  182.17  188.71     195.08
C7552       3495       4.52 |   0.04   185.32  0.51  0.33 |   12.82    27.81   18.41   16.25 |   898.64  1.57   65.40   65.93      68.06
C5315       2725       2.89 |   0.05   104.04  0.57  0.51 |   10.14    15.43   13.97   13.19 |   383.96  2.90   32.44   32.40      34.23
i10         2612       5.80 |   0.08   208.80  0.42  0.36 |   10.17    20.41   12.38   11.81 |   239.16  1.24   37.01   37.38      39.44
pair        1865       2.35 |   0.06    70.50  0.30  0.18 |    7.33    10.80    8.87    7.87 |   225.84  3.00   53.67   53.51      55.94
C3540       1790       3.66 |   0.06   109.80  0.57  0.51 |    7.68    14.47   10.02    9.65 |   277.13  1.33   22.08   23.03      23.95
C2670       1034       2.70 |   0.03    54.00  0.30  0.24 |    3.70     4.59    4.61    4.36 |   226.95  1.71   25.07   23.99      26.35
i8          1033       2.06 |   0.03    45.32  0.30  0.24 |    4.36     8.33    5.08    4.91 |   353.81  2.41   27.12   26.92      28.67
apex6        915       1.51 |   0.03    30.20  0.30  0.21 |    3.54     4.60    4.60    4.26 |   271.76  4.50   28.88   27.97      30.09
rot          888       2.79 |   0.06    53.01  0.21  0.18 |    3.39     5.34    3.87    3.64 |   109.84  0.93   23.80   23.05      24.80
C880         873       2.20 |   0.05    44.00  0.30  0.27 |    3.58     6.12    4.49    4.35 |   138.17  1.21   19.20   20.03      20.79
i7           846       1.53 |   0.01    27.54  0.21  0.12 |    3.80     5.30    4.70    4.41 |   772.96  6.64   70.19   61.15      72.54
C1908        800       3.03 |   0.05    57.57  0.39  0.39 |    3.39     5.82    4.36    4.25 |   147.80  0.93   14.13   13.65      15.02
dalu         792       2.57 |   0.04    51.40  0.36  0.36 |    3.67     6.43    4.61    4.45 |   156.77  1.08   15.88   16.69      17.51
C1355        788       2.50 |   0.03    47.50  0.36  0.27 |    3.39     5.94    4.41    4.13 |   203.97  1.04   18.36   19.08      19.83
C499         760       2.49 |   0.03    44.82  0.36  0.30 |    3.13     4.73    3.96    3.82 |   177.73  1.69   15.56   15.76      16.91
i9           671       1.82 |   0.01    32.76  0.18  0.12 |    3.87     6.02    4.45    4.26 |   766.57  4.28   72.51   65.66      74.26
C432         510       2.76 |   0.04    41.40  0.40  0.15 |    2.19     3.16    2.63    2.34 |    93.08  0.88   18.85   19.35      20.21
x1           402       1.40 |   0.05    18.20  0.21  0.21 |    1.81     2.63    2.07    1.97 |    62.38  0.93   13.64   12.65      14.53
i4           384       1.43 |   0.02    14.30  0.12  0.09 |    1.24     1.34    1.43    1.36 |   100.49  3.95   24.35   21.35      25.38
alu2         350       3.23 |   0.06    45.22  0.36  0.36 |    2.07     3.69    2.66    2.54 |    62.73  0.86    8.42    8.20       9.43
i5           331       3.15 |   0.02    25.20  0.09  0.09 |    0.87     1.05    1.05    1.02 |    79.54  1.78   19.65   16.93      20.72
apex7        314       1.72 |   0.05    18.92  0.18  0.18 |    1.45     2.04    1.65    1.57 |    48.81  0.88   11.91   11.14      12.85
my_adder     304       4.35 |   0.02    47.85  0.18  0.18 |    1.52     2.07    1.84    1.78 |   143.07  0.50   16.25   14.43      16.92
term1        278       1.44 |   0.04    15.84  0.24  0.27 |    1.41     2.06    1.72    1.62 |    61.17  0.91    9.49    8.88      10.13
ttt2         245       1.38 |   0.03    13.80  0.24  0.30 |    1.26     1.84    1.51    1.47 |    64.51  0.90    8.23    7.84       9.60
go            76       1.20 |   0.04     4.80  0.15  0.15 |    0.41     0.47    0.46    0.43 |    16.23  0.82    3.74    3.70       4.50
pm1           74       1.21 |   0.05     4.84  0.18  0.18 |    0.40     0.52    0.47    0.46 |    13.95  0.50    3.26    3.44       3.84
Average   2655.6       3.19 |   0.16   261.49  1.26  1.00 |    0.84     1.43    1.06    1.00 |     8.27  0.07    0.93    0.91       1.00
benchmark circuit is set to the corresponding longest path
delay.
Table I shows our experimental results. Take circuit
C1355 as an example. The wake-up time is 0.03ns from
All-On, 47.50ns from [12], 0.36ns from WTM, and 0.27ns
from IWTM. The energy loss from spurious glitches during
the power mode transition is 3.39pico-coulomb from All-On,
5.94pico-coulomb from [12], 4.41pico-coulomb from WTM,
and 4.13pico-coulomb from IWTM. The maximum surge
current is 203.97mA from All-On, which violates the
surge-current constraint of 19.83mA, whereas the other
methods satisfy the constraint.
On average, the wake-up time of our IWTM is 261 times
faster than that of [12] (see footnote 2). Also, IWTM has 30%
less energy loss than [12]. Our results are significant on large
benchmarks such as C6288 and an industrial Advanced
Encryption System (AES). In AES, the wake-up time of our
IWTM algorithm is 643 times faster than that of [12], with
75.8% less energy loss. It is worth mentioning that our flow
finishes within one hour on all ISCAS benchmarks and
within eight hours on AES. In addition, IWTM outperforms
WTM in wake-up time, energy loss, and maximum surge
current on average. The reason is that spurious glitches are
taken into account to reduce additional energy loss, which
contributes to wake-up time minimization. Furthermore,
IWTM inherits the main idea of WTM: keep the surge
current as large as possible without exceeding the constraint.
Therefore, even though the physical implementation issues
constrain the solution space of IWTM, IWTM still performs
better than WTM on average. The results demonstrate that
IWTM achieves large wake-up time reductions and, on
average, lower energy loss than [12] on both the benchmarks
and the industrial design.
2 The averages in the bottom row of Table I are computed as
follows. First, for each benchmark, the results from each method
are normalized to those from IWTM, e.g., Columns 4 to 7 are
normalized to Column 7. Then, for each column, the normalized
values are averaged.
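The normalization described in the footnote can be sketched in a few lines of Python. This is only an illustration: it uses the wake-up times of two rows copied from Table I (AES and C1355), so it does not reproduce the 30-benchmark average of 261.49, and the names `WAKE_UP_NS` and `normalized_averages` are introduced here for the example.

```python
from statistics import mean

# Wake-up times in ns for two rows of Table I,
# in the order (All-On, [12], WTM, IWTM).
WAKE_UP_NS = {
    "AES": (0.07, 2388.96, 4.08, 3.71),
    "C1355": (0.03, 47.50, 0.36, 0.27),
}


def normalized_averages(rows, ref_index=3):
    """Normalize each row to its reference column, then average per column."""
    normalized = [
        [value / row[ref_index] for value in row] for row in rows.values()
    ]
    return [mean(column) for column in zip(*normalized)]


averages = normalized_averages(WAKE_UP_NS)
print([round(a, 2) for a in averages])
```

Because each row is divided by its own IWTM value before averaging, the IWTM column averages to exactly 1.00, and a benchmark with a very large ratio (such as AES) has a strong influence on the mean.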
B. The Experimental Results of Kim’s Method [12] in 30ps Time Interval
We re-implement the method in [12] to turn on sleep
transistors one by one, also in a 30ps time interval, to
observe how the results change. The results are shown in
Table II. Here, let us focus on the results of [12]. Take
C6288 as an example: the wake-up time is 1.41ns and the
energy loss is 32.54pico-coulomb. However, the maximum
surge current of C6288 is 62.65mA, which violates the
surge-current constraint of 37.62mA.
Based on the experimental results, 60.0% of the
benchmarks fail to satisfy the surge-current constraints.
Therefore, the method in [12] runs the risk that the
maximum surge current exceeds the constraint, especially
when a short time period is applied. We would like to point
out that this potential danger arises from the inability to
estimate the surge current.
TABLE II: RESULTS OF KIM’S METHOD [12] IN 30PS TIME INTERVAL.
File      Longest path | Wake-up Time | Energy Loss    | Max Surge      | Constraint Constraint
Name        delay (ns) | (ns)         | (pico-coulomb) | Current (mA)   |       (mA)  Satisfied
                       |   [12]   IWTM |    [12]    IWTM |    [12]    IWTM |
AES              14.22 |   5.04   3.71 |  363.72  314.84 |  273.78  605.56 |     640.58        YES
C6288             8.48 |   1.41   0.78 |   32.54   24.29 |   62.65   36.76 |      37.62         NO
des               5.33 |   1.35   0.21 |   27.84   21.32 |  110.91  188.71 |     195.08        YES
C7552             4.52 |   1.23   0.33 |   20.06   16.25 |   63.61   65.93 |      68.06        YES
C5315             2.89 |   1.08   0.51 |   15.40   13.19 |   41.88   32.40 |      34.23         NO
i10               5.80 |   1.08   0.36 |   13.33   11.81 |   36.23   37.38 |      39.44        YES
pair              2.35 |   0.90   0.18 |    9.56    7.87 |   36.25   53.51 |      55.94        YES
C3540             3.66 |   0.90   0.51 |   11.00    9.65 |   37.15   23.03 |      23.95         NO
C2670             2.70 |   0.60   0.24 |    4.78    4.36 |   28.41   23.99 |      26.35         NO
i8                2.06 |   0.66   0.24 |    5.80    4.91 |   31.66   26.92 |      28.67         NO
apex6             1.51 |   0.60   0.21 |    4.88    4.26 |   32.27   27.97 |      30.09         NO
rot               2.79 |   0.57   0.18 |    4.05    3.64 |   22.89   23.05 |      24.80        YES
C880              2.20 |   0.60   0.27 |    4.64    4.35 |   20.22   20.03 |      20.79        YES
i7                1.53 |   0.54   0.12 |    5.26    4.41 |   52.69   61.15 |      72.54        YES
C1908             3.03 |   0.57   0.39 |    4.38    4.25 |   23.00   13.65 |      15.02         NO
dalu              2.57 |   0.60   0.36 |    4.97    4.45 |   25.87   16.69 |      17.51         NO
C1355             2.50 |   0.57   0.27 |    4.64    4.13 |   28.89   19.08 |      19.83         NO
C499              2.49 |   0.54   0.30 |    4.17    3.82 |   23.57   15.76 |      16.91         NO
i9                1.82 |   0.54   0.12 |    4.69    4.26 |   53.56   65.66 |      74.26        YES
C432              2.76 |   0.45   0.15 |    2.54    2.34 |   17.00   19.35 |      20.21        YES
x1                1.40 |   0.39   0.21 |    2.06    1.97 |   17.45   12.65 |      14.53         NO
i4                1.43 |   0.30   0.09 |    1.44    1.36 |   17.55   21.35 |      25.38        YES
alu2              3.23 |   0.42   0.36 |    2.44    2.54 |   15.57    8.20 |       9.43         NO
i5                3.15 |   0.24   0.09 |    1.07    1.02 |   15.96   16.93 |      20.72        YES
apex7             1.72 |   0.33   0.18 |    1.60    1.57 |   13.26   11.14 |      12.85         NO
my_adder          4.35 |   0.33   0.18 |    1.85    1.78 |   23.21   14.43 |      16.92         NO
term1             1.44 |   0.33   0.27 |    1.66    1.62 |   14.50    8.88 |      10.13         NO
ttt2              1.38 |   0.30   0.30 |    1.52    1.47 |   14.53    7.84 |       9.60         NO
go                1.20 |   0.13   0.15 |    0.41    0.43 |    7.30    3.70 |       4.50         NO
pm1               1.21 |   0.14   0.18 |    0.40    0.46 |    6.73    3.44 |       3.84         NO
Average           3.19 |   2.48   1.00 |    1.10    1.00 |    1.26    1.00 |      52.99
C. Package RLC Parasitics
To further consider the RLC effects, we applied a DIP-40
package model in IWTM. For simplicity of discussion, we
select C7552 as the representative to elaborate the
experimental results. Under IWTM scheduling, the wake-up
time is 10.54ns, the energy loss is 20.51pico-coulomb, and
the maximum surge current is 64.3mA, which satisfies the
surge-current constraint. Compared to the results without
the package, the wake-up time of C7552 increases from
0.33ns (in Table I of Section VI.A) to 10.54ns, due to the
inductively induced voltage fluctuation of VGNDs.
Next, with the same package model, we re-implement [12]
and set the time interval to one clock period. The wake-up
time is 185.32ns, the energy loss is 27.91pico-coulomb, and
the maximum surge current is 1.57mA. Compared to [12],
IWTM achieves a 17.58 times wake-up time reduction when
the package is applied. The results demonstrate that, even
with the package, IWTM still performs well on wake-up
time minimization.
Here, we would like to point out that simulating a design
with the package RLC parasitics is extremely
time-consuming. With the accuracy of NanoSim™ set to 6,
the simulation of the industrial design AES ran for seven
days without completing. As a result, it may not be practical
to simulate an entire SoC design with packages, especially
under the high accuracy setting. In the early stage of the
design process, the surge-current constraint is usually
applied as a popular alternative to ensure the reliability of a
design during the power mode transitions [1][17]. In this
paper, we also assume that a properly selected
surge-current constraint is an efficient and effective method
to maintain the design reliability.
VII. CONCLUSIONS
We have presented an effective and practical wake-up
schedule, IWTM, on the DSTN structure for wake-up time
minimization. The main idea of our schedule is to apply the
strictly decreasing property of the virtual ground to the
surge current estimation. IWTM also minimizes the energy
loss from spurious glitches while considering physical
implementation issues. The results show that, compared
with [12], our IWTM schedule achieves on average a 261
times wake-up time reduction and 30% less energy loss.
ACKNOWLEDGEMENT
The authors would like to thank the National Science
Council of Taiwan (NSC-97-2220-E-007-013) for
supporting this project.
REFERENCES
[1] A. Abdollahi, F. Fallah, and M. Pedram, “A Robust Power Gating
Structure and Power Mode Transition Strategy for MTCMOS
Design,” IEEE Transactions on VLSI Systems, vol. 15, no. 1,
pp. 80-89, Jan. 2007.
[2] A. Abdollahi, F. Fallah, and M. Pedram, “Leakage current reduction
in CMOS VLSI circuits by input vector control,” IEEE Transactions
on VLSI Systems, vol. 12, no. 2, pp. 140-154, Feb. 2004.
[3] K. Agarwal, H. Deogun, D. Sylvester, and K. Nowka, “Power Gating
with Multiple Sleep Modes,” Proc. of the ISQED, pp. 633-637, 2006.
[4] M. Anis, S. Areibi, M. Mahmoud, and M. Elmasry, “Dynamic and
Leakage Power Reduction in MTCMOS Circuits Using an
Automated Efficient Gate Clustering Technique,” Proc. of the DAC,
pp. 480-485, 2002.
[5] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing
for Digital, Memory and Mixed-Signal VLSI Circuits, Lucent
Technologies and Michael L. Bushnell, 2000.
[6] H. Chang and S. S. Sapatnekar, “Full-Chip Analysis of Leakage
Power Under Process Variations, Including Spatial Correlations,”
Proc. of the DAC, pp. 523-528, 2005.
[7] Y. T. Chen, D. C. Juan, M. C. Lee, and S. C. Chang, “An efficient
wake-up schedule during power mode transition considering spurious
glitches phenomenon,” Proc. of International Conference on
Computer-Aided Design, ICCAD, pp. 777-782, 2007.
[8] D. S. Chiou, S. H. Chen, S. C. Chang, and C. Yeh, “Timing Driven
Power Gating,” Proc. of the DAC, pp. 121-124, 2006.
[9] D. S. Chiou, D. C. Juan, Y. T. Chen, and S. C. Chang, “Fine-Grained
Sleep Transistor Sizing Algorithm for Leakage Power
Minimization,” Proc. of Design Automation Conference, DAC,
pp. 81-86, 2007.
[10] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein,
Introduction to Algorithms - Second Edition, The MIT Press, 2001.
[11] M. Keating, D. Flynn, R. Aitken, A. Gibbons, and K. Shi, Low
Power Methodology Manual for System-on-Chip Design, Springer,
Inc, 2007.
[12] S. Kim, S. V. Kosonocky, and D. R. Knebel, “Understanding and
Minimizing Ground Bounce During Mode Transition of Power
Gating Structures,” Proc. of the ISLPED, pp. 22-25, Aug. 2003.
[13] F. Li, L. He, and K. Saluja, “Estimation of Maximum Power-up
Current,” Proc. of the ASP-DAC, pp. 51-56, 2002.
[14] C. Long and L. He, “Distributed Sleep Transistor Network for
Power Reduction,” IEEE Transactions on VLSI Systems, vol. 12,
no. 9, pp. 937-946, Sep. 2004.
[15] C. Long, J. Xiong, and L. He, “On Optimal Physical Synthesis of
Sleep Transistors,” Proc. of the ISPD, pp. 156-161, 2004.
[16] S. Mutoh, S. Shigematsu, Y. Matsuya, H. Fukuda, T. Kaneko, and J.
Yamada, “A 1-V Multithreshold-Voltage CMOS Digital Signal
Processor for Mobile Phone Application,” IEEE JSSC, vol. 31,
no. 11, pp. 1795-1802, Nov. 1996.
[17] A. Ramalingam, A. Devgan, and D. Z. Pan, “Wakeup Scheduling in
MTCMOS Circuits using Successive Relaxation to Minimize Ground
Bounce,” ASP Journal of Low Power Electronics (JOLP), vol. 3,
no. 1, April 2007.
[18] R. R. Rao, A. Devgan, D. Blaauw, and D. Sylvester, “Parametric
Yield Estimation Considering Leakage Variability,” Proc. of the
DAC, pp. 442-447, 2004.
[19] K. Roy, S. Mukhopadhyay, and H. M. Meimand, “Leakage Current
Mechanisms and Leakage Reduction Techniques in
Deep-Submicrometer CMOS Circuits,” Proc. of the IEEE,
pp. 305-327, Feb. 2003.
[20] P. Royannez, H. Mair, F. Dahan, M. Wagner, M. Streeter, L. Bouetel,
J. Blasquez, H. Clasen, G. Semino, J. Dong, D. Scott, B. Pitts, C.
Raibaut, and U. Ko, “90nm Low Leakage SoC Design Techniques for
Wireless Applications,” Proc. of the ISSCC, pp. 138-140, 2005.
[21] A. Sagahyroon and F. Aloul, “Maximum Power-Up Current
Estimation in Combinational CMOS Circuits,” Proc. of the IEEE
MELECON, pp. 70-73, May 2006.
[22] K. Shi and D. Howard, “Sleep Transistor Design and
Implementation - Simple Concepts Yet Challenges To Be Optimum,”
Proc. of the VLSI-DAT, pp. 1-4, 2006.
[23] J. P. Uyemura, Introduction to VLSI Circuits and Systems, John
Wiley & Sons, Inc, 2002.
Da-Cheng Juan received the B.S. and M.S. degrees in
computer science from National Tsing Hua University,
Hsinchu, Taiwan, in 2005 and 2007, respectively.
His research interests include low-power
related topics, electronic design automation
(EDA) techniques, and algorithm design. Mr.
Juan was the recipient of the Academic
Achievement Award from the National Tsing
Hua University in 2003, the Masterpiece prize of the
National College Programming Contest in 2004, and the
Two-Strait College Programming Contest Award (third place)
from National Tsing Hua University in 2005. He is currently
with the Department of Electronic Engineering and Computer
Science, National Tsing Hua University.
Yu-Ting Chen received the B.S. degrees in both
computer science and economics and the M.S.
degree in computer science from National Tsing
Hua University, Hsinchu, Taiwan, R.O.C., in
2005 and 2007, respectively.
His research interests include power-gating
design issues, low power circuit design
techniques, and related topics of electronic design
automation (EDA).
In 2005, he was with Taiwan Semiconductor
Manufacturing Company Ltd. (TSMC) as a summer intern. He is currently
with the Department of Electronic Engineering and Computer Science,
National Tsing Hua University.
Ming-Chao Lee received the B.S. degree from the
Department of Information and Computer
Engineering, Chung Yuan Christian
University, ChungLi, Taiwan, R.O.C., in 2002, and
the M.S. degree in Computer Science and
Information Engineering from the National Dong
Hwa University, Hualien, Taiwan, R.O.C., in
2004.
He is currently pursuing the Ph.D. degree in
the Department of Computer Science,
National Tsing Hua University, Hsinchu, Taiwan, R.O.C. His current
research interests include VLSI electronic design automation, low-power
circuit design issues, and power-gating design techniques.
Shih-Chieh Chang (S’92–M’95) received the
B.S. degree in electrical engineering from the
National Taiwan University, Taiwan, R.O.C., in
1987, and the Ph.D. degree in electrical
engineering from the University of California,
Santa Barbara, in 1994. He is currently a
Professor with the Department of Computer
Science, National Tsing Hua University, Hsinchu,
Taiwan, R.O.C.
From 1995 to 1996, he worked with Synopsys,
Inc., Mountain View, CA. From 1996 to 2001, he was on the faculty of
the Department of Computer Science and Information Engineering,
National Chung Cheng University, Chiayi, Taiwan, R.O.C. His current
research interests include logic synthesis, functional verification for SoC,
and noise analysis. Dr. Chang was a recipient of a Best Paper Award at the
1994 Design Automation Conference.