
Degree project
Computer Engineering Programme -98
Department of Computer Engineering
Mälardalen University
Beatrice Jakobsson 751017-0223 [email protected]
Supervisor: Raimo Haukilahti, [email protected]
Examiner: Lennart Lindh
DPM
Dynamic power management
Report
2002-03-25
SUMMARY
Dynamic Power Management (DPM) is a technique to reduce the power consumption of
electronic systems. The most common way to do that is to shut down idle components. There
are four groups of approaches: timeout, predictive, stochastic and OS-based power
management. A couple of algorithms from each group are summarised. The advanced
configuration and power interface, ACPI, can be used to make some of the algorithms in this
report easier to implement.
FOREWORD
This thesis is part of a master's degree within the computer engineering programme at
Mälardalen University in Västerås, Sweden.
I would like to thank my supervisor Raimo Haukilahti at the Department of Computer
Engineering for all his help during this project. I would also like to thank my cat and horse for
their support and for helping me clear my head from time to time.
TABLE OF CONTENTS
1 INTRODUCTION
2 DPM STRATEGIES
2.1 TIMEOUT:
2.1.1 Adaptive timeout:
2.1.2 Device dependent timeout:
2.2 PREDICTIVE TECHNIQUES:
2.2.1 L-shape:
2.2.2 Exponential average:
2.2.3 Predictive wakeup:
2.2.4 Predictive shutdown:
2.2.5 Adaptive disk shutdown:
2.3 STOCHASTIC CONTROL:
2.3.1 Stochastic model
2.3.2 Sliding windows:
2.3.3 Competitive:
2.3.4 Learning tree:
2.4 OPERATING SYSTEM-DIRECTED POWER MANAGEMENT:
2.4.1 Advanced configuration and power interface, ACPI
2.4.2 Task-based power management:
2.4.3 Task scheduling:
2.5 SYMBOLS AND MEANINGS:
2.6 DISK PARAMETERS:
3 CONCLUSIONS
4 REFERENCES
1 INTRODUCTION
Dynamic Power Management (DPM) is a technique to reduce the power consumption of
electronic systems. The widespread use of cellular phones, laptop computers and other
portable, battery-powered electronic appliances requires energy use to be low in order to
extend their operating time. It is also important to reduce the environmental impact (e.g.
noise from cooling) while keeping performance high. This can be achieved by selectively
shutting down idle components or by slowing the system down. This report only covers
algorithms that shut down devices. The device most commonly shut down is the hard disk,
because spinning the platters up and down takes time and the disk is not needed while the
required information is in physical memory. Other devices that can be shut down are I/O
controllers, displays and network interface cards. The quality of these algorithms depends
mostly on knowledge of the user's behaviour, which in many cases is unknown or
non-stationary. A good DPM algorithm should therefore be able to adapt to changes in the
user's behaviour.
The break-even time is the minimum length of an idle period for which shutting down saves
power. Every power state change produces an overhead, because shutdown and wakeup cost
extra power. The minimum sleeping time is the time a device has to stay in the sleeping state
to compensate for this overhead.
A threshold can be defined as the minimum idle time needed to reach the break-even time.
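As a rough illustration, the break-even time can be computed from the symbols listed in section 2.5. The sketch below assumes one common formulation: sleeping pays off once the power saved, Pw - Ps, has covered the transition energy Eo, and the idle period can never be shorter than the transition delay To. The function name and the formula are assumptions made for this example, not taken from the referenced papers. The input values are the IBM disk parameters from section 2.6.

```python
def break_even_time(p_sleep, p_work, t_overhead, e_overhead):
    """Shortest idle period (s) for which sleeping saves energy (assumed formula)."""
    if p_work <= p_sleep:
        raise ValueError("working power must exceed sleeping power")
    # Staying awake for t seconds costs p_work * t.
    # Sleeping costs e_overhead + p_sleep * (t - t_overhead).
    # Setting both equal gives the energy break-even point.
    t_energy = (e_overhead - p_sleep * t_overhead) / (p_work - p_sleep)
    # The idle period must at least cover the shutdown and wakeup delay.
    return max(t_overhead, t_energy)


# IBM disk: Ps = 0.75 W, Pw = 3.48 W, To = 7.48 s, Eo = 53.58 J
ibm_tbe = break_even_time(0.75, 3.48, 7.48, 53.58)
```

With these inputs the result is about 17.6 s, which agrees with the Tbe listed for that disk in section 2.6.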
The Oracle power manager (OPM) is the ideal solution to the dynamic power management
problem. OPM knows exactly when the next request will arrive. It shuts down the system
immediately if the idle period is longer than the break-even time. This algorithm saves the
most power but cannot exist in reality. OPM is used as a reference point for comparison with
other algorithms and can be simulated by analysing a request trace offline.
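An offline simulation of the OPM could look roughly like the following. This is a minimal sketch, assuming the trace has already been reduced to a list of idle period lengths and assuming a simple energy model: an idle period spent awake costs Pw per second, and an idle period spent asleep costs Eo plus Ps for the time outside the transitions. The function names, the trace and the energy model are illustrative assumptions. The disk parameters are the IBM values from section 2.6.

```python
def oracle_energy(idle_periods, p_sleep, p_work, t_overhead, e_overhead, t_be):
    """Energy (J) an oracle spends over the idle periods of a recorded trace."""
    total = 0.0
    for t_idle in idle_periods:
        if t_idle > t_be:
            # Long enough to pay off: shut down at once and pay the overhead.
            total += e_overhead + p_sleep * (t_idle - t_overhead)
        else:
            # Too short: stay in the working state.
            total += p_work * t_idle
    return total


def always_on_energy(idle_periods, p_work):
    """Energy (J) spent if the device never sleeps."""
    return p_work * sum(idle_periods)


trace = [2.0, 40.0, 5.0, 120.0]  # idle period lengths in seconds (made up)
oracle = oracle_energy(trace, 0.75, 3.48, 7.48, 53.58, 17.6)
always_on = always_on_energy(trace, 3.48)
```

Any realisable policy evaluated on the same trace and energy model should land between these two values, which is what makes the OPM useful as a reference point.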
[Figure: classification of the DPM algorithms in this report.
Timeout: adaptive, device dependent.
Predictive: L-shape, exponential average, predictive wakeup, predictive shutdown, adaptive disk shutdown.
Stochastic: sliding window, competitive, learning tree.
OS-directed: task-based, task scheduling.]
2 DPM STRATEGIES
The DPM algorithms can be classified into three groups: timeout, predictive and
stochastic. In addition there is OS-directed power management. Policy optimisation involves
not only the choice of when to perform state transitions but also the choice of which transition
to perform.
2.1 Timeout:
The system shuts down after a certain timeout value £. In some cases the user can adjust
this time, but most often it is fixed. This approach wastes energy because the system
remains on for a certain time even if there are no incoming requests. The system wakes up
when a new request arrives.
2.1.1 Adaptive timeout:
1) £ is adjusted automatically depending on the ratio between £ and the latest idle
period. If the ratio is small, £ increases, and if the ratio is large, £ decreases.
2) £ is updated asymmetrically, either by increasing £ by one second or by decreasing it by
half a second.
3) £ changes according to the latest busy period. If that period is short, £ decreases, and if it
is long, £ increases [12].
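The first two variants can be combined into a short sketch. The source does not say where "small" and "large" begin, so the ratio bounds `low` and `high` below are assumptions, as is the lower limit `minimum`. The step sizes of one second up and half a second down are borrowed from variant 2.

```python
def adapt_timeout(timeout, last_idle, low=0.5, high=2.0,
                  step_up=1.0, step_down=0.5, minimum=0.5):
    """Return the new timeout after an idle period of length last_idle (seconds)."""
    if last_idle <= 0:
        return timeout  # nothing to learn from an empty idle period
    ratio = timeout / last_idle
    if ratio < low:
        # Timeout is small compared to the idle period: increase it.
        return timeout + step_up
    if ratio > high:
        # Timeout is large compared to the idle period: decrease it.
        return max(minimum, timeout - step_down)
    return timeout
```

For example, a timeout of 5 s followed by a 20 s idle period gives a ratio of 0.25, so the timeout grows to 6 s under these assumed bounds.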
2.1.2 Device dependent timeout:
This algorithm chooses a timeout value based on the break-even time of the device under
control. If £ is set equal to the break-even time, the algorithm uses at most twice the
energy of the OPM. This algorithm may have some connection with the c-competitive
algorithm (described in 2.3.3), but not much has been written about it [8].
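The factor of two can be checked numerically in a simplified model. The sketch below assumes that the sleeping power is zero and that one shutdown and wakeup costs exactly as much energy as staying awake for the break-even time. Both are assumptions made to keep the example short, and the function names are illustrative.

```python
def timeout_cost(t_idle, p_work, t_be):
    """Energy with timeout equal to t_be (simplified model, see assumptions)."""
    if t_idle <= t_be:
        return p_work * t_idle          # request arrives before the timeout
    # Waited for t_be in the working state, then paid the transition energy.
    return p_work * t_be + p_work * t_be


def oracle_cost(t_idle, p_work, t_be):
    """Energy of the OPM: the cheaper of staying awake and shutting down at once."""
    return min(p_work * t_idle, p_work * t_be)


idle_lengths = [0.5, 5.0, 10.0, 10.1, 50.0, 500.0]
worst_ratio = max(timeout_cost(t, 1.0, 10.0) / oracle_cost(t, 1.0, 10.0)
                  for t in idle_lengths)
```

In this model the ratio is 1 for idle periods up to the break-even time and 2 for all longer ones, so the timeout policy never uses more than twice the energy of the OPM.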
2.2 Predictive techniques:
These algorithms predict the length of an idle period before it starts. If the idle period is
predicted to be longer than the break-even time, the device is put to sleep as soon as it
becomes idle.
2.2.1 L-shape:
Short busy periods are followed by long idle periods, and vice versa. The situation with short
busy periods followed by short idle periods, which corresponds to the lower left corner of
the L, is not handled. This algorithm works well on a well-traced system that has been shown
to have this type of workload [8].
Figure 1: a) Long busy periods are followed by short idle periods. b) Short busy periods are
followed by long idle periods. c) Short busy periods followed by short idle periods are not
handled.
Figure 2: L-shape
2.2.2 Exponential average:
This predictive algorithm uses both the predicted and the actual length of the previous idle
period to predict the next idle period. If P(n) is the prediction for the idle period I(n), then
P(n+1) = a*I(n)+(1-a)*P(n). The constant a is between 0 and 1. P(n+1) is an average of
previous idle periods with exponential weights. If P(n) is longer than the break-even time the
device shuts down. a controls the relative weight of recent and past history in the
predictions. If a = 0, only the last predicted value counts and the recent history has no effect.
If a = 1, the prediction only takes the most recent idle period into account and ignores the
previous predictions. This algorithm also has a mechanism for correcting prediction misses [6].
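The update rule is small enough to sketch directly. The function names, the starting prediction of 10 s and the idle period lengths below are assumptions for the example. The correction of prediction misses mentioned above is not included.

```python
def predict_next_idle(previous_prediction, last_idle, a=0.5):
    """P(n+1) = a*I(n) + (1-a)*P(n), with 0 <= a <= 1."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be between 0 and 1")
    return a * last_idle + (1.0 - a) * previous_prediction


def should_shut_down(prediction, t_be):
    """Shut down only if the predicted idle period exceeds the break-even time."""
    return prediction > t_be


prediction = 10.0                      # assumed starting prediction (s)
for idle in [30.0, 30.0, 2.0]:         # observed idle periods (s), made up
    prediction = predict_next_idle(prediction, idle, a=0.5)
```

With a = 0.5 the prediction moves from 10 to 20, then 25, then 13.5 seconds, which shows how a single short idle period pulls the estimate down without erasing the earlier history.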
2.2.3 Predictive wakeup:
To reduce the performance penalty that is always paid on wakeup, the power manager
performs a predictive wakeup. The system wakes up when the predicted idle time expires,
even if no new requests have arrived. If Tidle has been under-predicted this choice may
increase power dissipation, but the delay for servicing the first incoming request after an idle
period is decreased [1].
2.2.4 Predictive shutdown:
Golding (96): The decision to shut down is based on observation of the previous idle and busy
periods. As soon as a new idle period begins a new decision is taken. If the result of a certain
equation is bigger than the break-even time the system shuts down.
Srivastava (96): The decision to shut down is based on observation of the most recent busy
period. As soon as a new idle period begins a new decision is taken. If the busy period is
shorter than the threshold, a state change is made. This approach fits best with an L-shape
diagram, where short busy periods are followed by long idle periods.
2.2.5 Adaptive disk shutdown:
The algorithm does not make a decision for every incoming request; instead it uses
sessions. A session is a group of requests that are close in time, and two sessions are
separated by an idle time of at least a certain length, called the threshold. The algorithm
predicts the length of the sessions and shuts down the hard disk between sessions.
Figure 3: Sessions (grey bars) for thresholds of 1, 2, 3 and 4 seconds.
Every request is shown as an arrow on the time axis.
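Grouping requests into sessions can be sketched as follows, taking the length of a session as the time from its first to its last request. The request times are made up, and treating a gap exactly equal to the threshold as part of the same session is an assumption.

```python
def group_sessions(request_times, threshold):
    """Split ascending request times (s) into sessions separated by long gaps."""
    sessions = []
    for t in request_times:
        if sessions and t - sessions[-1][-1] <= threshold:
            sessions[-1].append(t)      # close to the previous request
        else:
            sessions.append([t])        # gap longer than threshold: new session
    return sessions


def session_lengths(sessions):
    """Length of each session, from its first request to its last."""
    return [s[-1] - s[0] for s in sessions]


requests = [0.0, 0.5, 1.0, 4.0, 4.5, 12.0]
short_threshold = group_sessions(requests, 1.0)
long_threshold = group_sessions(requests, 3.0)
```

A larger threshold merges neighbouring sessions, which is the effect Figure 3 illustrates for thresholds of 1 to 4 seconds.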
Shutting down the disk between sessions causes three problems:
-A decrease in performance while waiting for the platters to spin up
-Extra energy is required to accelerate the platters
-Higher failure rate
The length of a session is counted from the first request to the last one in that particular
session. Every session begins with a request separated from the previous one by a long idle
time, the threshold. To choose the right threshold, traces must be analysed. A one-week
trace is long enough to compute an appropriate value. When a state change is considered,
three conflicting goals must be weighed against each other: low power consumption, high
performance and low failure rate. If the disk is sleeping and the input queue is empty, the disk
keeps on sleeping, but if there are requests in the queue the disk must spin up. If the disk is
spinning and a request arrives it keeps on spinning, and at this moment the PM increases the
predicted length of the current session. The difficulty arises when the disk is spinning and the
input queue is empty. The PM then decreases the predicted session length by an adjustment
parameter instead of shutting down the system immediately. When the input queue has been
empty long enough compared to the predicted session length the PM issues a shutdown.
/* PL/AL: predicted/actual session length */
/* a: attenuation factor */
/* SE: predicted session end time */
/* Th1, Th2: thresholds */
/* inc1, inc2: increment constants */
/* request_arrived: true if a request arrived in this step */
switch (state)
{
case spinUp:
    state = spinning;
    PL = a * PL + (1 - a) * AL;   /* update the predicted session length */
    SE = now + PL;
    break;
case spinDown:
    state = sleeping;
    break;
case sleeping:
    if (request_arrived)
        { state = spinUp; }
    break;
case querySpinDown:               /* decide whether to spin down */
    if ((now > SE) && ((now - SE) / PL > Th2))
        { state = sleeping; }
    else
        { state = spinning; }
    break;
case spinning:
    if (request_arrived) {
        /* session runs longer than predicted: extend the prediction */
        if ((now > SE) && ((now - SE) / PL < Th1))
            { PL += inc1;
              SE += inc2; }
    } else {
        /* queue is empty: shorten the prediction and reconsider */
        state = querySpinDown;
        PL -= inc1;
        SE -= inc2;
    }
    break;
}
Figure 4: Adaptive algorithm
The adjustment parameter affects the performance of the algorithm. If it is too small the
algorithm behaves like a fixed timeout, because the predicted length is adjusted too slowly. If
the parameter is too large the algorithm becomes very sensitive to variations in the time
between two consecutive requests [3].
2.3 Stochastic control:
Some stochastic approaches are not covered in this report, such as continuous-time
Markov processes, discrete-time Markov processes and time-indexed semi-Markov processes [14].
2.3.1 Stochastic model
Modelling the device and the requests as stochastic processes provides the flexibility to
trade off power against performance. The stochastic model consists of the following
elements [13]:
- A service requester, SR, which sends requests to the service provider. SR is modelled as
a Markov chain with transition matrix P. The simplest model handles one request
per period, i.e. it has two states. If there was a request at time tn, the probability of
getting a request at time tn+1 is 0.88.
\[ P = \begin{pmatrix} 0.95 & 0.05 \\ 0.12 & 0.88 \end{pmatrix} \]
(rows: state at time tn, columns: state at time tn+1; state 0 = no request, state 1 = request)
Figure 5: The service requestor with two states and its transition matrix P.
- A service provider, SP, serves incoming requests from a workload source. In each time
interval it can only be in one state. Each state is characterized by a level of
performance and by a power consumption level. Transitions between power states
take place once per period and are controlled by a power manager through commands.
If an s_on command is issued when the model is in state off, there is a probability of
0.97 of staying in state off, and only 0.03 of moving to state on.
\[ P(s\_on) = \begin{pmatrix} 1 & 0 \\ 0.03 & 0.97 \end{pmatrix} \qquad P(s\_off) = \begin{pmatrix} 0.8 & 0.2 \\ 0 & 1 \end{pmatrix} \]
(rows: current state, columns: next state; states ordered on, off)
Figure 6: The service provider and its matrices.
- A queue. The requests that arrive during one period are buffered here. The queue can
use FIFO or some other discipline. A request is processed and serviced within the
same period with a probability that depends on the power state of the SP. The queue
length is a Markov process with transition matrix P. If the queue is in state 0, the
command is on and a request arrives, there is a probability of 0.2 of moving to state 1,
and 0.8 of staying in state 0.
\[ P(on,0) = \begin{pmatrix} 1.0 & 0.0 \\ 0.8 & 0.2 \end{pmatrix} \qquad P(on,1) = \begin{pmatrix} 0.8 & 0.2 \\ 0.8 & 0.2 \end{pmatrix} \]
\[ P(off,0) = \begin{pmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{pmatrix} \qquad P(off,1) = \begin{pmatrix} 0.0 & 1.0 \\ 0.0 & 1.0 \end{pmatrix} \]
(rows: current queue length, columns: next queue length; states ordered 0, 1)
Figure 7: The queue and transition matrices.
- A power manager, PM, is a component that communicates with the SP and attempts
to set its state at the beginning of each period by issuing commands chosen from a
finite set A. The PM contains all specifications and collects all information
needed to implement a power management policy. The state of the system, s,
combines the states of the SP, the SR and the queue. s is a Markov
chain over S = Sr x Sp x Sq whose transition matrix P(a) depends on the command a
issued to the SP by the PM. Hence, the system is fully described by a set of Na
transition matrices, one for each command.
- Cost metrics. Some different cost metrics are the “expected power consumption level”
c(sp, s) per unit time and the “performance penalty” d(sq) per unit time. The performance
penalty depends on the length of the queue, i.e. the number of jobs in the queue. A natural
way to define it is as the queue length: d(s) = sq. There is also a
“request loss” per period, b(sr, sq).
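Two quantities follow directly from the numbers quoted above and may help to read the model. The sketch below uses the service requester probabilities (0.95, 0.05, 0.12, 0.88) and the 0.03 wakeup probability of the service provider. The helper names and the use of repeated multiplication to find the long-run distribution are choices made for this example.

```python
def step(distribution, matrix):
    """One period of a Markov chain: row vector times transition matrix."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n))
            for j in range(n)]


def stationary(matrix, iterations=1000):
    """Approximate the long-run state distribution by repeated stepping."""
    dist = [1.0 / len(matrix)] * len(matrix)
    for _ in range(iterations):
        dist = step(dist, matrix)
    return dist


# Service requester: state 0 = no request, state 1 = request
P_SR = [[0.95, 0.05],
        [0.12, 0.88]]
# Service provider under command s_on: states ordered on, off
P_SP_ON = [[1.0, 0.0],
           [0.03, 0.97]]

sr_long_run = stationary(P_SR)
# Wakeup succeeds with probability 0.03 per period, so the waiting time is
# geometric with mean 1 / 0.03 periods.
expected_wakeup_periods = 1.0 / P_SP_ON[1][0]
```

Under these numbers a request is present in roughly 29 % of the periods in the long run, and a sleeping service provider needs about 33 periods on average to reach state on after s_on is issued.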
2.3.2 Sliding windows:
Sliding window is based on the stochastic model and is used for non-stationary service
requests. A sliding window (W) consists of a certain number of slots (WS) and each slot
stores one previous user request.
W(0)  W(1)  W(2)  W(3)  W(4)  W(5)  ...  W(WS-2)  W(WS-1)
 1     1     0     0     1     0    ...     1        0
Figure 8: Single sliding window
To predict future user requests it is important to keep a history of old user requests; in this
case the history is used to update the transition matrix P. The update of P is done in several
steps. Let
\[ l = \sum_{k=1}^{WS-1} \big( W(k) = i \big) \]
where “=” is an equivalence operation with a Boolean output and i is the row index of
P(i, j). Then
\[ P(i,j) = \begin{cases} \dfrac{1}{l} \displaystyle\sum_{k=1}^{WS-1} \big[ (W(k) = i) \wedge (W(k-1) = j) \big] & \text{if } l \neq 0 \\ 0 & \text{if } l = 0 \text{ and } i = j \\ \dfrac{1}{S-1} & \text{otherwise} \end{cases} \]
When this has been done four times the matrix P is updated. The diagonal elements of P are
called user request probabilities. The basic window operation is to shift by one slot
every time slice. The shutdown decision is evaluated every period, even when the device is in
the sleeping state, which causes overhead. There can be either a single-window or a
multi-window approach. When a multi-window approach is used there are as many windows
as the service requester has states [4][7].
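The update of P can be sketched for a single window as follows. The sketch assumes that slot W(0) holds the newest request state, so that W(k-1) is the state that followed W(k). This ordering is an assumption, as are the function name and the example window, which reuses the values shown in Figure 8.

```python
def estimate_transition_matrix(window, num_states=2):
    """Estimate P(i, j) from a sliding window with the newest entry in slot 0."""
    matrix = []
    for i in range(num_states):
        # l: number of slots k >= 1 that hold state i
        l = sum(1 for k in range(1, len(window)) if window[k] == i)
        row = []
        for j in range(num_states):
            if l != 0:
                # Count how often state i was followed by state j.
                hits = sum(1 for k in range(1, len(window))
                           if window[k] == i and window[k - 1] == j)
                row.append(hits / l)
            elif i == j:
                row.append(0.0)
            else:
                row.append(1.0 / (num_states - 1))
        matrix.append(row)
    return matrix


window = [1, 1, 0, 0, 1, 0, 1, 0]
P = estimate_transition_matrix(window)
```

For this window, state 0 is followed by state 0 once and by state 1 three times, giving the first row 0.25 and 0.75. Each row sums to one, including rows for states that never occur in the window.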
2.3.3 Competitive:
This algorithm, “c-competitive”, is an on-line algorithm that can find a solution with a cost of
less than c times the cost generated by OPM. If £ is equal to the break-even time this
algorithm is proven to be 2-competitive. On-line algorithms whose performance is within the
smallest possible constant factor of the optimum offline algorithm are said to be strongly
competitive. A competitive algorithm is said to be either “c-competitive against a weak
adversary” or “c-competitive against a strong adversary”. If the adversary is weak, the request
sequence is made without regard to the non-deterministic choices made by the on-line
algorithm. If the adversary is strong, it chooses each input request depending on the choices
made by the algorithm in servicing the previous requests. Unfortunately not much has been
written about this algorithm [7][12].
2.3.4 Learning tree:
The adaptive learning tree algorithm can control multiple sleeping states, and because of
that there are multiple thresholds. At the very beginning of a new idle period the
algorithm can determine the most appropriate low-power sleep state. A sequence of idle
periods is transformed into a sequence of discrete events by an idle period clustering
technique.
\[ I_i = \frac{(pd_{i+1} - p_{i+1})\, td_{i+1} + (pu_{i+1} - p_{i+1})\, tu_{i+1}}{p_i - p_{i+1}} - \frac{(pd_i - p_i)\, td_i + (pu_i - p_i)\, tu_i}{p_i - p_{i+1}} \]
pi is the power consumption in power state i. pdi and tdi are the power consumption level and
the transition time for the transition from the idle state to power state i, and pui and tui are
the corresponding values for the transition from power state i back to the idle state. The
sequence is transformed into integers that represent the best power state that could be
chosen for each idle period.
\[ IG(t_{idle}) = \begin{cases} 0 & \text{if } t_{idle} < I_0 \\ i+1 & \text{if } I_i < t_{idle} < I_{i+1} \text{ for } 0 \le i < n \\ n & \text{if } I_n < t_{idle} \end{cases} \]
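The mapping from an idle period to a power state index can be sketched as a lookup in the sorted thresholds. The threshold values below are made up, the function name is illustrative, and capping the result at n follows the last case of the definition above.

```python
from bisect import bisect_left


def best_sleep_state(t_idle, thresholds):
    """Map an idle period to a power state index, given ascending I0..In."""
    n = len(thresholds) - 1
    # bisect_left counts the thresholds strictly below t_idle.
    return min(bisect_left(thresholds, t_idle), n)


thresholds = [1.0, 5.0, 20.0]          # assumed values for I0, I1, I2 (s)
events = [best_sleep_state(t, thresholds) for t in [0.5, 3.0, 10.0, 100.0]]
```

The resulting list of integers is the kind of discrete event sequence that the tree described below is trained on.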
The tree structure includes decision nodes, history branches, prediction branches and leaf
nodes. All nodes are numbered from 0 to n from left to right. Every leaf node is a
prediction for the next idle period and stores a Prediction Confidence Level (PCL). The
higher the PCL, the higher the confidence in the prediction. A decision node always has
n branches in total, each either a history or a prediction branch. A decision is made with a
path matching procedure. All leaf nodes on the path are checked and the one with the highest
PCL is selected. After the path matching, the index of the selected leaf node becomes the
prediction for the next event.
Figure 9: An adaptive learning tree with two sleep states. Decision nodes (circles), history
branches (solid lines), prediction branches (dashed lines), and leaf nodes (rectangles).
If the sequence is s = (0, 1), the path is a -> b -> e, and the leaf node with the highest PCL is
selected (the right one). Thus the tree predicts IG(tidle) = 2 for the next idle period. When a
correct prediction has been made, the tree should be updated to increase the probability of
choosing the same leaf node for the given sequence. This is achieved by updating the PCL of
the leaf nodes [5].
2.4 Operating System-directed Power Management:
With OS-directed power management, dynamic power management can be achieved in two
different ways: by adjusting the CPU clock speed or by putting devices into a sleep
state [9]. This report covers the latter.
2.4.1 Advanced configuration and power interface, ACPI
ACPI is a uniform HW/SW interface for power management proposed by Intel, Microsoft and
Toshiba. It specifies an abstract and flexible interface between hardware components, such as
VLSI chips, disk drivers, display drivers etc., and the power manager. One key assumption
behind ACPI is that the power manager is a module of the operating system (OS). ACPI
specifies neither the power management policy nor the information that should be used
to drive the policy’s decisions. ACPI controls the power states of the whole system as well as
the power state of each device. An ACPI compliant system has five global states and an ACPI
compliant device has four states. The sleeping states are differentiated by the power consumed
and the time needed to wake up [2][10].
[Figure: layered view of an ACPI system. From top to bottom: applications; the OS kernel
with the power manager (PM), device drivers, ACPI drivers and AML interpreter; the table,
BIOS and register interfaces; ACPI tables, ACPI BIOS and ACPI registers; BIOS; platform
hardware (motherboard devices, chipset, CPU).]
Figure 10: ACPI interface and PC platform
2.4.2 Task-based power management:
Task-based power management (TBPM) is a software-centric approach. It classifies requests
according to their sources, which gives the power manager more information about future
requests. It also uses the knowledge available in the operating system. Because of this, TBPM
knows when a task is created, executed and terminated, it can tell tasks apart, and it considers
the CPU time of each task when deciding on a power state change. TBPM uses a
two-dimensional data structure called the “device-requester utilization matrix” U. It also
creates a vector called the “processor utilization vector” P. The matrix U stores the relation
between devices and requesters. The rows hold the devices, whose number depends on the
system, while the requesters are stored in the columns and are unlimited in number. U is
updated with the same approach as in exponential average, P(n+1) = a*I(n)+(1-a)*P(n). In
this case a value of a between 0.2 and 0.8 gives satisfactory results. P contains the percentage
of CPU time spent executing each task r, and it is updated using a sliding window to compute
how the CPU time is distributed among processes. The window should, as always, be neither
too big nor too small. It has to be large enough to sample the execution of all processes and
still small enough to reflect workload variations quickly. In this example the window has a size
of two minutes. To shut down a device the break-even time must be bigger than the total
utilization U(d). U(d) is the request rate for device d created by all tasks and can be
calculated by multiplying U(d, r) by P(r) [9].
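The calculation of U(d) can be sketched as below, assuming that the products U(d, r) * P(r) are summed over all tasks r. The task names, the request rates and the CPU shares are made up for the example, and the update function simply reuses the exponential average rule quoted above.

```python
def update_utilization(old_value, observed, a=0.5):
    """Exponential average update of one entry U(d, r), with a in 0.2..0.8."""
    return a * observed + (1.0 - a) * old_value


def device_utilization(u_row, cpu_share):
    """U(d): sum over tasks r of U(d, r) * P(r) (assumed to be a sum)."""
    return sum(u_row[task] * cpu_share[task] for task in u_row)


# Row of U for one device (here a disk): request rate per task
u_disk = {"editor": 0.2, "backup": 4.0}
# Processor utilization vector P: share of CPU time per task
cpu_share = {"editor": 0.9, "backup": 0.1}

total = device_utilization(u_disk, cpu_share)
```

Weighting by CPU share means that a task that issues many requests but rarely runs, such as the backup task here, contributes less to U(d) than its raw request rate suggests.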
2.4.3 Task scheduling:
This algorithm uses task scheduling and tries to make the idle periods as long as possible to
save energy.
[Figure: timelines of tasks T1, T2 and T3 with their idle periods.]
Figure 11: Two schedules of three independent tasks. The second is made by the task
scheduling algorithm.
Every task has a required device set, RDS, which is used in the task selection. The RDS is
defined as ε(t, n) and tells which devices a task needs at a certain time. The current RDS is
called εc. The task selection is done in three steps:
1. Select a task that has the same RDS as εc. This avoids possible state
transitions.
2. If no task has an RDS equal to εc, the scheduler tries to find a task that allows some
device that was previously busy to be shut down.
3. If neither step succeeds, the task with the best potential to save power in the future is
selected. This potential is calculated by:
max ε  ( Pw, d – Ps, d / Tbe, d – Id ( r + kε ))
This algorithm can also schedule multiple devices but of course it becomes more complex.
There can be one device in sleeping state and one in working state, but the idle periods for
each device are as long as possible [11].
2.5 Symbols and meanings:
Symbols   Meanings
Ps        Power in sleeping state
Pw        Power in working state
Tsd       Shutdown delay
Esd       Energy to shut down
Twu       Wakeup delay
Ewu       Energy to wake up
Tbe       Break-even time
Eo        Total energy to shut down and wake up
To        Total delay for shut down and wake up
ε         Required device set (RDS)
εc        Current RDS
Table 1: Symbols and meanings
2.6 Disk parameters:
Table 2 lists disk parameters for an IBM DTTA 350640 hard disk, a Fujitsu MHF 2043AT hard
disk and a Hitachi DK23AA-60 hard disk. Waking up a disk takes approximately 2-48 times
as much energy and time as shutting it down [5][8].
Model     Ps (watt)  Pw (watt)  To (sec)  Eo (J)  Tbe (sec)
IBM       0.75       3.48       7.48      53.58   17.6
Fujitsu   0.13       0.95       1.93      4.82    5.43/6.39
Hitachi   0.39       0.78       10.72     17.83   35.0
Table 2: Disk parameters
3 CONCLUSIONS
In this report fourteen different DPM algorithms are described. Some of them are simple
timeout strategies and some are very complex models that can handle multiple sleeping states
and/or multiple devices. A few of them are described only briefly; further research may
reveal more about them. This report also briefly describes the advanced configuration and
power interface, which makes implementation of DPM algorithms easier.
4 REFERENCES
[1] L. Benini, A. Bogliolo, G. De Micheli. A survey of design techniques for system-level
dynamic power management. IEEE Transactions on VLSI Systems, March 2000.
[2] L. Benini, A. Bogliolo, S. Cavallucci, B. Riccò. Monitoring system activity for
OS-directed dynamic power management.
[3] E.-Y. Chung, G. De Micheli. Adaptive hard disk power management on personal
computers. In Great Lakes Symposium on VLSI, pages 50-53, 1999.
[4] E.-Y. Chung, L. Benini, A. Bogliolo, G. De Micheli. Dynamic power management for
non-stationary service requests. In Design Automation and Test in Europe, 1999.
[5] E.-Y. Chung, L. Benini, G. De Micheli. Dynamic power management using adaptive
learning tree. In International Conference on Computer-Aided Design, pages 274-279, 1999.
[6] C.-H. Hwang, A. C. Wu. A predictive system shutdown method for energy saving of
event-driven computation. In International Conference on Computer-Aided Design, pages
28-32, 1997.
[7] A. Karlin, M. Manasse, L. McGeoch, S. Owicki. Competitive randomized algorithms for
non-uniform problems. Algorithmica, 11(6):542-571, June 1994.
[8] Y.-H. Lu, G. De Micheli. Comparing system-level power management policies. IEEE
Design & Test of Computers, 18(2):10-19, March 2001.
[9] Y.-H. Lu, L. Benini, G. De Micheli. Operating-system directed power reduction. In
International Symposium on Low Power Electronics and Design, Stanford University, pages
37-42, July 2000.
[10] Y.-H. Lu, T. Šimunić, G. De Micheli. Software controlled power management. In
International Workshop on Hardware/Software Codesign, pages 157-161, 1999.
[11] Y.-H. Lu, L. Benini, G. De Micheli. Low-power task scheduling for multiple devices. In
International Workshop on Hardware/Software Codesign, pages 39-43, 2000.
[12] Y.-H. Lu, E.-Y. Chung, T. Šimunić, L. Benini, G. De Micheli. Quantitative comparison
of power management algorithms. In Design Automation and Test in Europe, 2000.
[13] G. A. Paleologo, L. Benini, A. Bogliolo, G. De Micheli. Policy optimization for dynamic
power management.
[14] Q. Qiu, M. Pedram. Dynamic power management based on continuous-time Markov
decision processes. In Design Automation Conference, pages 555-561, 1999.