Managing Server Energy and Operational Costs

Chen, Das, Qin, Sivasubramaniam,
Wang, Gautam
(Penn State)
Sigmetrics 2005
Introduction
 Usual arguments
– Variable workloads
– High operational cost even at low workloads
– Power consumption a serious concern
 WORK NOT IN CONTEXT OF
VIRTUALIZATION
…Introduction
 New points:
– Server on/off and DVS have been used for
power management, BUT with no consideration of
the impact of on/off cycles on hardware reliability
– Authors claim no previous paper considered
SLA violations – only energy optimization
(maybe true in 2005!)
…Introduction
 Three new approaches for server
provisioning with both DVS and server
on/off, with response time SLAs considered,
and with cost of server on/off:
– First queuing model based
– Second feedback control theory based
– Third hybrid of above two
System model




 Cluster with identical servers (hosts)
 Any host can run any application
 Applications run on multiple hosts
 Assuming a web front end, each HTTP request
is routed to one of the web servers
– Service time is related to the request parameter
(e.g. file size)
 We will control the number of servers
allocated to a given application
…System model
 SLA used: average response time
 Goal: use the minimum energy required to
meet the SLA
 Two options for power management
– Server on/off: costs in terms of (1) time and
energy required to boot up (2) wear and tear of
components (esp disk)
– Dynamic voltage-frequency scaling of CPU
…System model
 Assume l frequency levels: f1, f2, …, fl
 Assume “centralized” DVS control:
– Single DVS setting for all hosts running one
application
 Assumption: Only one application on one
server
 Solution required for two steps: (1) how
many servers per app (2) Freq setting for
servers of each app
…Problem Formulation
 M identical servers
 N different applications
 Each application i is allocated mi servers at any
time
 Max/min frequency for each server:
– fmax, fmin
…Problem Formulation
(Power Model)
 Dynamic power consumption of a CPU
operating at frequency f is proportional to
V^2 * f
 Further, V is proportional to f
 Thus, the power model of a CPU operating at
frequency f is:
– P = Pfixed + Pf * f^3
 Energy consumption of the cluster at
frequency f over time t:
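The cubic term follows in one step from the two proportionalities above. For the cluster energy, one reading consistent with these definitions, assuming all M servers are on and share the same frequency f, is:

```latex
\begin{aligned}
P_{\text{dyn}} &\propto V^2 f,\quad V \propto f
  \;\Longrightarrow\; P_{\text{dyn}} \propto f^3 \\
P(f) &= P_{\text{fixed}} + P_f\, f^3 \\
E &= M \left(P_{\text{fixed}} + P_f\, f^3\right) t
\end{aligned}
```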
…Problem Formulation
 The per-application server counts mi and frequencies fi
are adjusted periodically, once per interval of length t
 Over Z such intervals of time t, energy
consumption is:


 Where mi(z) is the number of servers
allocated to application i during duration z
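Given these definitions, the total energy can be written as below. Here f_i(z) is assumed notation for the frequency of application i's servers during interval z, and the power term is the cubic model from the previous slide:

```latex
E \;=\; \sum_{z=1}^{Z} \sum_{i=1}^{N}
        m_i(z)\,\bigl(P_{\text{fixed}} + P_f\, f_i(z)^3\bigr)\, t
```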
…Problem Formulation
 K$ is the electricity cost per unit of energy
 Total cost:


 B0 is the cost per server turn-on cycle. Then
total cost:
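A numeric sketch of this cost, assuming it is K$ times the total energy plus B0 for every server turned on, where turn-ons are counted as positive increases in mi between consecutive intervals. All function and parameter names are hypothetical, and the treatment of the first interval is an assumption:

```python
def total_cost(m, f, t, p_fixed, p_f, k_dollar, b0):
    """Energy cost plus turn-on cost over all intervals.

    m[z][i], f[z][i]: server count and frequency of app i in interval z.
    """
    energy = 0.0
    turn_ons = 0
    prev = m[0]  # assumption: no turn-on is charged for the first interval
    for m_z, f_z in zip(m, f):
        for m_i, f_i, m_prev in zip(m_z, f_z, prev):
            # Cubic power model per server, times interval length t.
            energy += m_i * (p_fixed + p_f * f_i ** 3) * t
            # Only increases in the server count incur the B0 charge.
            turn_ons += max(0, m_i - m_prev)
        prev = m_z
    return k_dollar * energy + b0 * turn_ons
```

Turning servers off is free in this sketch; only the later turn-on is charged, which is what makes frequent on/off cycling expensive.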

…Problem formulation
 B0 itself has two components: power
required to turn on, and “MTBF” impact cost
due to the turn on
 Pmax is power consumed at max frequency,
Treboot is time taken for rebooting, Cr is MTBF
cost
 (MTBF = Mean Time Between Failures)
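Putting the two components together, and pricing the boot energy at the electricity rate K$ (an assumption about how the power term is converted to cost), B0 would read:

```latex
B_0 \;=\; K_{\$}\, P_{\max}\, T_{\text{reboot}} \;+\; C_r
```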
…Problem Formulation
 Objective Function to Minimize, under
constraints
…Constraints
Solutions.
First: Queuing Theoretic
 Assumptions:
 Number of servers is managed every time T
 Frequency level per application controlled
every time t
 T is an integer multiple of t, T= S*t
 Optimization period: U intervals of time T
– Each with S intervals of time t
Prediction
 Need the following parameters for each
application:
– Mean arrival rate (λ)
– Squared coefficient of variation of interarrival times, Ca^2
– Squared coefficient of variation of service times, Cs^2
– Mean file size in bytes: f
 S-ARMA model used for prediction of arrival
parameters (seasonal autoregressive moving
average)
 Winters’ smoothing method for file size
parameters
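As an illustration of what these parameters are, the sketch below estimates them from observed samples. It covers only the measurement side, not the S-ARMA or Winters forecasting step, and the function names are hypothetical:

```python
from statistics import mean, pvariance


def squared_cv(samples):
    """Squared coefficient of variation: variance / mean^2."""
    mu = mean(samples)
    # Population variance is used here; this is a choice, not from the slides.
    return pvariance(samples, mu) / (mu * mu)


def arrival_stats(interarrival_times):
    """Return (mean arrival rate, Ca^2) from observed interarrival times."""
    return 1.0 / mean(interarrival_times), squared_cv(interarrival_times)
```

Applying `squared_cv` to observed service times gives Cs^2 in the same way. A value of 0 means deterministic times, and a value of 1 matches the exponential case.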
Queueing Analysis
 Model the application on each server as a
G/G/1 queue. Use Bolch approximation
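The formula from the slide is not reproduced here. As a stand-in, the sketch below uses the Allen-Cunneen approximation, a standard G/G/1 formula that takes the same inputs (arrival rate, service rate, Ca^2, Cs^2); it may differ from the expression the authors use. The function name is hypothetical:

```python
def gg1_response_time(lam, mu, ca2, cs2):
    """Approximate mean response time of a G/G/1 queue (Allen-Cunneen).

    lam: arrival rate, mu: service rate,
    ca2 / cs2: squared coefficients of variation of
    interarrival and service times.
    """
    rho = lam / mu  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable: need lam < mu")
    # Mean waiting time in queue, then add the mean service time.
    wait = (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) / mu
    return 1.0 / mu + wait
```

With Ca^2 = Cs^2 = 1 this reduces to the exact M/M/1 result 1 / (mu - lam), which is a quick sanity check on any such approximation.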




Optimization Problem
Heuristic Solution
 First assume t=T and consider only one interval at
a time. Then objective is to minimize




 Start the solution for interval u by finding, for all i, the
minimum number of servers mi(u) required to satisfy the
constraint Wi <= Wi,SLA at the highest frequency
 Do this for each interval
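This first step can be sketched as a linear search over the server count. The sketch makes several assumptions that are not from the slides: arrivals are split evenly across the m servers, Ca^2 is unchanged by the split, the service rate at the highest frequency is given directly, and the Allen-Cunneen G/G/1 approximation stands in for the response time formula. All names are hypothetical:

```python
def mean_response(lam, mu, ca2, cs2):
    """Allen-Cunneen G/G/1 mean response time; infinite if unstable."""
    rho = lam / mu
    if rho >= 1.0:
        return float("inf")
    return 1.0 / mu + (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) / mu


def min_servers(lam, ca2, cs2, mu_at_fmax, w_sla, max_servers):
    """Smallest m whose per-server response time meets the SLA at f_max.

    Returns None if even max_servers servers cannot meet w_sla.
    """
    for m in range(1, max_servers + 1):
        # Each server sees 1/m of the application's arrival rate.
        if mean_response(lam / m, mu_at_fmax, ca2, cs2) <= w_sla:
            return m
    return None
```

Response time falls as m grows under these assumptions, so the first m that satisfies the constraint is the minimum.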
…Heuristic Solution
 Then try to reduce fi(u) and increase mi(u), if
that reduces cost.
 Select applications in decreasing order of
fi(u)
– i.e. select the app with the highest frequency first
…Heuristic Solution
 Now consider all intervals (still assume t=T)
 Start with “upper bound” solution of previous
round (upper bound because intervals were
considered separately)
…Heuristic solution
 Again consider one interval at a time (from
1st to last), but now evaluate the total objective function
 In each interval, start from the apps with the highest
frequency and go in decreasing order
…Heuristic solution
 For each app, compare the number of servers in this
interval with the previous one, and “level off” the number of
servers if the solution improves
– Search greedily among server counts closer to the
number in the previous interval
– Then, tune frequencies to get a feasible solution
…Heuristic solution
 So far, t = T.
 For cases where T > t, use average arrival and
file size estimates for interval T and use the above
solution
 Then given the mi(u) from this step, tune fi(u,s) for
each small interval s
Control Theoretic Approach
 Set up a feedback control formulation that finds an
“aggregate frequency” for each app i
– Objective: to meet response time SLA
 Solve the problem of allocating the optimal number of
servers that achieve this aggregate frequency
– Objective: minimize power cost
…Control theoretic formulation
 Fi(u,s) = mi(u) * fi(u,s): aggregate frequency
– Indexed as Fi(k), where k = 1, 2, …, U*S
 Implicit assumption: response time from
multiple servers with total capacity F is the
same as that from single server with capacity F
 Objective function:
RF and RW,i are relative weights
LQR method
 Modify decision variables to formulate problem as
well-known Linear Quadratic Regulator:
…Control theoretic approach
is the “control gain” calculated by
using standard LQR methods
 After above, use Integral Controller
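A minimal sketch of a discrete-time integral controller for the aggregate frequency, to show the kind of update involved. It is a generic textbook form, not the authors' exact controller: the gain is taken as given (for instance from the LQR step), and the clamping bounds and all names are hypothetical:

```python
def integral_step(f_agg, w_measured, w_sla, gain, f_lo, f_hi):
    """One integral-control update of the aggregate frequency.

    The correction accumulates in f_agg itself: each step adds
    gain * (measured response time - SLA target).
    """
    error = w_measured - w_sla  # positive when the SLA is being missed
    f_next = f_agg + gain * error
    # Keep the result within the achievable aggregate frequency range.
    return min(f_hi, max(f_lo, f_next))
```

A response time above the target raises the aggregate frequency and one below it lowers the frequency, so capacity is released when the SLA is met with room to spare.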

Server Allocation
 Given Fi(u,s) we need to determine mi(u)
 Define m(u) = sum over i of mi(u)
– Total number of servers required for all apps
 Define F(u) = sum over i of Fi(u)
– Total frequency required for all apps
 Algorithm allocates mi(u) in proportion to its
aggregate frequency
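A sketch of the proportional split, mi = m * Fi / F. The slides do not say how fractional shares are turned into whole servers; the largest-remainder rounding below is an assumption chosen so that the counts still add up to m, and the function name is hypothetical:

```python
import math


def allocate_proportionally(m_total, agg_freqs):
    """Split m_total servers across apps in proportion to agg_freqs."""
    f_total = sum(agg_freqs)
    shares = [m_total * f / f_total for f in agg_freqs]
    alloc = [math.floor(s) for s in shares]
    leftover = m_total - sum(alloc)
    # Hand the remaining servers to the largest fractional parts.
    by_fraction = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - alloc[i],
        reverse=True,
    )
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc
```

For example, 10 servers split over aggregate frequencies 3, 1 and 1 gives 6, 2 and 2 servers.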
…server allocation
•Server frequency min,max is :
•Therefore, number of servers is bounded between:
where
and
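One way to see where such a bound comes from, using the definition Fi(u,s) = mi(u) * fi(u,s) from the control formulation. The exact expressions on the slide may differ, for example by summing over applications or rounding to integers:

```latex
f_{\min} \;\le\; \frac{F_i}{m_i} \;\le\; f_{\max}
\quad\Longrightarrow\quad
\frac{F_i}{f_{\max}} \;\le\; m_i \;\le\; \frac{F_i}{f_{\min}}
```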
Online server allocation algorithm
•Start with Equation 11, ignore the server on/off costs
•The remaining expression quantifies the tradeoff between
the cost of a new server vs cost of higher frequency
•Differentiate this w.r.t. m(u) and find the minimum to get the number of
servers that minimize the cost
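A worked version of this step under two assumptions: the energy term has the cubic form from the power model, and proportional allocation gives every application the same per-server frequency F(u)/m(u). Constant factors (K$ and the interval length) are dropped because they do not move the minimum:

```latex
\begin{aligned}
C(m) &= m\left(P_{\text{fixed}} + P_f \left(\tfrac{F}{m}\right)^{3}\right)
      = m\,P_{\text{fixed}} + \frac{P_f F^{3}}{m^{2}} \\
\frac{dC}{dm} &= P_{\text{fixed}} - \frac{2\,P_f F^{3}}{m^{3}} = 0
  \;\Longrightarrow\;
  m^{*} = F \left(\frac{2\,P_f}{P_{\text{fixed}}}\right)^{1/3} \\
\frac{d^{2}C}{dm^{2}} &= \frac{6\,P_f F^{3}}{m^{4}} > 0
  \quad\text{(so } m^{*} \text{ is a minimum)}
\end{aligned}
```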
… Online server allocation algorithm
 This m*(u) is a minimum: adding even
one more server increases the cost,
even before counting the server turn-on
cost (if any)
 Now consider two cases:
– m(u-1) is greater than m*(u) or
– Else turn on additional servers with the following
algo…
Online server allocation algo
…online server allocation
where D denotes rounding off to the available discrete frequency levels
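A small sketch of mapping a continuous frequency onto the discrete levels. The slide does not state the rounding direction; rounding up is assumed here so that the chosen level never provides less capacity than requested, and the function name is hypothetical:

```python
from bisect import bisect_left


def round_up_to_level(freq, levels):
    """Smallest available level >= freq, capped at the highest level."""
    ordered = sorted(levels)
    idx = bisect_left(ordered, freq)
    # Requests above the top level are clamped to the top level.
    return ordered[min(idx, len(ordered) - 1)]
```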