Slide 1
Managing Energy and
Server Resources in Hosting
Centers
Jeff Chase, Darrell Anderson, Ron Doyle,
Prachi Thakar, Amin Vahdat
Duke University
Slide 2
Back to the Future
Return to server-centered computing: applications run as
services accessed through the Internet.
– Web-based services, ASPs, “netsourcing”
Internet services are hosted on server clusters.
– Incrementally scalable, etc.
Server clusters may be managed by a third party.
– Shared data center or hosting center
– Hosting utility offers economies of scale:
• Network access
• Power and cooling
• Administration and security
• Surge capacity
Slide 3
Managing Energy and Server Resources
Key idea: a hosting center OS maintains the
balance of requests and responses, energy inputs,
and thermal outputs.
1. Adaptively provision server resources to match request load.
2. Provision server resources for energy efficiency.
3. Degrade service on power/cooling failures.
– Power/cooling “browndown”
– Dynamic thermal management [Brooks]
[Figure: requests flow in and responses flow out; energy flows in and waste heat flows out. Annotation: US in 2003: 22 TWh ($1B - $2B+).]
Slide 4
Contributions
Architecture/prototype for adaptive provisioning of server
resources in Internet server clusters (Muse)
– Software feedback
– Reconfigurable request redirection
– Addresses a key challenge for hosting automation
Foundation for energy management in hosting centers
– 25% - 75% energy savings
– Degrade rationally (“gracefully”) under constraint (e.g., browndown)
Simple “economic” resource allocation
– Continuous utility functions: customers “pay” for performance.
– Balance service quality and resource usage.
Slide 5
Static Provisioning
Dedicate fixed resources per customer
Typical of “co-lo” or dedicated hosting
Reprovision manually as needed
Overprovision for surges
– High variable cost of capacity
How to automate resource provisioning
for managed hosting?
Slide 6
Load Is Dynamic
[Figure: throughput (requests/s) over one week, ibm.com external site]
• February 2001
• Daily fluctuations (3x)
• Workday cycle (Monday-Friday)
• Weekends off
[Figure: throughput (requests/s) over two months, World Cup soccer site]
• May-June 1998
• Seasonal fluctuations
• Event surges (11x)
• Trace available at ita.ee.lbl.gov
Slide 7
Adaptive Provisioning
- Efficient resource usage
- Load multiplexing
- Surge protection
- Online capacity planning
- Dynamic resource recruitment
- Balance service quality with cost
- Service Level Agreements (SLAs)
Slide 8
Utilization Targets
μi = allocated server resource for service i
ρi = utilization of μi at i’s current load λi
ρtarget = configurable target level for ρi
Leave headroom for load spikes.
ρi > ρtarget : service i is underprovisioned
ρi < ρtarget : service i is overprovisioned
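The test above can be sketched in a few lines of Python. This is a minimal sketch; the function name and the 0.5 target value are illustrative, not taken from the Muse implementation:

```python
RHO_TARGET = 0.5  # configurable utilization target; leaves headroom for spikes

def provisioning_state(mu: float, demand: float) -> str:
    """mu = server resource allotted to a service; demand = resource the
    service actually consumes at its current load, so rho = demand / mu."""
    rho = demand / mu
    if rho > RHO_TARGET:
        return "underprovisioned"   # grow the allotment
    if rho < RHO_TARGET:
        return "overprovisioned"    # surplus may be reclaimed
    return "on target"
```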
Slide 9
Muse Architecture
[Figure: the executive issues configuration commands based on performance measures; offered request load flows through reconfigurable switches to a server pool (stateless, interchangeable servers) backed by a storage tier.]
The executive controls the mapping of service traffic to server resources by means of:
• reconfigurable switches
• scheduler controls (shares)
Slide 10
Server Power Draw
866 MHz P-III SuperMicro 370-DER (FreeBSD), measured with a Brand Electronics 21-1850 digital power meter:
• boot: 136 W
• CPU max: 120 W
• CPU idle: 93 W
• disk spin: 6-10 W
• off/hibernate: 2-3 W
Idling consumes 60% to 70% of peak power demand.
Slide 11
Energy vs. Service Quality
[Figure: the same load served by active set {A,B,C,D} at ρi < ρtarget, versus concentrated on active set {A,B} at ρi = ρtarget.]
Spreading load over {A,B,C,D} gives low latency; concentrating it on {A,B}:
• Meets quality goals
• Saves energy
Slide 12
Energy-Conscious Provisioning
Light load: concentrate traffic on a minimal set of servers.
– Step down surplus servers to a low-power state.
• APM and ACPI
– Activate surplus servers on demand.
• Wake-On-LAN
Browndown: can provision for a specified energy target.
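On-demand activation relies on the standard Wake-On-LAN “magic packet”: six 0xFF bytes followed by the sleeping server’s MAC address repeated sixteen times, broadcast over UDP. A minimal sketch; the function names and the port choice are illustrative, and the Muse prototype’s actual wake-up path is not detailed on the slides:

```python
import socket

def magic_packet(mac: str) -> bytes:
    """Build the Wake-On-LAN payload: 6 x 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake_server(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet; UDP port 9 (discard) is a common choice."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))
```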
Slide 13
Resource Economy
Input: the “value” of performance for each customer i.
– Common unit of value: “money”.
– Derives from the economic value of the service.
– Enables SLAs to represent flexible quality vs. cost tradeoffs.
Per-customer utility function Ui = bidi − penaltyi.
– Bid for traffic volume (throughput λi).
– Bid for better service quality, or subtract penalty for poor quality.
Allocate resources to maximize expected global utility
(“revenue” or reward).
– Predict performance effects.
– “Sell” to the highest bidder.
– Never sell resources below cost.
Maximize Σi bidi(λi(t, μi))
subject to Σi μi ≤ μmax
Slide 14
Maximizing Revenue
Consider any customer i with allotment μi at fixed time t.
– The marginal utility (pricei) for a resource unit allotted to or
reclaimed from i is the gradient of Ui at μi.
Adjust allotments until price equilibrium is reached.
The algorithm assumes that Ui is “concave”: the price gradients
are non-negative and monotonically non-increasing.
[Figure: expected utility Ui(t, μi) as a concave curve over resource allotment μi; pricei is its slope.]
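With concave utilities, the equilibrium can be reached greedily: repeatedly sell the next resource unit to the customer whose price gradient is highest, stopping when no bid exceeds cost. A toy sketch under that assumption; the paper’s incremental algorithm is more involved, and all names here are illustrative:

```python
def allocate(price_fns, total_units, cost=0.0):
    """price_fns[i](mu) gives customer i's marginal utility ('price') for
    its next unit, given a current allotment of mu units.  With concave
    utilities these prices are non-increasing, so auctioning one unit at
    a time to the highest bidder reaches the price equilibrium."""
    mu = [0] * len(price_fns)
    for _ in range(total_units):
        bids = [f(m) for f, m in zip(price_fns, mu)]
        best = max(range(len(bids)), key=bids.__getitem__)
        if bids[best] <= cost:
            break  # never sell resources below cost
        mu[best] += 1
    return mu
```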
Slide 15
Feedback and Stability
Allocation planning is incremental.
– Adjust the solution from the previous interval to react to new
observations.
Allow system to stabilize before next re-evaluation.
– Set adjustment interval and magnitude to avoid oscillation.
– Control theory applies. [Abdelzaher, Shin et al, 2001]
Filter the load observations to distinguish transient and
persistent load changes.
– Internet service workloads are extremely bursty.
– Filter must “balance stability and agility” [Kim and Noble 2001].
Slide 16
“Flop-Flip” Filter
EWMA-based filter alone is not sufficient.
– Average At for each interval t: At = α·At−1 + (1 − α)·Ot
– The gain α may be variable or flip-flop.
Load estimate Et = Et−1 if |Et−1 − At| < tolerance,
else Et = At
Stable under transient bursts, yet responsive to persistent shifts.
[Figure: utilization (%, 0-100) over time (s, 0-1200) comparing raw data, EWMA (α = 7/8), and the flop-flip filter.]
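The filter translates directly to code. This sketch assumes a fixed gain α = 7/8 and an absolute-difference tolerance; the variable-gain variant mentioned on the slide is not shown:

```python
ALPHA = 7.0 / 8.0  # EWMA gain from the slide's comparison plot

def flop_flip(observations, tolerance):
    """Smooth observations with an EWMA A_t = ALPHA*A_{t-1} + (1-ALPHA)*O_t,
    but hold the load estimate E_t steady while A_t stays within `tolerance`
    of it: stable under transient bursts, agile on persistent shifts."""
    avg = est = observations[0]
    estimates = []
    for obs in observations:
        avg = ALPHA * avg + (1.0 - ALPHA) * obs
        if abs(est - avg) >= tolerance:
            est = avg  # persistent change: flip to the new average
        estimates.append(est)
    return estimates
```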
Slide 17
IBM Trace Run (Before)
[Figure: throughput (requests/s, 0-2500), power draw (watts, 0-350), and latency (ms ×50; the 1 ms level is marked) over time (minutes, 0-620), without adaptive provisioning.]
Slide 18
IBM Trace Run (After)
[Figure: throughput (requests/s, 0-2500), power draw (watts, 0-350), and latency (ms ×50; the 1 ms level is marked) over time (minutes, 0-620), with adaptive provisioning enabled.]
Slide 19
Evaluating Energy Savings
Trace replay shows adaptive provisioning in action.
Server energy savings in this experiment were 29%.
– 5-node cluster, 3x load swings, ρtarget = 0.5
– Expect roughly comparable savings in cooling costs.
• Ventilation costs are fixed; chiller costs are proportional to
thermal load.
For a given load-curve “shape”, achievable energy savings
increase with cluster size:
– e.g., higher request volumes, or a lower ρtarget for better service quality.
– Larger clusters give finer granularity to match load closely.
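A back-of-envelope model illustrates why the savings arise. This sketch is hypothetical: the load shape, per-server capacity, and all names are invented, and only the power levels are taken from the earlier measurements (roughly 120 W active, 2-3 W hibernating):

```python
import math

P_ACTIVE, P_SLEEP = 120.0, 2.5        # watts per server (measured slide)
CLUSTER = 5                           # servers, as in the 29% experiment
CAP, RHO_TARGET = 400.0, 0.5          # req/s per server (invented), target

def cluster_power(load, adaptive):
    """Cluster power draw serving `load` req/s; with adaptive provisioning,
    surplus servers are stepped down to the low-power state."""
    if adaptive:
        active = min(CLUSTER, max(1, math.ceil(load / (CAP * RHO_TARGET))))
    else:
        active = CLUSTER              # static provisioning: everything on
    return active * P_ACTIVE + (CLUSTER - active) * P_SLEEP

# A crude 3x daily swing: 16 off-peak hours, 8 peak hours.
trace = [300.0] * 16 + [900.0] * 8
static = sum(cluster_power(l, adaptive=False) for l in trace)
dynamic = sum(cluster_power(l, adaptive=True) for l in trace)
savings = 1.0 - dynamic / static      # about 0.39 for these made-up numbers
```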
Slide 20
Expected Resource Savings
[Figure: expected resource savings (%, 0-80) vs. maximum cluster size (0-16 servers) for the World Cup trace (full two months, month 2, week 8) and the IBM trace (one week).]
Slide 21
Conclusions
Dynamic request redirection enables fine-grained,
continuous control over mapping of workload to physical
server resources in hosting centers.
Continuous monitoring and control allows a hosting center
OS to provision resources adaptively.
Adaptive resource provisioning is central to energy and
thermal management in data centers.
– Adapt to energy “browndown” by degrading service quality.
– Adapt to load swings for 25% - 75% energy savings.
Economic policy framework guides provisioning choices
based on SLAs and cost/benefit tradeoffs.
Slide 22
Future Work
multiple resources (e.g., memory and storage)
multi-tier services and multiple server pools
reservations and latency QoS penalties
rational server allocation and request distribution
integration with thermal system in data center
flexibility and power of utility functions
server networks and overlays
performability and availability SLAs
application feedback
Slide 23
Muse Prototype and Testbed
[Figure: a client cluster of SURGE/trace load generators drives redirectors (PowerEdge 1550) over an Extreme GigE switch; a LinkSys 100 Mb/s switch connects the server pool, instrumented with a digital power meter.]
• Faithful trace replay plus synthetic Web loads (server CPU-bound)
• FreeBSD-based redirectors
• Resource containers
• APM and Wake-on-LAN
Slide 24
Throughput and Latency
[Figure, top panel: CPU allocation and usage (%, 0-100) over time (s, 0-180). Throughput λi increases linearly with allotment μi until saturation (ρi > ρtarget); when overprovisioned (ρi < ρtarget), the system may reclaim μi(ρtarget − ρi).]
[Figure, bottom panel: throughput (requests/s, 0-600) and latency (ms, 0-100) over time (s, 0-180).]
Average per-request service demand: μiρi / λi
Slide 25
An OS for a Hosting Center
Hosting centers are made up of heterogeneous
components linked by a network fabric.
– Components are specialized.
– Each component has its own OS.
The role of a hosting center OS is to:
– Manage shared resources (e.g., servers, energy)
– Configure and monitor component interactions
– Direct flow of request/response traffic
Slide 26
Allocation Under Constraint (0)
[Figure: throughput (requests/s, 0-1500) and allotment (servers, 0-3) over time (s, 0-1500).]
Slide 27
Allocation Under Constraint (1)
[Figure: throughput (requests/s, 0-1500) and allotment (servers, 0-3) over time (s, -100 to 1500).]
Slide 28
Outline
Adaptive server provisioning
Energy-conscious provisioning
Economic resource allocation
Stable load estimation
Experimental results