Server-level Power Control

IBM Research and U. Tennessee, Knoxville
Charles Lefurgy, Xiaorui Wang, Malcolm Ware
The 4th IEEE International Conference on Autonomic Computing
June 12, 2007
© 2007 IBM Corporation
IBM Research
Server power supplies
Label power: 308 W
Datacenter must wire to label power
P4Max: 273 W
Label power is 50 W above real workloads
However, real workloads do not use that much power
[Figure: bar chart of server power (W), 0 to 350 W, for idle; the SPECCPU benchmarks gzip, vpr, gcc, mcf, crafty, parser, eon, perlbmk, gap, vortex, bzip2, twolf, wupwise, swim, mgrid, applu, mesa, galgel, art, equake, facerec, ammp, lucas, fma3d, sixtrack, and apsi, each run with 1 and 2 threads; SPECJBB; LINPACK; P4Max; and label power.]
IBM Research
The problem
Server power consumption is not well controlled.
– System variance (workload, configuration, process, etc.)
– Design for worst-case power
Results:
– Power supplies are significantly over-provisioned
– Therefore, datacenters provision for power that cannot be used
– High cost, with no benefit in most environments
Our approach
Use “better-than-worst-case” design
– Example: Intel’s Thermal Design Power (TDP)
– Power, like temperature, can be controlled
Reduce design-time power requirements
– Run real workloads at full performance
– Use smaller, cost-effective power supplies
Enforce run-time power constraint with feedback control
– Slow system when running power virus
Our contributions
Control of peak server-level power (to within 0.5 W of the target in 1 second)
Derivation and analysis [see paper]
– Guaranteed accuracy and stability
Verified on real hardware
Better application performance than previous methods
Caveats
Our prototype is a blade server
– The results of the study also apply to rack-mount servers.
Power controller uses clock throttling, not dynamic voltage and frequency scaling (DVFS)
– At the time of the study, only clock throttling was available on our prototype system.
– DVFS is not available on all processors (lower speed grades)
– Recently, we have built a prototype using DVFS
Rest of the talk
Power measurement
Power control
– Open loop controller
– Ad-hoc controller
– Proportional controller
Experimental results
Conclusions
Power measurement
Measure 12V bulk power
0.1 W precision, 2% error
HS20 8843 (Intel Xeon blade)
Controller firmware on service processor (Renesas H8 2168)
[Figure: blade with the measurement/calibration circuit and sense resistors labeled.]
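As a rough illustration of how a sense-resistor measurement can be turned into a power reading, the sketch below applies Ohm's law to the voltage drop across a sense resistor on the 12 V rail. The resistor value, the sample voltages, and the function names are assumptions made for this example; the slides do not describe the circuit at this level of detail. Only the 0.1 W precision and 2% error figures come from the slide.

```python
def power_from_sense(bus_volts, sense_volts, sense_ohms):
    """Estimate power (W) from the voltage drop across a sense resistor.

    Assumed model: current = sense_volts / sense_ohms,
    power = bus_volts * current. The result is rounded to 0.1 W,
    the precision quoted on the slide.
    """
    current_amps = sense_volts / sense_ohms
    return round(bus_volts * current_amps, 1)


def error_band(reading_w, relative_error=0.02):
    """Return the (low, high) range implied by a 2% measurement error."""
    return reading_w * (1 - relative_error), reading_w * (1 + relative_error)


# Hypothetical sample: 12 V rail, 1 milliohm resistor, 17.6 mV drop.
reading = power_from_sense(12.0, 0.0176, 0.001)
low, high = error_band(reading)
print(reading, round(low, 1), round(high, 1))
```

A controller that must never exceed its budget would have to allow for the upper end of that band when choosing its set point.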
Options for power control
Open-loop
– No measurement of power
– Chooses a fixed speed for a given power budget
– Based on the most power-hungry workload
Ad-hoc
– Measures power and compares it to the power budget
– +1/-1 adjustments to the processor clock throttle register
Proportional Controller (“P control”)
– Designed using control theory
– Guaranteed controller performance
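The ad-hoc option described above can be sketched in a few lines. This is a minimal illustration, not the firmware used in the study: the register encoding (an integer from 1 to 8, where n means n × 12.5% of full speed) and the function names are hypothetical.

```python
MIN_THROTTLE = 1   # hypothetical encoding: 1 -> 12.5% of full speed
MAX_THROTTLE = 8   # hypothetical encoding: 8 -> 100% of full speed


def ad_hoc_step(throttle, measured_w, budget_w):
    """One ad-hoc control step: nudge the throttle register by +1 or -1.

    Over budget -> slow down by one setting.
    Under budget -> speed up by one setting.
    The result is clamped to the valid register range.
    """
    if measured_w > budget_w:
        throttle -= 1
    elif measured_w < budget_w:
        throttle += 1
    return max(MIN_THROTTLE, min(MAX_THROTTLE, throttle))


def setting_percent(throttle):
    """Convert the hypothetical register value to a speed percentage."""
    return throttle * 12.5
```

Because each step moves by a whole 12.5% setting regardless of how large the error is, a controller of this kind reacts slowly to large errors and can hover around the budget instead of settling on it.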
Open loop design
P4MAX workload used as basis for open-loop controller
Graph shows the maximum 1-second power for each workload
[Figure: power (W, 130 to 270) versus processor performance setting (effective frequency, 100.0% down to 12.5% in steps of 12.5%) for P4MAX, LINPACK, SPECJBB 2005, SPECCPU 2-threads, SPECCPU 1-thread, and Idle.]
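An open-loop controller of this kind amounts to a table lookup: pick the fastest setting whose worst-case power still fits the budget. The sketch below shows the idea. The wattages in the table are hypothetical values invented for illustration, not measurements from the talk; they were chosen only so that the lookup reproduces the open-loop settings listed in the comparison table later in the talk.

```python
# Hypothetical worst-case (P4MAX-like) power in watts at each performance
# setting. Illustrative values only, not measurements from the talk.
WORST_CASE_POWER_W = {
    100.0: 273.0, 87.5: 260.0, 75.0: 245.0, 62.5: 228.0,
    50.0: 215.0, 37.5: 205.0, 25.0: 195.0, 12.5: 185.0,
}


def open_loop_setting(budget_w, table=WORST_CASE_POWER_W):
    """Pick the fastest setting whose worst-case power fits the budget."""
    fitting = [setting for setting, power in table.items() if power <= budget_w]
    if not fitting:
        raise ValueError("no setting fits within the budget")
    return max(fitting)


for budget in (250, 240, 230, 220, 210):
    print(budget, open_loop_setting(budget))
```

The setting is fixed by the most power-hungry workload, so lighter workloads are slowed even when they would have stayed under the budget at full speed.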
Proportional controller design
Settle to within 0.5 W of desired power in 1 second
– Based on BladeCenter power supply requirements
Every 64 ms
– Compare power to target power
– Use proportional controller to select desired processor speed
• 12.5% - 100% in units of 0.1%
Clock throttling
– Intel processor: 8 settings in units of 12.5% (12.5% - 100%)
– Use delta-sigma modulation to achieve finer resolution
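The two steps above, a proportional update every 64 ms followed by delta-sigma modulation onto the 8 hardware settings, might be sketched as follows. This is an illustration under stated assumptions, not the controller derived in the paper: the gain value, the incremental form of the update, and all names are hypothetical.

```python
STEP = 12.5                          # hardware throttle granularity (%)
MIN_SPEED, MAX_SPEED = 12.5, 100.0   # range of processor speeds (%)


def p_control(speed, measured_w, target_w, gain):
    """One proportional step, assumed to run once per 64 ms interval.

    Moves the desired speed by gain * (power error), clamps it to the
    valid range, and rounds to the 0.1% units used by the controller.
    The gain (percent of speed per watt) is a hypothetical parameter.
    """
    desired = speed + gain * (target_w - measured_w)
    desired = min(MAX_SPEED, max(MIN_SPEED, desired))
    return round(desired, 1)


class DeltaSigma:
    """First-order delta-sigma modulator onto the 12.5% hardware steps.

    Each call emits one hardware setting and carries the quantization
    error into the next call, so the average of the emitted settings
    tracks the finer-grained desired speed.
    """

    def __init__(self):
        self.residual = 0.0

    def next_setting(self, desired):
        wanted = desired + self.residual
        level = round(wanted / STEP) * STEP
        level = min(MAX_SPEED, max(MIN_SPEED, level))
        self.residual = wanted - level
        return level


modulator = DeltaSigma()
settings = [modulator.next_setting(65.8) for _ in range(1000)]
print(sorted(set(settings)), sum(settings) / len(settings))
```

For a desired speed of 65.8%, the modulator alternates between the two neighboring hardware settings in a ratio that makes the long-run average approach 65.8%.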
Why not use ad-hoc control?
Set point = 211.0 W
Ad-hoc
– Settles to 216.0 W (CPU speed: 68.8%)
– 5 W violation
P controller
– Settles to 211.0 W (CPU speed: 65.8%)
– No violation
[Figure: for each controller, server power (W, 190 to 250; traces labeled "power 1ms", "power 64ms", and "power 1s") and performance state (40% to 100%) versus time (38 to 42 s), with overshoot marked.]
Steady-state error
P controller has no steady-state error (measured power equals the set point, y=x)
Ad-hoc controller has steady-state error
– Add a safety margin of 6.1 W to ad-hoc (set point lowered by 6.1 W)
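To see why a loop that adjusts speed in proportion to the power error can settle exactly on the set point, the toy simulation below runs such a loop against a linear power model. Everything here is an assumption made for illustration: the linear plant (power = base + slope × speed), its constants, and the incremental update rule. The actual derivation and guarantees are in the paper.

```python
def simulate_p_loop(target_w, true_slope, assumed_slope,
                    base_w=140.0, steps=16):
    """Simulate a proportional power loop on a hypothetical linear plant.

    true_slope: watts per percent of speed in the simulated server.
    assumed_slope: the controller's estimate of that slope.
    Returns the power observed at each control interval.
    """
    speed = 100.0
    history = []
    for _ in range(steps):
        power = base_w + true_slope * speed      # hypothetical plant model
        history.append(power)
        speed += (target_w - power) / assumed_slope
        speed = min(100.0, max(12.5, speed))
    return history


exact = simulate_p_loop(211.0, true_slope=1.1, assumed_slope=1.1)
mismatched = simulate_p_loop(211.0, true_slope=1.1, assumed_slope=0.8)
print(round(exact[-1], 3), round(mismatched[-1], 3))
```

In this model the error shrinks by a factor of |1 − true_slope / assumed_slope| per interval, so the loop converges to the set point whenever that ratio lies between 0 and 2, even if the controller's estimate of the slope is wrong. Sixteen intervals of 64 ms is about one second.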
[Figure: average power measured over 66 seconds (W, 170 to 270) versus controller power setpoint (W, 180 to 260) for the p-controller, ad-hoc, and ad-hoc with safety margin. Points above y=x are violations of the power budget.]
Comparison of 3 controllers
Run each controller with 5 power budgets
Compare throughput of workloads
Table shows settings used for each controller
Power budget   Open-loop processor    Ad-hoc (with safety    P control
               performance setting    margin) set point      set point
250 W          75%                    238.9 W                245.0 W
240 W          62.5%                  229.1 W                235.2 W
230 W          62.5%                  219.3 W                225.4 W
220 W          50%                    209.5 W                215.6 W
210 W          37.5%                  199.7 W                205.8 W
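The set points in the table follow a simple pattern, which the sketch below reproduces. Each P control set point equals 98% of the power budget, and each ad-hoc set point is a further 6.1 W lower, matching the safety margin stated earlier. Reading the 2% reduction as an allowance for the 2% measurement error is an inference from the numbers, not something the slides state; the function names are hypothetical.

```python
AD_HOC_SAFETY_MARGIN_W = 6.1   # safety margin stated on the steady-state error slide
MEASUREMENT_ERROR = 0.02       # assumed link to the 2% measurement error


def p_control_set_point(budget_w):
    """P control set point: the budget reduced by 2%, in units of 0.1 W."""
    return round(budget_w * (1 - MEASUREMENT_ERROR), 1)


def ad_hoc_set_point(budget_w):
    """Ad-hoc set point: the P control set point minus the safety margin."""
    return round(p_control_set_point(budget_w) - AD_HOC_SAFETY_MARGIN_W, 1)


for budget in (250, 240, 230, 220, 210):
    print(budget, ad_hoc_set_point(budget), p_control_set_point(budget))
```

The 6.1 W gap is power the ad-hoc controller cannot use, which is one way to read the performance difference between the two closed-loop controllers.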
Application performance summary
P controller
– 31-82% higher performance than open-loop
– 1-17% higher performance than ad-hoc
• Quicker settling time
• Zero steady-state error
1 W = 1.1% performance!
[Figure: LINPACK application throughput (% of full performance, 0% to 100%) versus power budget (210 to 250 W) for the P controller, improved ad-hoc, and open-loop controllers; labeled data points at 99% and 92%.]
Power supply reduction
308 W: Label power of HS20 blade
260 W: Real workloads run at full performance
– A reduction of 15% in supply power.
Fit 15% more servers per circuit
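The arithmetic behind the reduction can be checked directly from the two wattages on the slide. The density ratio at the end is a back-of-envelope estimate under the simplifying assumption that a circuit's capacity is divided evenly by the per-server supply rating; the slide's own figure of 15% is the authors' and may reflect rounding or other provisioning details.

```python
LABEL_POWER_W = 308.0      # label power of the HS20 blade (from the slide)
REDUCED_SUPPLY_W = 260.0   # level at which real workloads run at full performance

# Fractional reduction in supply power.
reduction = (LABEL_POWER_W - REDUCED_SUPPLY_W) / LABEL_POWER_W

# Assumed model: servers per circuit scale inversely with per-server rating.
density_ratio = LABEL_POWER_W / REDUCED_SUPPLY_W

print(f"supply reduction: {reduction:.1%}")
print(f"servers per circuit ratio (assumed model): {density_ratio:.3f}")
```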
Conclusions
Power is a first-class resource that can be managed.
– Power is no longer the accidental result of component configuration, manufacturing variation, and workload.
Reduce power supply capacity, safely.
– Relax design-time constraints, enforce run-time constraints.
– Install more servers per rack.
Power control is a fundamental mechanism for power management in a power-constrained datacenter.
– Move power to critical workloads.