EP - Microarch.org

KnightShift: Scaling the Energy Proportionality Wall Through Server-Level Heterogeneity

ENERGY PROPORTIONALITY TRENDS
EP Trends Overview:
•  Dynamic Range improvements stalled at ~80%
•  Similarly, EP also stalled at ~80% (note that the few servers with EP > 80% are -LD servers)
•  A large PG exists at low utilization, even for high-EP servers
•  Peak energy efficiency has outpaced low-utilization energy efficiency
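To make the low-utilization gap concrete, here is a short worked example. It assumes a hypothetical server whose power curve is exactly linear between idle and peak (an illustration, not a measured SPECpower result). It uses the poster's definitions DR = (Power_peak − Power_idle) / Power_peak and PG_x% = (Power_actual@x% − Power_ideal@x%) / Power_peak, with the ideal power at utilization x taken as x · Power_peak.

```latex
\begin{align*}
P_{\mathrm{actual}}(x) &= P_{\mathrm{idle}} + (P_{\mathrm{peak}} - P_{\mathrm{idle}})\,x \\
PG_{x} &= \frac{P_{\mathrm{actual}}(x) - x\,P_{\mathrm{peak}}}{P_{\mathrm{peak}}}
        = \frac{P_{\mathrm{idle}}}{P_{\mathrm{peak}}}\,(1 - x)
        = (1 - DR)\,(1 - x) \\
DR = 0.8:\quad PG_{10\%} &= 0.2 \times 0.9 = 0.18, \qquad PG_{90\%} = 0.2 \times 0.1 = 0.02
\end{align*}
```

Under this assumption the server draws 0.28 of peak power at 10% utilization against an ideal 0.10, so even an 80% dynamic range leaves the gap concentrated at low load.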
Figure 3: Historical Trends for Dynamic Range, Energy Proportionality, Proportionality Gap, and Energy Efficiency (efficiency legend: Efficiency @ 100% Load, Efficiency @ 10% Load)

Scaling the Energy Proportionality Wall:
•  To improve EP in the future:
   •  Improve LD, target low-utilization PG
•  Previous server-level low-power modes are inactive modes
   •  Exploit idle periods → improve DR
•  Now need server-level active low-power modes
   •  Exploit low-utilization periods → improve LD and PG
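MEASURING ENERGY PROPORTIONALITY

Dynamic Range:
$$DR = \frac{Power_{peak} - Power_{idle}}{Power_{peak}}$$

Linear Deviation:
$$LD = \frac{Area_{actual}}{Area_{linear}} - 1$$

Energy Proportionality [1]:
$$EP = 1 - \frac{Area_{actual} - Area_{ideal}}{Area_{ideal}}$$

Proportionality Gap:
$$PG_{x\%} = \frac{Power_{actual@x\%} - Power_{ideal@x\%}}{Power_{peak}}$$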
Metrics Overview:
•  DR only accounts for peak and idle power usage
•  EP is a better indication of server power usage
•  Servers can be superlinear (+LD), linear (LD = 0), or sublinear (-LD)
•  EP is affected by both DR and LD
•  For a given DR, $EP_{+LD} < EP_{linear} < EP_{-LD}$, where $EP_{linear} = DR$
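The four metrics can be computed directly from a utilization-vs-power curve. The sketch below is a minimal illustration, not the authors' tooling. It assumes the curve is sampled at ascending utilization points from 0.0 to 1.0 and integrates with the trapezoidal rule. The function names and the example curves are hypothetical.

```python
from typing import Dict, Sequence


def _area(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def proportionality_metrics(util: Sequence[float],
                            power: Sequence[float]) -> Dict[str, float]:
    """Compute DR, LD and EP from a utilization-vs-power curve.

    util  : ascending utilization points, starting at 0.0 and ending at 1.0
    power : measured power at each utilization point (any consistent unit)
    """
    p_idle, p_peak = power[0], power[-1]
    area_actual = _area(util, power)
    # Linear reference: straight line from idle power to peak power.
    area_linear = _area(util, [p_idle + (p_peak - p_idle) * u for u in util])
    # Ideal reference: power exactly proportional to utilization.
    area_ideal = _area(util, [p_peak * u for u in util])
    return {
        "DR": (p_peak - p_idle) / p_peak,
        "LD": area_actual / area_linear - 1.0,
        "EP": 1.0 - (area_actual - area_ideal) / area_ideal,
    }


def proportionality_gap(util: Sequence[float],
                        power: Sequence[float], x: float) -> float:
    """PG at the sampled utilization point closest to x."""
    i = min(range(len(util)), key=lambda j: abs(util[j] - x))
    p_peak = power[-1]
    return (power[i] - util[i] * p_peak) / p_peak


if __name__ == "__main__":
    util = [i / 10 for i in range(11)]
    linear = [40.0 + 60.0 * u for u in util]         # hypothetical linear server
    sublinear = [40.0 + 60.0 * u * u for u in util]  # hypothetical -LD server
    print(proportionality_metrics(util, linear))
    print(proportionality_metrics(util, sublinear))
    print(proportionality_gap(util, linear, 0.1))
```

For the linear curve EP equals DR and LD is zero. The sublinear curve has negative LD and EP above DR, which is the ordering stated in the last bullet.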
Figure 2: Sublinear Energy Proportionality Curve (power vs. utilization with Actual, Linear, and Ideal curves; DR: 60%, LD: sublinear, EP: 74%, Ave. Power: 52.6W)
Figure 8: Average energy savings and EP/LD improvements to 291 SPECpower servers (panels: Energy and EP Improvements, LD Improvements; x-axis: Knight Capability)
EVALUATION
Prototype Evaluation
•  Xeon-based primary server + Atom-based Knight
•  Wikipedia-based real-world benchmark
Trace-based Evaluation
•  9-day institutional datacenter utilization traces
   •  Email, file server, student timeshare, video streaming
•  G/G/k simulation
   •  Time-varying arrival rate
   •  Operating-mode-dependent service rate and k (number of servers)
•  Validated against prototype implementation (2% error)
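The inputs to the G/G/k simulation can be sketched from the numbers quoted in the paper excerpt on this poster: a 1-second mean service time, 50% utilization mapping to 50 requests per second, and tail latencies of 249 ms (primary) and 323 ms (Knight). The sketch below covers only the input estimation, not the simulator itself. The function names are hypothetical, and the capacity of 100 requests per second is an assumption inferred from the 50% example rather than a stated parameter.

```python
from typing import List, Sequence


def estimate_arrival_rates(util_trace: Sequence[float],
                           capacity_rps: float = 100.0) -> List[float]:
    """Turn a per-second utilization trace (0.0 to 1.0) into arrival rates.

    capacity_rps is the request rate that corresponds to 100% utilization.
    The default of 100 is an assumption chosen so that 50% utilization maps
    to 50 requests/s, matching the example in the excerpt.
    """
    return [u * capacity_rps for u in util_trace]


def knight_service_time(primary_service_s: float,
                        primary_tail_ms: float,
                        knight_tail_ms: float) -> float:
    """Scale the primary server's mean service time by the tail-latency ratio."""
    return primary_service_s * (knight_tail_ms / primary_tail_ms)


if __name__ == "__main__":
    trace = [0.05, 0.10, 0.50, 0.80]  # hypothetical utilization samples
    print(estimate_arrival_rates(trace))
    # Tail latencies quoted in the excerpt: 249 ms primary, 323 ms Knight.
    print(knight_service_time(1.0, 249.0, 323.0))
```

The ratio 323 / 249 is about 1.3, which matches the service-time factor the excerpt applies to all workloads.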
Figure 1: Superlinear Energy Proportionality Curve (power vs. utilization with Actual, Linear, and Ideal curves; DR: 60%, LD: superlinear, EP: 53%, Ave. Power: 68.6W)
Trace        Energy Savings   95% Latency Impact
aludra       87.9%            40%
email        85.5%            37%
girtab       87.2%            49%
msg-mmp      -6.7%            7%
msg-mx       7.2%             254%
msg-store    34.5%            53%
nunki        67.7%            5989%
scf          77.5%            46%
wikibench    35.1%            21%
Table 1: Energy savings and latency impact of KnightShift on datacenter utilization traces
Table 4: Energy and latency impact wrt Baseline of a 15% Capable KnightShift system

KNIGHTSHIFT SERVER ARCHITECTURE
•  Introduces a server-level active low-power mode to exploit low-utilization periods
•  Fronts a high-power primary server with a low-power compute node, called the Knight
•  Knight capability = fraction of throughput compared to primary server

Figure 4: Ensemble- KnightShift Implementation (components shown: Knight Node and Server Node, each with Motherboard, CPU, Memory, Chipset, LAN, SATA, Power, and Disk; Simple Router)

Figure 5: KnightShift runtime coordination (power consumption over time; phases: Sleep, Wakeup, Sync, Awake)
•  Primary: Wakeup, send awake msg, wait for data sync, process requests
•  Primary: Flush memory state, send sleep msg, enter low power state
•  Knight: Begin processing requests
•  Knight: Flush memory and send sync msg

Figure 6: KnightShift-enhanced Energy Proportionality Curve w/ 50% capable Knight
Figure 7: Proportionality Gap w/ KnightShift. Low-util PG effectively closed
Figure 9: Latency and energy sensitivity analysis to Knight capability (panels: Norm. 95% Latency and Energy Savings vs. Knight Capability, 10% to 50%, one curve per trace)
Figure 10: Latency and energy sensitivity analysis to KnightShift transition time (panels: Norm. 95% Latency and Energy Savings vs. Wakeup Transition Time, 0 to 60 s, one curve per trace)

[...]ing low-power components (such as low-power mobile memory), therefore, most components can scale.

Modeling Power: Our power model is based on our prototype system to allow us to compare and validate KnightSim. Through online instrumentation, we collect the utilization vs. power data for both the Knight and primary server. We use this utilization-power data in our simulations; whenever a Knight is active at a given utilization we use the power consumption data collected from our prototype Knight. Similarly, whenever the primary server is operating at a given utilization, we use the power consumption collected from the primary server in our prototype. It is also possible to generalize the power model and use a linear power model validated in [11].

In order to capture the energy penalty of transitioning to/from Knight, we conservatively model the transition power as a constant power during the entire wakeup period equal to the peak transition power. We determined empirically that the peak transition power for the primary server is 167W.

Arrival Rate and Latency Estimation: Our datacenter traces only have CPU and I/O utilization per second without individual request information. By assuming a mean service time of 1 second for each request, we can estimate a time-varying arrival rate through our utilization trace. For example, 50% utilization would correspond to an arrival rate of 50 requests per second. Through the simulated queueing model, we can obtain a relative average and 95th percentile latency of a KnightShift system compared to a baseline system.

Modeling Single-threaded Performance: We vary the queueing model's service time to model the performance difference of the Knight and primary server. We cannot infer single-threaded performance directly from processor frequency because single-threaded performance is based on frequency and the underlying architecture. Instead, we compare the 95th percentile latency of the Knight and primary server and scale the service time accordingly. For example, our primary server has tail latency of 249ms while our Knight has tail latency of 323ms as shown in section 7.1.2. As we do not have direct access to the datacenter servers, nor can we replicate the proprietary applications on our Knight, we cannot collect response times for the primary server and Knight for each individual workload. Therefore, in our model, we assume that all workloads experience similar performance slowdown due to the Knight similar to WikiBench, where the service time is increased by a factor of 1.3 compared to baseline.

Simulator Validation: We validated our trace-based emulation by collecting utilization traces from our WikiBench run and replayed the utilization traces through the trace emulator. In addition, [...]

[...]ness (aludra, email, msg-mmp, wikibench), we experience relatively low response time impact (<10%).

For moderately bursty workloads (girtab, msg-store, scf), we experience latency impact within 25% of the Atom-based Knight. For these workloads, the majority of the latency impact occurs during the transition from the Knight to primary server when the Knight is handling requests that it cannot handle until the primary server is ready. These bursty behaviors tend to be periodic, thus it would be possible for KnightShift to learn day-to-day utilization patterns and proactively switch to the primary server to handle these high utilization bursty periods, negating the high latency impact. This topic is outside the scope of the paper and will be explored in future work.

For very bursty workloads with high utilization (msg-mx, nunki), we experience the most latency impact, as expected. KnightShift does not handle scenarios where the workload switches quickly between very low and high utilization. In these scenarios, the workload may benefit from a higher capacity Knight.

Almost all workloads experience energy saving benefits from KnightShift with the exception of workloads with mostly high utilization periods. There are no benefits from using KnightShift for workloads that mostly operate at utilization above the capability of the Knight, hence such workloads don't need KnightShift support to begin with. For these cases, this may even lead to an energy penalty (msg-mmp) due to running the Knight alongside a heavily utilized primary server.

For most other workloads (aludra, email, girtab, scf, wikibench), we can experience an average of 75% energy savings with tail latency within 9% of the Atom-based server.

Sensitivity to Knight Capability: Figure 10 shows the effect of Knight capability levels on energy savings and 95th percentile response time. As Knight capability increases up to 50%, so does [...]

REFERENCES
[1] F. Ryckbosch, S. Polfliet, and L. Eeckhout, "Trends in Server Energy Proportionality," Computer, 2011.
[2] L. A. Barroso and U. Hölzle, "The Case for Energy-Proportional Computing," Computer, Dec. 2007.

ACKNOWLEDGEMENTS
This work was supported by DARPA and NSF.
Daniel Wong, Murali Annavaram
University of Southern California
{wongdani,annavara}@usc.edu

ABSTRACT
Server energy proportionality has been improving over the past several
years. Many components in a system, such as CPU, have been achieving
good energy proportionality behavior. Using a wide range of server power
data from the published SPECpower data we show that the overall system
energy proportionality has reached 80%. We present two novel metrics,
linear deviation and proportionality gap, that provide insights into
accurately quantifying energy proportionality. Using these metrics we
show that energy proportionality improvements are not uniform across
various server utilization levels. In particular, the energy proportionality of
even a highly proportional server suffers significantly at non-zero but low
utilizations. We propose to tackle the lack of energy proportionality at low
utilization using server-level heterogeneity. We present KnightShift, a
server-level heterogeneous server architecture that introduces an active
low power mode, through the addition of a tightly-coupled compute node
called the Knight, enabling two energy-efficient operating regions. We
evaluated KnightShift against a variety of real-world datacenter workloads
using a combination of prototyping and simulation, showing up to 75%
energy savings with tail latency bounded by the latency of the Knight and
up to 14% improvement to Performance per TCO dollar spent.