Slides

Evaluation of Data Placement Method
in Database Run-Time Processing
Considering Energy Saving and
Application Performance
Naho IIMURA†
Miyuki NAKANO††
Norifumi NISHIKAWA‡
Masato OGUCHI†
†Ochanomizu University
‡ Hitachi, Ltd., Yokohama Research Laboratory
†† Shibaura Institute of Technology
NOVEMBER 3rd, 2014
IGCC14 GPCDP Workshop
OUTLINE
• Introduction
• Previous Work
• Our Proposed Method and Evaluation Plan
– Data Placement Control
• Experimental
• Evaluation Results and Discussion
• Conclusion and Future Works
Background(1/2)
• The amount of digital data is increasing rapidly.
• The scale of datacenters (DC) has become
larger.
→ The management cost of DC has become
larger.
•Operating costs of machines
•Amortization of buildings and cooling equipment
•Electricity charges
Energy saving in DC is
drawing attention NOW
Background(2/2)
Typical Data Center Energy Consumption Rate
[Pie chart. Legend: Cooling, Servers, Storages, Network Hardware, Power Conversion. Segment values: 63%, 14%, 13%, 5%, 5%]
Reducing the power consumption of storage is
an efficient way to save energy in datacenters
Research Objective
• Conventional methods
have already been
addressed
– Performance enhancement of cooling facilities
– Performance enhancement of power efficiency
and more...
• Our Proposed Method
– Power saving of storage
through efficient management of data
Goal of Research
Reducing Energy Consumption of Storage
while Minimizing the Deterioration
of Database Application Performance
TPC-H is one of the standard benchmark tools
for database applications
To achieve this …
• Analyze the power consumption and system
performance of the TPC-H runtime
• Propose a storage power saving method
Previous Works
Energy Efficient Storage Management Cooperated with Large Data Intensive
Applications
Nishikawa et al. (IEEE ICDE 2012)
• Combine application level I/O and device level I/O
• Extract Logical I/O pattern
• Select appropriate power saving methods
[Diagram: I/O stack, top to bottom] Applications, DB Engine, DB Buffer, File System Buffer, Device Driver, Storage Buffer, Storage Devices
Runtime Power Saving Framework
• Application level I/O Monitor
• Buffer based Power Saving Methods
– Pre-load
– Write Delay
• Storage Device level I/O Monitor
• Control MAID
– Power ON/OFF
Present Work Direction
• Experiment Environment
– Large scale unit → Single unit
To analyze power saving in more detail
on a smaller unit
• For the storage power saving of the TPC-H runtime
– Calculate the Break-Even Time
– Evaluate our proposed method focusing on the
Service Level Agreement (SLA)
• Data Placement Control
Our Proposed Method
• Data Placement Control
– Based on the I/O frequency of run-time applications, modify
the data placement
[Diagram: data ordered by frequency of use (<, <); long I/O interval]
Change the disk to Standby mode when it is not being used
As a result, power consumption can be reduced further
Evaluation Plan
Investigate the I/O frequency of data during runtime
processing of TPC-H queries
Based on the I/O frequency, modify the data placement
The number of HDDs used
is 3 to 10
Compare the performance and energy consumption during
runtime processing of TPC-H queries WITH and
WITHOUT data placement control
Evaluation Environment
CPU: 4 cores * 2
Memory: 8GBytes
HDD: 3TB * 11
OS: CentOS 5.10 64bit
DBMS: HITACHI HiRDB Single Server Version 9
Benchmark: TPC-H (SF=10)
Power Meter: YOKOGAWA Digital Power Meter
[Diagram labels: Univ. of Tokyo, Ochanomizu Univ., Remote Server, Local PC, PC to Control Power Meter, Power Meter]
Break-Even Time
• Break-Even Time is the length of time the disk must stay in the
Standby state to satisfy the following condition.
• The energy needed for the spin-down and spin-up
of the disk equals the energy saved by remaining
in the Standby state during the Break-Even Time.
Break-Even Time: 24 seconds
To reduce power consumption by using the Standby state,
an I/O interval of approximately 24 seconds or more is needed.
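The break-even condition can be sketched as a small calculation. This is only an illustration: the function names and the wattage and energy figures are hypothetical, not measurements from this work, and the model assumes the saving per second is simply the difference between Idle and Standby power.

```python
def break_even_time(transition_energy_j: float,
                    idle_power_w: float,
                    standby_power_w: float) -> float:
    """Seconds in Standby needed to pay back the spin-down/spin-up energy."""
    saved_per_second = idle_power_w - standby_power_w
    if saved_per_second <= 0:
        raise ValueError("Standby must draw less power than Idle")
    return transition_energy_j / saved_per_second


def worth_standby(io_interval_s: float, break_even_s: float) -> bool:
    """True when an idle gap is long enough for Standby to save energy."""
    return io_interval_s >= break_even_s


# Hypothetical disk figures, NOT values measured in this work.
bet = break_even_time(transition_energy_j=90.0,
                      idle_power_w=7.0,
                      standby_power_w=1.0)
print(bet)                       # 15.0
print(worth_standby(30.0, bet))  # True
print(worth_standby(10.0, bet))  # False
```

With the 24-second Break-Even Time reported on this slide, `worth_standby(interval, 24)` would accept only I/O intervals of 24 seconds or more.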
The Investigation of I/O Frequency
• Investigate the I/O frequency of data, tables
and indexes during runtime processing of
TPC-H queries.
– Divide the LINEITEM table and indexes into 10 buffers.
– The I/O interval is obtained every second.
– The survey period is from the beginning to the end
of the query execution.
– Focus on the actual number of READs
among the obtained data items.
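One way to derive I/O intervals from per-second samples is sketched below. It assumes the monitor yields one READ count per second for a data item; the function name and the sample values are hypothetical.

```python
def io_intervals(reads_per_second):
    """Return lengths (in seconds) of consecutive runs with zero READs."""
    intervals = []
    gap = 0
    for count in reads_per_second:
        if count == 0:
            gap += 1          # still inside an idle run
        else:
            if gap > 0:
                intervals.append(gap)
            gap = 0           # a READ ends the idle run
    if gap > 0:
        intervals.append(gap)  # idle run that lasts until the end
    return intervals


# Hypothetical per-second READ counts for one data item.
samples = [5, 0, 0, 0, 2, 1, 0, 0, 4, 0]
print(io_intervals(samples))  # [3, 2, 1]
```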
Classified Data From I/O Frequency
• The data is classified into two types:
– data that HAS actual READs
and data that does NOT
• The next slide focuses on the I/O frequency
of the partitioned data
I/O Frequency of Partitioned Data
•Partitioned tables are accessed in numerical order
•Longer I/O intervals can be obtained by placing partitioned data
with near numbers on the same disk
First Experiment
Data placement control with table partitioning
• Divide the LINEITEM table and indexes into 10
buffers
– To allow a more flexible arrangement of data
– Use hash partitioning
• Place all data on 3 HDDs
– Design 2 placement patterns for the partitioned
data.
• Compare the 2 placement patterns during runtime
processing of TPC-H
– Number of I/O intervals
– Power Consumption and Response Time
Two Patterns of Data Placement
A) Partitioned data is placed by round-robin placement
Partitions per HDD: (1,4,7,10) (2,5,8) (3,6,9)
B) Partitioned data with near numbers is placed on the same
HDD
Partitions per HDD: (1,2,3) (4,5,6) (7,8,9,10)
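The two placement patterns can be generated as in the sketch below. The function names are hypothetical, and the near-number split assumes the last HDD holds partitions 7 to 10, giving a 3/3/4 split.

```python
def round_robin_placement(num_partitions, num_disks):
    """Pattern A: partition p goes to disk (p - 1) mod num_disks."""
    disks = [[] for _ in range(num_disks)]
    for p in range(1, num_partitions + 1):
        disks[(p - 1) % num_disks].append(p)
    return disks


def near_number_placement(num_partitions, num_disks):
    """Pattern B: consecutive partition numbers share a disk."""
    base, extra = divmod(num_partitions, num_disks)
    # Assumption: the later disks absorb the remainder (3/3/4 for 10 on 3).
    sizes = [base] * (num_disks - extra) + [base + 1] * extra
    disks, start = [], 1
    for size in sizes:
        disks.append(list(range(start, start + size)))
        start += size
    return disks


print(round_robin_placement(10, 3))  # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
print(near_number_placement(10, 3))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
```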
Number and Times of I/O Intervals
Number of I/O intervals by length
I/O interval                 A) Round-Robin Placement   B) Near-Number Placement
~24sec (Break-Even Time)     111                        12
25~100sec                    59                         35
101~200sec                   9                          14
201sec~                      8                          14
• Pattern A has many more short (shorter than
the Break-Even Time) I/O intervals.
• Pattern B gets longer I/O intervals.
• Placing partitioned data by Near-Number Placement is efficient in this case.
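Counting intervals per length range, as on this slide, might look like the following sketch. The bin edges follow the slide's ranges; the function name and the sample intervals are hypothetical.

```python
BREAK_EVEN_S = 24  # Break-Even Time reported in these slides


def bin_intervals(intervals_s):
    """Count I/O intervals (whole seconds) in the four ranges of the slide."""
    bins = {"<=24s": 0, "25-100s": 0, "101-200s": 0, ">=201s": 0}
    for t in intervals_s:
        if t <= BREAK_EVEN_S:
            bins["<=24s"] += 1       # too short for Standby to pay off
        elif t <= 100:
            bins["25-100s"] += 1
        elif t <= 200:
            bins["101-200s"] += 1
        else:
            bins[">=201s"] += 1
    return bins


# Hypothetical interval lengths in seconds.
print(bin_intervals([3, 24, 25, 100, 150, 201, 400]))
# {'<=24s': 2, '25-100s': 2, '101-200s': 1, '>=201s': 2}
```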
First Experiment
Result
Power Consumption [J]
                             Without Standby state   With Standby state   Reduction Rate
A) Round-Robin Placement     98,880                  84,244               15%
B) Near-Number Placement     98,789                  65,972               33%
Response Time [mm:ss]
                             Without Standby state   With Standby state   Delay Rate
A) Round-Robin Placement     82:58                   105:52               22%
B) Near-Number Placement     85:13                   92:43                8%
•Power consumption is reduced more in B
because its I/O intervals are longer than in A.
•The delay rate of the response time of B
is smaller than that of A
because the seek overhead is smaller.
Placing partitioned data by
Near-Number Placement is efficient
in this case.
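The reported rates can be checked with a short calculation. Two points here are assumptions rather than statements from the slides: the pairing of response times (82:58 and 105:52 for A, 85:13 and 92:43 for B), and the delay being taken relative to the with-Standby time, which is the denominator that reproduces 22% and 8%. The function names are hypothetical.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def reduction_rate(without_j, with_j):
    """Fraction of energy saved relative to running without Standby."""
    return (without_j - with_j) / without_j


def delay_rate(without_s, with_s):
    # Assumption: delay relative to the with-Standby time (inferred,
    # not stated in the slides).
    return (with_s - without_s) / with_s


print(round(100 * reduction_rate(98880, 84244)))  # 15  (pattern A)
print(round(100 * reduction_rate(98789, 65972)))  # 33  (pattern B)
print(round(100 * delay_rate(to_seconds("82:58"), to_seconds("105:52"))))  # 22
print(round(100 * delay_rate(to_seconds("85:13"), to_seconds("92:43"))))   # 8
```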
Second Experiment
Data placement control using 10 HDDs
• The number of HDDs is 10
• Compare runtime processing of TPC-H
queries With and Without Data
Placement Control
– Power Consumption and Response Time
Second Experiment
Data Placement (1/2)
• Without Control
– All data is placed so that the amount of data on each HDD
is even.
– The I/O frequency of the data is not considered in this
case.
[Diagram: HDD1 to HDD10. Data with the same partition number (e.g. LINEITEM_1 and LINEITEM_Index_1) is placed on the same HDD. The energy state of all disks is Idle or Active]
[Pie chart: The amount of data in each HDD (HDD1 to HDD10). Segment values: 21%, 16%, 12%, 8%, 8%, 7%, 7%, 7%, 7%, 7%]
Second Experiment
Data Placement (2/2)
• With Control
– HDD1: holds the data that has I/O
– HDD2: holds the data that has no I/O
– HDD3-10: no data is placed
[Diagram: biased placement. HDD1 (Have I/O) is Idle or Active; HDD2 (No I/O) and HDD3 to HDD10 (No data) are in Standby]
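The "With Control" placement can be sketched as a split on observed READ counts. The function name, the item names, and the counts below are hypothetical; only the rule (data with I/O on HDD1, data without I/O on HDD2) comes from the slide.

```python
def place_with_control(read_counts):
    """Map each data item to HDD1 (has READs) or HDD2 (no READs)."""
    placement = {"HDD1": [], "HDD2": []}
    for name, reads in read_counts.items():
        target = "HDD1" if reads > 0 else "HDD2"
        placement[target].append(name)
    return placement


# Hypothetical READ totals per data item over one query run.
observed = {"part_a": 1200, "part_b": 0, "part_c": 35, "part_d": 0}
print(place_with_control(observed))
# {'HDD1': ['part_a', 'part_c'], 'HDD2': ['part_b', 'part_d']}
```

With this split, only HDD1 needs to stay in the Idle or Active state during the run.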
Second Experiment
Result
Power Consumption [J]
Without Control: 332,649
With Control: 93,070
Response Time [mm:ss]
Without Control: 92:04
With Control: 96:24
•Power consumption is reduced by 72%
•The delay of the response time is 4%
•This result is reasonable because only one HDD holds the data
that has I/O, so only that HDD is in the Idle or Active state.
Conclusion
• Proposed a data placement control method for
database run-time processing
– Based on I/O frequency, modify the data
placement
– Consider energy saving and application
performance
• Evaluated our proposed method with TPC-H
– Found that the data placement control method is
effective for energy saving during runtime
application processing
Future Works
• Examination of more detailed data placement
• Investigation of the trade-off between
power consumption and response time
Acknowledgement
• We thank the following people for their conscientious advice on this work
– Institute of Industrial Science, the University of Tokyo
• Associate Prof. Daisaku Yokoyama
– Kogakuin University
• Associate Prof. Saneyasu Yamaguchi
– Institute of Information Security
• Prof. Atsuhiro Goto
– Shibaura Institute of Technology
• Associate Prof. Midori Sugaya
• This work is partly supported by the Ministry of
Education, Culture, Sports, Science and Technology
END
Thank you for your kind attention.