Evaluation of Data Placement Method in Database Run-Time Processing Considering Energy Saving and Application Performance
Naho IIMURA†, Miyuki NAKANO††, Norifumi NISHIKAWA‡, Masato OGUCHI†
†Ochanomizu University
‡Hitachi, Ltd., Yokohama Research Laboratory
††Shibaura Institute of Technology
November 3rd, 2014, IGCC14 GPCDP Workshop

OUTLINE
• Introduction
• Previous Work
• Our Proposed Method and Evaluation Plan
  – Data Placement Control
• Experiments
• Evaluation Results and Discussion
• Conclusion and Future Works

Background (1/2)
• The amount of digital data is increasing rapidly.
• Datacenters (DCs) are growing in scale.
→ The cost of managing and operating DCs is growing as well.
  • Operating costs of machines
  • Amortization of buildings and cooling equipment
  • Electricity charges
Energy saving in DCs is drawing attention now.

Background (2/2)
[Pie chart: Typical Data Center Energy Consumption Rate. Legend: Cooling, Servers, Storages, Network Hardware, Power Conversion. Slice labels: 63%, 14%, 13%, 5%, 5%.]
Reducing the power consumption of storage is an efficient way to save energy in datacenters.

Research Objective
• Conventional methods have already been addressed
  – Performance enhancement of cooling facilities
  – Improvement of power efficiency, and more
• Our proposed method
  – Power saving of storage through efficient management of data

Goal of Research
Reduce the energy consumption of storage while minimizing the deterioration of database application performance.
TPC-H is one of the standard benchmark tools for database applications.
To achieve this:
• Analyze the power consumption and system performance of the TPC-H runtime
• Propose a storage power saving method

Previous Works
"Energy Efficient Storage Management Cooperated with Large Data Intensive Applications", Nishikawa, et al.
(IEEE ICDE 2012)
• Combine application-level I/O and device-level I/O
• Extract the logical I/O pattern
• Select appropriate power saving methods
[Figure: Runtime Power Saving Framework. I/O stack: Applications, DB Engine, DB Buffer, File System Buffer, Device Driver, Storage Buffer, Storage Devices. Framework labels: Application-level I/O Monitor; Buffer-based Power Saving Methods (Pre-load, Write Delay); Storage Device-level I/O Monitor; Control MAID (Power ON/OFF).]

Present Work Direction
• Experiment environment
  – Large-scale unit → single unit, to analyze power saving in more detail at a smaller scale
• For the storage power saving of the TPC-H runtime
  – Calculate the Break-Even Time
  – Evaluate our proposed method, focusing on the Service Level Agreement (SLA)
    • Data Placement Control

Our Proposed Method
• Data Placement Control
  – Based on the I/O frequency of run-time applications, modify the data placement
  – Disks holding less frequently used data get long I/O intervals
  – Change a disk to Standby mode when it is not being used
As a result, power consumption can be reduced further.

Evaluation Plan
• Investigate the I/O frequency of data during runtime processing of TPC-H queries
• Based on the I/O frequency, modify the data placement
  – The number of HDDs used is 3 to 10
• Compare the performance and energy consumption during runtime processing of TPC-H queries WITH and WITHOUT data placement control

Evaluation Environment
  CPU:         4 cores * 2
  Memory:      8 GBytes
  HDD:         3 TB * 11
  OS:          CentOS 5.10 64bit
  DBMS:        HITACHI HiRDB Single Server Version 9
  Benchmark:   TPC-H (SF=10)
  Power Meter: YOKOGAWA Digital Power Meter
[Figure: system layout. Labels: Univ. of Tokyo, Remote Server, Power Meter, PC to Control Power Meter, Local PC, Ochanomizu Univ.]

Break-Even Time
• Break-Even Time is the length of time the disk must remain in the Standby state to satisfy the following condition.
• The energy needed for the spin-up or spin-down of the disk is equal to the energy saved by remaining in the Standby state for the Break-Even Time.
Break-Even Time: 24 seconds
To reduce power consumption by using the Standby state, an I/O interval of approximately 24 seconds or more is needed.

The Investigation of I/O Frequency
• Investigate the I/O frequency of data (tables and indexes) during runtime processing of TPC-H queries.
  – Divide the LINEITEM table and indexes into 10 buffers.
  – The I/O interval is obtained every second.
  – The survey period is from the beginning to the end of query execution.
  – Focus on the actual number of READs in the obtained data.

Classified Data From I/O Frequency
• The data are classified into two types:
  – data that HAVE actual READs and data that do NOT
• The next slide focuses on the I/O frequency of partitioned data

I/O Frequency of Partitioned Data
• Partitioned tables are used in numerical order
• Longer I/O intervals are obtainable by placing partitioned data with adjacent numbers on the same disk

First Experiment
Data placement control with table partitioning
• Divide the LINEITEM table and indexes into 10 buffers
  – To allow a more flexible arrangement of data
  – Use hash partitioning
• Place all data on 3 HDDs
  – Design 2 placement patterns for the partitioned data.
• Compare the 2 placement patterns during runtime processing of TPC-H
  – Number and duration of I/O intervals
  – Power consumption and response time

Two Patterns of Data Placement
A) Partitioned data is placed by round-robin placement
   Partitions per HDD: 1,4,7,10 | 2,5,8 | 3,6,9
B) Partitioned data with adjacent numbers are placed on the same HDD
   Partitions per HDD: 1,2,3 | 4,5,6 | 7,8,9,10

Number and Times of I/O Intervals
  I/O interval                  A) Round-Robin   B) Near-Number
  ~24 sec (Break-Even Time)     111              12
  25~100 sec                    59               35
  101~200 sec                   9                14
  201 sec~                      8                14
• Pattern A has many more short I/O intervals (shorter than the Break-Even Time).
• Pattern B gets longer I/O intervals.
• Near-Number Placement of partitioned data is efficient in this case.

First Experiment Result
[Figure: bar charts of Power Consumption [J] and Response Time [mm:ss], without and with the Standby state, for (1) A and (2) B. Data labels: power 98,880; 98,789; 84,244; 65,972 J; response time 82:58; 85:13; 92:43; 105:52.]
• Power consumption is reduced more in B because its I/O intervals are longer than in A.
• The delay rate of response time is smaller in B than in A because the seek overhead is smaller.
• Near-Number Placement of partitioned data is efficient in this case.
                                         A) Round-Robin   B) Near-Number
  Reduction rate of power consumption    15%              33%
  Delay rate of response time            22%              8%

Second Experiment
Data placement control using 10 HDDs
• The number of HDDs is 10
• Compare WITH and WITHOUT data placement control during runtime processing of TPC-H queries
  – Power consumption and response time

Second Experiment Data Placement (1/2)
• Without Control
  – All data are placed so that the amount of data on each HDD is even.
  – The frequency of data I/O is not considered in this case.
  – Data with the same partition number (e.g., LINEITEM_1 and LINEITEM_Index_1) are placed on the same HDD.
  – The energy state of all disks is Idle or Active.
[Pie chart: The amount of data in each HDD (HDD1 to HDD10). Slice labels: 21%, 16%, 12%, 8%, 8%, 7%, 7%, 7%, 7%, 7%.]

Second Experiment Data Placement (2/2)
• With Control
  – HDD1: holds the data that have I/O
  – HDD2: holds the data that have no I/O
  – HDD3-10: no data is placed
[Figure: biased placement. Labels: Idle or Active; Standby; HDD1 (Have I/O); HDD2 (No I/O); HDD3 to HDD10 (No data).]

Second Experiment Result
[Figure: bar charts. Power Consumption [J]: 332,649 without control, 93,070 with control. Response Time [mm:ss]: 92:04 without control, 96:24 with control.]
• Power consumption is reduced by 72%
• The delay of response time is 4%
• This result is reasonable because only one HDD holds the data that have I/O, and its power state is Idle or Active.

Conclusion
• Proposed a data placement control method for database run-time processing
  – Modifies the data placement based on I/O frequency
  – Considers energy saving and application performance
• Evaluated the proposed method with TPC-H
  – Found that the data placement control method is effective for energy saving during runtime application processing

Future Works
• Examination of more detailed data placement
• Investigation of the trade-off between power consumption and response time

Acknowledgement
• We are grateful for the conscientious advice on this work from:
  – Institute of Industrial Science, the University of Tokyo
    • Associate Prof. Daisaku Yokoyama
  – Kogakuin University
    • Associate Prof.
Saneyasu Yamaguchi
  – Institute of Information Security
    • Prof. Atsuhiro Goto
  – Shibaura Institute of Technology
    • Associate Prof. Midori Sugaya
• This work is partly supported by the Ministry of Education, Culture, Sports, Science and Technology

END
Thank you for your kind attention.
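
The Break-Even Time definition and the two placement patterns of the First Experiment can be illustrated with a minimal sketch. It is not the authors' tooling. It assumes that partitions 1 to 10 are each read once, in numerical order, for a fixed hypothetical duration (`scan_seconds`), and that a disk can use the Standby state only in idle gaps of at least the Break-Even Time. The function names, the disk numbering, and the wattage and energy figures in the usage lines are all hypothetical; only the 24-second threshold and the partition-to-HDD layouts come from the slides.

```python
from typing import Dict, List

# Break-Even Time reported on the slides.
BREAK_EVEN_SECONDS = 24.0

# Placement patterns A and B from the slides (disk -> partition numbers).
# The disk numbering 1..3 is an assumption made for illustration.
ROUND_ROBIN: Dict[int, List[int]] = {1: [1, 4, 7, 10], 2: [2, 5, 8], 3: [3, 6, 9]}
NEAR_NUMBER: Dict[int, List[int]] = {1: [1, 2, 3], 2: [4, 5, 6], 3: [7, 8, 9, 10]}


def break_even_time(transition_energy_j: float,
                    idle_power_w: float,
                    standby_power_w: float) -> float:
    """Seconds in Standby after which the saved energy equals the
    spin-down/spin-up energy: T = E_transition / (P_idle - P_standby)."""
    if idle_power_w <= standby_power_w:
        raise ValueError("idle power must exceed standby power")
    return transition_energy_j / (idle_power_w - standby_power_w)


def idle_gaps(placement: Dict[int, List[int]],
              scan_seconds: float) -> Dict[int, List[float]]:
    """Idle gaps per disk when partitions are read once, in numerical
    order, each taking scan_seconds (a simplifying assumption)."""
    total = sum(len(parts) for parts in placement.values()) * scan_seconds
    gaps: Dict[int, List[float]] = {}
    for disk, parts in placement.items():
        disk_gaps: List[float] = []
        prev_end = 0.0
        for p in sorted(parts):
            start, end = (p - 1) * scan_seconds, p * scan_seconds
            if start > prev_end:          # disk was idle before this read
                disk_gaps.append(start - prev_end)
            prev_end = end
        if total > prev_end:              # idle until the query finishes
            disk_gaps.append(total - prev_end)
        gaps[disk] = disk_gaps
    return gaps


def count_standby_gaps(placement: Dict[int, List[int]],
                       scan_seconds: float,
                       threshold: float = BREAK_EVEN_SECONDS) -> int:
    """Number of idle gaps long enough to make Standby worthwhile."""
    return sum(1
               for disk_gaps in idle_gaps(placement, scan_seconds).values()
               for gap in disk_gaps
               if gap >= threshold)


if __name__ == "__main__":
    # Hypothetical disk figures, not measurements from the slides.
    print("break-even [s]:", break_even_time(96.0, 5.0, 1.0))
    # Hypothetical 10-second scan per partition.
    print("A) round-robin gaps >= 24 s:", count_standby_gaps(ROUND_ROBIN, 10.0))
    print("B) near-number gaps >= 24 s:", count_standby_gaps(NEAR_NUMBER, 10.0))
```

Under these assumptions, round-robin placement fragments each disk's idle time into gaps shorter than the threshold, while near-number placement merges it into a few long gaps. This is the same qualitative effect the I/O interval counts on the slides show.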
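
The reduction rates quoted on the First and Second Experiment Result slides can be checked against the energy figures on the same slides with a short calculation. This is a sketch only: the helper name `reduction_rate` is hypothetical, and the pairing of the two "without Standby" values (98,880 J and 98,789 J) with patterns A and B is an assumption read off the chart order. Either pairing rounds to the same percentages.

```python
def reduction_rate(without_j: float, with_j: float) -> float:
    """Fraction of energy saved relative to the run without power saving."""
    return (without_j - with_j) / without_j


# Energy values in joules taken from the result slides.
# Pairing of the first-experiment baselines with A and B is an assumption.
FIRST_A = reduction_rate(98_880, 84_244)    # A) Round-Robin Placement
FIRST_B = reduction_rate(98_789, 65_972)    # B) Near-Number Placement
SECOND = reduction_rate(332_649, 93_070)    # 10 HDDs, with vs. without control

if __name__ == "__main__":
    for label, rate in [("First, A", FIRST_A),
                        ("First, B", FIRST_B),
                        ("Second", SECOND)]:
        print(f"{label}: {rate:.1%}")
```

Rounded to whole percentages, the three rates agree with the 15%, 33%, and 72% stated on the slides.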