Performance and Energy Efficiency Evaluation of Big Data Systems
Presented by Yingjie Shi
Institute of Computing Technology, CAS
2013-10-31

Goals of Big Data Systems
• Larger
• Faster
• Greener

BPOE 2013 | HPCChina 2013

Performance vs. Energy Efficiency
• Performance: Faster & More Powerful
  – More servers
  – Bigger clusters
  – Powerful processors
  – Sophisticated processing algorithms
  – …
• Energy Efficiency: Greener & Cheaper
  – Lightweight servers
  – Efficient processors
  – Simpler processing algorithms
  – …
• Tradeoff between the two: Evaluation

Evaluation of Performance & Energy Efficiency Tradeoff
• How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems
• How to get balance? The Implications from Benchmarking Three Big Data Systems

Motivation
• "If you can not measure it, you can not improve it." – Lord Kelvin
• PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.

PUE & Its Variants
Metric            Time   Organization   Computing Formula
PUE               2007   GreenGrid      Total Facility Energy / IT Equipment Energy
DCiE              2008   GreenGrid      (IT Equipment Energy / Total Facility Energy) * 100%
DCeP              2008   GreenGrid      Useful Work Produced / Total Quantity of Resource Consumed Producing this Work
pPUE              2012   GreenGrid      Total Facility Energy inside the Boundary / IT Equipment Energy inside the Boundary
PUE Scalability   2013   GreenGrid      (mActual / mPUE) * 100%

Motivation
• Scenario 1 (Data Management Researcher): An Improved Data Classification Algorithm. Does it contribute to greening the data centers?
  – Run the algorithms on the data center
  – Compare the PUEs
  – No obvious variations!
• PUE cannot measure the effectiveness of any changes made upon the data center infrastructure!
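Scenario 1 can be illustrated with a minimal sketch of the PUE and DCiE formulas from the variants table. The function names and every energy reading below are hypothetical, and the sketch assumes that facility overhead (cooling, power distribution) scales in proportion to IT energy; under that assumption a faster algorithm lowers both energies but leaves PUE unchanged.

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy."""
    if it_equipment_energy_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy_kwh / it_equipment_energy_kwh


def dcie(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100 (percent)."""
    if total_facility_energy_kwh <= 0:
        raise ValueError("total facility energy must be positive")
    return it_equipment_energy_kwh / total_facility_energy_kwh * 100.0


# Hypothetical readings in kWh for the same job, before and after deploying
# an improved algorithm. These numbers are invented for illustration only.
baseline = {"total": 1500.0, "it": 1000.0}
improved = {"total": 1200.0, "it": 800.0}

pue_baseline = pue(baseline["total"], baseline["it"])
pue_improved = pue(improved["total"], improved["it"])

# Both energies dropped by 20%, yet the ratio between them did not move.
print(f"PUE before: {pue_baseline:.2f}, PUE after: {pue_improved:.2f}")
print(f"DCiE before: {dcie(baseline['total'], baseline['it']):.1f}%")
```

Because PUE is a ratio of two energies, any change that reduces both by the same factor is invisible to it, which is the gap the application-level metrics are meant to close.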
Motivation
• Scenario 2 (Data Center Administrators): Give a budget plan of the data center energy consumption for the next year.
  – Estimate the data volume based on the business development
  – How to estimate the energy increase?
• PUE provides little reference information for data center planning according to data scale and application complexity.

Calculation Framework
[Figure: calculation framework relating AxPUE and PUE]

Definition - ApPUE
• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment; specifically, how much of the power entering IT equipment is used to improve the application performance.
• Computation Formulas:
    ApPUE = Application Performance / IT Equipment Power
  – Application Performance: data processing performance of applications
  – IT Equipment Power: the average rate of IT equipment energy consumed

Definition - AoPUE
• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system; specifically, how much of the total facility power is used to improve the application performance.
• Computation Formulas:
    AoPUE = Application Performance / Total Facility Power
    AoPUE = ApPUE / PUE
  – Total Facility Power: the average rate of total facility energy used

Acquisition – Application Performance
Application Category                Examples                                                 Metric
Service Application                 Search engine, ad-hoc queries                            Number of requests answered in unit time
Data Analysis Application           Data mining, reporting, decision support, log analysis   Volume of data processed in unit time
Interactive Real-time Application   E-commerce, profile data management                      Number of transactions completed in unit time
High Performance Computing          Scientific computing                                     Number of floating-point operations in unit time

Acquisition – Benchmark
• Requirements of Benchmarks
  – Provide representative workloads for big data applications
  – Provide a scalable data generation tool
• BigDataBench
  – A big data benchmark suite open-sourced recently and publicly available
  – All the requirements are well fulfilled

Experiment Overview
• Testbed
  – Data center of 18 racks, 362 servers
  – Sample 8 servers
• Workloads
• Two experiments
  – Different Applications
  – Different Implementation Algorithms

Experiments on Different Applications
[Figure: PUE, ApPUE and AoPUE for the BigDataBench workloads (SVM, Sort, Grep) and Linpack; data labels shown: 17.2, 11.5, 269.9, 179.7]

Experiments on Different Algorithms
• Two Implementations for Sort
  – Several reducers with random sampling partitioning
  – One reducer without partitioning
[Figure: PUE(Sort1), ApPUE(Sort1), PUE(Sort2) and ApPUE(Sort2) plotted against data size]

Conclusions
• We analyze the requirements of application-level energy effectiveness metrics (AxPUE) in data centers.
• We propose two novel application-level metrics, ApPUE and AoPUE, to measure the energy consumed to improve the application performance.
• The experimental results show that AxPUE could provide meaningful guidance to data center design and optimization.
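The ApPUE and AoPUE definitions can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function names, the throughput unit (MB/s) and all measurements are hypothetical, and PUE is computed here from average power over one shared measurement window, which equals the energy ratio only if both powers are averaged over that same window.

```python
import math


def ap_pue(application_performance: float, it_equipment_power_w: float) -> float:
    """ApPUE = Application Performance / IT Equipment Power."""
    return application_performance / it_equipment_power_w


def ao_pue(application_performance: float, total_facility_power_w: float) -> float:
    """AoPUE = Application Performance / Total Facility Power."""
    return application_performance / total_facility_power_w


# Hypothetical measurements over one shared window (invented numbers).
performance_mb_per_s = 120.0   # volume of data processed in unit time
it_power_w = 2000.0            # average IT equipment power
total_power_w = 3000.0         # average total facility power

pue = total_power_w / it_power_w
ap = ap_pue(performance_mb_per_s, it_power_w)
ao = ao_pue(performance_mb_per_s, total_power_w)

# The two metrics are linked through PUE: AoPUE = ApPUE / PUE.
identity_holds = math.isclose(ao, ap / pue)
print(f"PUE={pue:.2f}  ApPUE={ap:.3f}  AoPUE={ao:.3f}  identity holds: {identity_holds}")
```

Unlike PUE, both values change when the application processes more data for the same power, so a software improvement shows up in ApPUE and AoPUE even when PUE stays flat.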
Evaluation of Performance & Energy Efficiency Tradeoff
• How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers
• How to get balance? The Implications from Benchmarking Three Big Data Systems

New Solutions
……

Experimental Platforms
• Xeon (common processor)
• Atom (low-power processor)
• Tilera (many-core processor)

Brief Comparison – Basic Information
               Xeon                 Atom                 Tilera
CPU Type       Intel Xeon E5310     Intel Atom D510      Tilera TilePro36
CPU Core       4 cores @ 1.6GHz     2 cores @ 1.66GHz    36 cores @ 500MHz
L1 I/D Cache   32KB                 24KB                 16KB/8KB
L2 Cache       4096KB               512KB                64KB

Benchmark Selection
• BigDataBench
  – A big data benchmark suite from big data applications
  – Representative applications
  – An innovative data generation tool

Application   Time Complexity   Characteristics
Sort          O(n*log2(n))      Integer comparison
WordCount     O(n)              Integer comparison and calculation
Grep          O(n)              String comparison
Naïve Bayes   O(m*n)            Floating-point computation
SVM           O(n^3)            Floating-point computation

Metrics
• Performance: Data processed per second (DPS)
• Energy Efficiency: Application Performance Power Usage Effectiveness (DPJ)

General Observations
[Figure: DPS and DPJ on Xeon, Atom and Tilera]
• Data scale has a significant impact on the performance and energy efficiency of big data systems.
• The performance and energy efficiency trends of different applications are diverse.
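The two metrics from the Metrics slide can be sketched as follows. This assumes that DPJ stands for data processed per joule, which the slides do not spell out; the `Run` type, the platform names and all measurements are hypothetical and serve only to show how a cross-platform ratio would be formed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark run on one platform (hypothetical record layout)."""
    data_bytes: float   # volume of input data processed
    elapsed_s: float    # wall-clock time of the run
    energy_j: float     # energy drawn by the platform during the run


def dps(run: Run) -> float:
    """Data processed per second."""
    return run.data_bytes / run.elapsed_s


def dpj(run: Run) -> float:
    """Data processed per joule (assumed expansion of DPJ)."""
    return run.data_bytes / run.energy_j


# Invented numbers: platform A is faster, platform B draws less energy overall.
platform_a = Run(data_bytes=10e9, elapsed_s=100.0, energy_j=20_000.0)
platform_b = Run(data_bytes=10e9, elapsed_s=400.0, energy_j=16_000.0)

dps_ratio = dps(platform_a) / dps(platform_b)
dpj_ratio = dpj(platform_a) / dpj(platform_b)

# A DPS ratio above 1 means A is faster; a DPJ ratio below 1 means B is greener.
print(f"DPS ratio A/B = {dps_ratio:.2f}, DPJ ratio A/B = {dpj_ratio:.2f}")
```

Reporting both ratios side by side makes the tradeoff explicit: a platform can win clearly on throughput while losing on energy per unit of data.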
Xeon VS Atom – DPS

Xeon VS Atom – DPJ

Xeon VS Atom – DPS & DPJ
Application   Metric   500MB   1GB    10GB   25GB   50GB   100GB
Sort          DPS      3.67    4.51   1.89   1.54   1.36   1.40
              DPJ      0.87    1.08   0.45   0.36   0.32   0.33
Wordcount     DPS      2.27    2.38   2.74   2.84   2.82   2.79
              DPJ      0.55    0.58   0.61   0.61   0.62   0.60
Grep          DPS      1.83    1.82   2.30   2.79   2.87   2.89
              DPJ      0.48    0.46   0.54   0.62   0.63   0.64
Naïve Bayes   DPS      3.83    3.89   4.52   4.64   4.54   4.58
              DPJ      0.89    0.87   1.01   0.99   0.97   0.90
SVM (four data sizes only): DPS 3.19, 3.06, 3.17, 3.14; DPJ 0.69, 0.64, 0.66, 0.67
• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.

Xeon VS Atom – Summary
• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
• Atom doesn't show an energy advantage when dealing with complex applications.

Xeon VS Tilera – DPS

Xeon VS Tilera – DPJ

Xeon VS Tilera – DPS & DPJ
Application   Metric   500MB   1GB    10GB   25GB
Sort          DPS      3.67    3.39   2.41   2.60
              DPJ      0.48    0.45   0.31   0.34
Wordcount     DPS      5.19    5.04   7.35   7.78
              DPJ      0.67    0.65   0.87   0.92
Grep          DPS      3.60    3.52   7.45   9.93
              DPJ      0.51    0.48   0.94   1.21
Naïve Bayes   DPS      5.91    5.78   7.59   7.94
              DPJ      0.75    0.70   0.89   0.92
• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-saving than Xeon when dealing with simple computation logic and I/O intensive applications.
• Tilera doesn't show an energy advantage when dealing with complex applications.

Xeon VS Tilera
[Figure: the DPS of Xeon, the DPS of Atom and the DPS of Tilera]

Xeon VS Tilera
• Tilera is more suitable for processing I/O intensive applications.
[Figure: the DPS of Tilera]

Xeon VS Tilera – Summary
• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-saving than Xeon when dealing with simple computation logic and I/O intensive applications.
• Tilera doesn't show an energy advantage when dealing with complex applications.
• Tilera is more suitable for processing I/O intensive applications.

Implications
• The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads.
• Weak processors aren't suitable for complex applications. Even though they have a lower TDP, they don't show an energy cost advantage.

Implications Cont.
• Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.
• Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.
• Tilera shows an energy advantage when processing I/O intensive applications.