Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction

Rakesh Kumar, Keith Farkas, Norman P. Jouppi, Partha Ranganathan, Dean M. Tullsen
University of California, San Diego
MICRO 2003
Speaker: Chun-Chung Chen

Outline
• Introduction
• Motivation
• Architecture
• Experiment Results
• Conclusion

Motivation
• By 2015, processors are projected to consume 300 W
• Existing CMP designs use only homogeneous cores
• Applications with high ILP benefit from wider cores, whereas applications with low ILP can run on narrower cores at lower power with little loss in performance
• There is no need to design cores from scratch, because existing Alpha cores implement practically the same ISA

General Idea
• Single-ISA heterogeneous multi-core architecture
  – A mechanism to reduce power dissipation
• System software dynamically chooses the most power-efficient core, subject to performance constraints

Modeling of CPU Cores
• EV4: Alpha 21064
• EV5: Alpha 21164
• EV6: Alpha 21264
• EV8-: single-threaded version of Alpha 21464
• Assumptions
  – Only one application runs at a time, on only one core
  – Unused cores are completely powered down (and therefore have no leakage)

Cores, cont.
• All cores are assumed to be implemented in 0.10 micron technology
• The four cores have private L1 data and instruction caches and share a common L2 cache, phase-locked loop circuitry, and pins
• All cores run at 2.1 GHz
• ISA differences are handled in one of two ways: either programs are compiled to the least common denominator (the EV4), or software traps are used on the older cores

Modeling of Power

Core Switching
• Switching is done at the operating system level
• An OS switch involves a cache flush plus saving and loading user state for the cores
• A core is estimated to power up in ~1000 cycles at 2.1 GHz
• Switching overhead turns out to be negligible (~1%)

Variation in Power & Performance
• Benchmark: applu
• Relative performance of the cores varies between phases
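
As a rough check on the Core Switching estimate above, the ~1000-cycle power-up cost can be converted into wall-clock time at the stated 2.1 GHz clock. This is a back-of-the-envelope sketch rather than a calculation from the slides; the constant and function names are illustrative.

```python
# Back-of-the-envelope check (not from the slides): convert the
# ~1000-cycle power-up estimate into time at the stated clock rate.
CLOCK_HZ = 2.1e9        # all cores are stated to run at 2.1 GHz
POWER_UP_CYCLES = 1000  # approximate power-up cost quoted in the slides


def cycles_to_seconds(cycles, clock_hz=CLOCK_HZ):
    """Return the duration of `cycles` clock cycles in seconds."""
    return cycles / clock_hz


power_up_ns = cycles_to_seconds(POWER_UP_CYCLES) * 1e9
print(f"power-up time ~ {power_up_ns:.0f} ns")  # prints: power-up time ~ 476 ns
```

At roughly half a microsecond per power-up, the powering-up step itself is small, which is in line with the slides' claim that switching overhead is negligible.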

Switching Algorithms: Oracle-based dynamic switching using energy heuristic
• With oracle knowledge of power requirements and performance potential, choose the core that would have the lowest energy consumption, as long as it performs within 10% of EV8-
• Benchmark: applu

Switching Algorithms: Oracle-based dynamic switching using energy-delay heuristic
• The oracle chooses the core that has the lowest energy-delay product
• Equivalently, choose the core that would maximize IPS^2/Watt, as long as it performs within 50% of EV8-
• Benchmark: applu

Switching Algorithms: Realistic Dynamic Switching
• Every 100 time intervals, one or more cores are sampled for five intervals each
• Neighbor – One of the neighboring cores is chosen at random to be sampled
• Neighbor-global – Similar to neighbor, except that the selection is based on the accumulated energy-delay product
• Random – A core is chosen at random to be sampled
• All – All cores are sampled

Realistic Dynamic Switching Results
• Results are shown normalized to EV8- performance
• Performance degradation of the realistic schemes is less than that of the oracle-based schemes
• Realistic schemes resulted in more core switching

Conclusion
• Realistic dynamic switching algorithms show a decrease in energy and energy-delay with only a small decrease in performance
• Single-ISA heterogeneous multi-core processors using existing technology may be a way to curb power usage