The Laboratory for Computer Architecture at Virginia (LAVA)
Kevin Skadron
University of Virginia, Department of Computer Science

Why We Care About Thermal Management...
Source: Tom’s Hardware Guide, http://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html

Dynamic Thermal Management
- Dynamically adjust execution to control temperature
- Avoid catastrophic failure (heat sink, fan)
- Permit the use of a less expensive thermal package
  - Design for less than the worst case
  - Package costs ~$1 / W above ~40 W
  - Peak power as high as 130 W in 1-2 generations (SIA roadmap)
  - Temperatures over 100°C

Dynamic Thermal Management
- Deal with “hot spots”
  - Localized heating occurs much faster than chip-wide heating
  - Chip-wide treatment is too conservative
- Prove temperature will be safely bounded

Thermal Modeling
- Want a fine-grained model of temperature
- Power dissipation: too indirect, not easy to measure in HW

“Ohm’s Law” for Temperature
- V: temperature
- I: power
- R: thermal resistance
- C: thermal capacitance
- RC: time constant
- Stepwise update:
  \[ \Delta V = \frac{I \cdot \Delta t}{C} - \frac{V \cdot \Delta t}{RC} \]
- Lets us compute stepwise changes in temperature for any granularity at which we can get P, T, R, C
- Steady state: V = IR (T = PR)

Thermal Modeling
- Use thermal resistance and capacitance of Si
- Develop computationally efficient model based on lumped values:
  \[ \Delta T_i = \frac{P_i \cdot \Delta t}{C_i} - \frac{T_i \cdot \Delta t}{R_i C_i} \]
- Integrate in Wattch (power/performance simulator)
- Time evolution of temperature is driven by unit activities and power dissipations on a per-cycle basis
- Detect hot spots and activate thermal response
- Typical time constant: 10-100 µs

Fetch Toggling
- Fetch toggling: disable fetch every N cycles
- 4/5, 2/3, 1/2, 1/3, 1/5, …
- [Figure: pipeline diagram with stages IF, ID, EX, MEM, WB]
- How to set the fetch rate?
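The stepwise update on the “Ohm’s Law” and Thermal Modeling slides can be sketched in a few lines of Python. This is a minimal illustration, not the Wattch integration: it steps a single lumped block with a forward-Euler update written so that the temperature settles at the stated steady state T = PR. The names `step_temperature` and `simulate`, and all numeric values (R, C, power, time step), are assumptions chosen only to make the example run; temperature is taken as the rise above a fixed reference.

```python
def step_temperature(temp, power, r_th, c_th, dt):
    """One forward-Euler step of the lumped thermal RC model.

    temp  : temperature rise above the reference (K)
    power : power dissipated in the block (W)
    r_th  : lumped thermal resistance (K/W)
    c_th  : lumped thermal capacitance (J/K)
    dt    : time step (s)
    """
    delta = (power / c_th - temp / (r_th * c_th)) * dt
    return temp + delta


def simulate(power, r_th, c_th, dt, steps, temp0=0.0):
    """Return the temperature trace for a constant power input."""
    temps = [temp0]
    for _ in range(steps):
        temps.append(step_temperature(temps[-1], power, r_th, c_th, dt))
    return temps


# Illustrative values only (assumptions, not taken from the slides).
R_TH = 2.0      # K/W
C_TH = 25e-6    # J/K, so R*C = 50 microseconds
DT = 1e-6       # 1 microsecond step, well below R*C
POWER = 5.0     # W

trace = simulate(POWER, R_TH, C_TH, DT, steps=1000)
print(f"rise after 1 RC:   {trace[50]:.3f} K")
print(f"rise after 20 RC:  {trace[-1]:.3f} K")
print(f"steady state P*R:  {POWER * R_TH:.3f} K")
```

The time step must stay small relative to RC for the Euler update to remain stable; after many time constants the trace approaches P·R, which matches the steady-state relation on the slide.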

Feedback-Control of Fetch Toggling
- Formal feedback control
- [Figure: feedback loop. The error e between the setpoint and the measured T feeds the controller; the controller output m drives the actuator (I-fetch toggling); the resulting power P drives the thermal dynamics, which produce temperature T; a temp. sensor feeds the measured T back]
- PID:
  \[ m = K_C \left( e + K_I \int e \, dt + K_d \frac{de}{dt} \right) \]
  - easy to compute
- toggling = f(m)

Other Thermal-Management Techniques
- Fetch toggling
- Fetch throttling
- Decode throttling
- Speculation control
- Frequency/voltage scaling

Per-Structure Response
- Hot spots
  - Branch predictor (probed every cycle)
  - Load-store queue
  - L1 D-cache (for high-BW apps)
  - …most major structures are a hot spot for at least one SPEC2k app
- Modified Wattch
  - Sampling rate: 1000 cycles (RC of hot spots is 10-100 µs)
  - Base temp. of 100°C (SIA roadmap)
  - Emergency threshold of 108°C (Yuan/Hong SEMI-THERM ‘01)
  - Set point of 107.9°C

Thermal Modeling: Where to go from here?
(i.e., lots of research questions)
- Floor-planning issues and granularity of lumped R/C values
- Thermal coupling among blocks
- Response lag in temperature sensors
- Validation techniques
- Visualization
- How to deal with large time scales?

Thermal Management: Where to go from here?
(i.e., lots more research questions)
- New mechanisms
- Characterize benchmarks
- When to use frequency/voltage scaling
- Faster HW techniques for sensing temperature changes
- Robust response despite sensor lag
- Hot spots
- Temperature effects on leakage current
- Joint control of temp., power, and performance

Thermal Management: Where to go from here?
(i.e., lots more research questions)
- New mechanisms
- When to use clock scaling
- Robust response despite sensor lag
- Temperature effects on leakage current
- Joint control of temperature, power, and performance

Summary
- New tools for thermal management
  - Models
  - Mechanisms
Source: Tom’s Hardware Guide, http://www6.tomshardware.com/cpu/01q3/010917/heatvideo-01.html

Backup slides

Performance Loss
[Figure: bar chart of percent loss in performance (axis from 0% to 30%) for toggle1 vs. PID on bzip, vortex, perlbmk, eon, parser, fma3d, facerec, crafty, equake, art, mesa, gcc, and MEAN. Annotation: performance loss reduced by 65%.]
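
The PID law from the feedback-control slide, with its proportional, integral, and derivative terms on the error e, can be sketched as follows. This is a minimal discrete-time illustration under stated assumptions: the class `PIDController`, the function `fetch_duty`, the gain values, and the particular mapping from controller output m to a fetch duty cycle are all hypothetical; the slides only state that toggling = f(m). The setpoint of 107.9 and the reading of 108 reuse the set point and emergency threshold quoted in the deck.

```python
class PIDController:
    """Discrete PID: m = Kc * (e + Ki * integral(e) + Kd * de/dt)."""

    def __init__(self, kc, ki, kd, setpoint):
        self.kc = kc
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self.integral = 0.0      # running sum approximating the integral of e
        self.prev_error = None   # no derivative is available on the first sample

    def update(self, measured, dt):
        """Return the controller output m for one temperature sample."""
        error = self.setpoint - measured
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kc * (error + self.ki * self.integral + self.kd * derivative)


def fetch_duty(m):
    """Hypothetical f(m): fraction of cycles in which fetch is enabled.

    Assumption: full-rate fetch when m >= 0 (at or below the setpoint),
    proportionally reduced fetch when m < 0, clamped to [0, 1].
    """
    return min(1.0, max(0.0, 1.0 + m))


# Proportional-only example with an assumed gain of 2.0.
controller = PIDController(kc=2.0, ki=0.0, kd=0.0, setpoint=107.9)
m = controller.update(measured=108.0, dt=1.0)
duty = fetch_duty(m)
print(f"m = {m:.3f}, fetch duty = {duty:.3f}")
```

A practical controller would also need to bound the integral term so that a long period below the setpoint does not delay the response once the temperature rises; that refinement is left out here to keep the sketch close to the formula.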