Universität Dortmund
P. Marwedel, Univ. Dortmund, Informatik 12, 2005/6

Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop").
[Figure: hardware-in-a-loop diagram; surviving label: actuators]

Summary
Processing units
  Power efficiency of target technologies
  ASICs
  Processors
    • Energy efficiency
    • Code size efficiency and code compaction
    • Run-time efficiency
    • DSP processors
    • Multimedia processors
    • Very long instruction word (VLIW) machines
    • Micro-controllers
  Reconfigurable hardware
Memory

Memory
For the memory, efficiency is again a concern:
• speed (latency and throughput); predictable timing
• energy efficiency
• size
• cost
• other attributes (volatile vs. persistent, etc.)

Access times and energy consumption increase with the size of the memory
Example (CACTI model): [Figure not recoverable]
"Currently, the size of some applications is doubling every 10 months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 2003]

Access times and energy consumption for multi-ported register files
[Figure: area (λ² × 10⁶), cycle time (ns), and power (W) as functions of register file size (16, 32, 64, 128) for GP6M2 and GP6M3]
Model by Rixner et al. [HPCA'00], 0.18 µm technology. Source and © H. Valero, 2001

Reducing energy consumption with sub-banking
Shorter wires, lower capacitances, faster access.
[Figure: energy (µJ) as a function of memory size (bytes) for 1, 2, 4, 8, and 16 sub-banks]
Source: http://www.ecse.rpi.edu/frisc/theses/MaierThesis/FIG319.GIF

How much of the energy consumption of a system is memory-related?

Component            Thermal design power (TDP)   Average system power
600/500 MHz µP       37%                          13%
LCD 10"              19%                          30%
HDD                  9%                           19%
Memory + graphics    12%                          15%
Power supply         10%                          10%
Other                13%                          13%

Note: based on actual measurements. The CPU dominates thermal design power; multiple platform components comprise average power.
[Courtesy: N. Dutt; Source: V. Tiwari]

"CPU" power dissipation
StrongARM [IEEE Journal of SSC, Nov. 96]: Icache 26%, Ibox 18%, Dcache 16%, Clock 10%, IMMU 9%, EBOX 8%, DMMU 8%, Others 5%
PowerPC [Proceedings of ISSCC 94]: Cache 23%, Clock 19%, TLB 17%, Control L. 16%, Data Flow 11%, I/O 7%, PLA 5%, ROM 2%
42%/40% is again memory-related!
Based on slide by and ©: Osman S. Unsal, Israel Koren, C. Mani Krishna, Csaba Andras Moritz, University of Massachusetts, Amherst, 2001

Access times will be a problem
The speed gap between processor and main DRAM increases.
• early 1960s (Atlas): page fault ~ 2500 instructions
• 2002 (2 GHz µP): access to DRAM ~ 500 instructions
The penalty for a cache miss is about the same as for a page fault in Atlas.
[Figure: speed gap over the years; the gap doubles every 2 years]
[P. Machanik: Approaches to Addressing the Memory Wall, TR Nov. 2002, U. Brisbane]

Predictability is a problem (1)
Embedded systems are often real-time systems: they have to guarantee that timing constraints are met.
Predictability: "For satisfying timing constraints in hard real-time systems, predictability is the most important concern; pre-run-time scheduling is often the only practical means of providing predictability in a complex system" [Xu, Parnas]
Time-triggered, statically scheduled operating systems

Predictability is a problem (2)
Currently available caches don't solve the problem:
• They improve the average-case behavior
• They use "non-deterministic" cache replacement algorithms
Scratch-pad/tightly coupled memory based predictability

Hierarchical memories using scratch pad memories (SPM)
[Figure: memory hierarchy (processor, SPM, main memory) and address space from 0 to FFF.. containing the scratch pad memory; the SPM needs no tag memory]
Example: ARM7TDMI cores, well-known for low power consumption

Comparison of currents using measurements
E.g.: ATMEL board with ARM7TDMI and ext. SRAM
[Figure: current (mA) for a 32-bit load instruction (Thumb), with one bar for core + SPM current and one for main memory current in each of four configurations: Prog Main/Data Main, Prog Main/Data SPM, Prog SPM/Data Main, Prog SPM/Data SPM]
Bar values as extracted (mA); their assignment to the bars is not recoverable: 116; 77.2; 82.2; 1.16; 53.1; 48.2; 50.9; 44.4

Comparison of energy consumption
Example: Atmel ARM evaluation board
Main memory accesses take more cycles, so the energy savings (86%) are larger than the savings in current. Energy is reduced by a factor of 7.06.
[Figure: energy for a 32-bit load instruction (Thumb); axis label as extracted: "Energy 10 nJ"]

Configuration          Energy
Prog Main/Data Main    115.8
Prog Main/Data SPM     76.5
Prog SPM/Data Main     51.6
Prog SPM/Data SPM      16.4

Why not just use a cache?
1. Predictability? The worst case execution time (WCET) may be large.
[P. Marwedel et al., ASPDAC, 2004]
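The two headline numbers on the energy-comparison slide (a reduction by a factor of 7.06 and savings of 86%) can be cross-checked from the bar values. The following is a minimal sketch; the dictionary keys are labels chosen here, and the pairing of the two middle values with their configurations follows the extraction order and is an assumption. Only the first and last values enter the calculation.

```python
# Energy per 32-bit load instruction (Thumb), in the slide's own units.
# The pairing of the two middle values with their configurations follows
# the extraction order and is an assumption.
ENERGY = {
    "prog main / data main": 115.8,
    "prog main / data SPM": 76.5,
    "prog SPM / data main": 51.6,
    "prog SPM / data SPM": 16.4,
}


def reduction_factor(baseline, improved):
    """How many times less energy the improved configuration needs."""
    return baseline / improved


def savings_percent(baseline, improved):
    """Energy saved relative to the baseline, in percent."""
    return 100.0 * (1.0 - improved / baseline)


baseline = ENERGY["prog main / data main"]
best = ENERGY["prog SPM / data SPM"]
factor = reduction_factor(baseline, best)
savings = savings_percent(baseline, best)
print(f"reduction factor: {factor:.2f}")
print(f"savings: {savings:.0f}%")
```

Both results agree with the slide: placing program and data in the scratch pad instead of main memory accounts for the factor of 7.06, which is the same statement as the 86% savings.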

Why not just use a cache? (2)
2. Energy for parallel access of sets, in comparators, muxes.
[Figure: energy per access (nJ) as a function of memory size (256 to 16384) for a scratch pad and for 2-way caches with 4 GB, 16 MB, and 1 MB address spaces]
[R. Banakar, S. Steinke, B.-S. Lee, 2001]

Influence of the associativity
Parameters different from previous slides
[P. Marwedel et al., ASPDAC, 2004]

Flash Memory
Based on EPROM/EEPROM memory
EPROM storage cell
[Figure: storage cells in row (j), one with the floating gate (FG) charged and one with the FG not charged, connected to ground and to a read amplifier]
The floating gate (FG) can be charged by applying a high voltage. A transistor with a charged FG is non-conducting when its row is selected. The FG is discharged with UV light.

EEPROM storage cell
EEPROM = electrically erasable programmable read-only memory.
An EEPROM needs an additional transistor per bit.
[Figure: storage cells in row (j), one with the FG charged and one with the FG not charged, connected to ground]
Writing and erasing require only electrical signals.

NOR- and NAND-Flash
[Figure: NOR and NAND cell structures with contacts]
NOR: one transistor between bit line and ground
NAND: several transistors between bit line and ground
[www.samsung.com/Products/Semiconductor/Flash/FlashNews/FlashStructure.htm]

Properties of NOR- and NAND-Flash memories

Type/Property      NOR                                     NAND
Random access      Yes                                     No
Erase block        Slow                                    Fast
Size of cell       Larger                                  Smaller
Reliability        Higher                                  Lower
Execute in place   Yes                                     No
Applications       Code storage, boot flash, set top box   Data storage, USB sticks, memory cards

[www.samsung.com/Products/Semiconductor/Flash/FlashNews/FlashStructure.htm]

Embedded System Hardware
Embedded system hardware is frequently used in a loop ("hardware in a loop").
[Figure: hardware-in-a-loop diagram; surviving label: actuators]
For impressive display technology see http://www.dateconference.com/conference/2003/keynotes/index.htm

Digital-to-Analog (D/A) Converters
Various types, can be quite simple, e.g.:
[Figure: D/A converter circuit]

Output voltage is proportional to the number represented by x
Due to Kirchhoff's laws:
\[ I = x_3 \frac{V_{ref}}{R} + x_2 \frac{V_{ref}}{2R} + x_1 \frac{V_{ref}}{4R} + x_0 \frac{V_{ref}}{8R} = \frac{V_{ref}}{R} \sum_{i=0}^{3} x_i \, 2^{i-3} \]
Due to Kirchhoff's laws:
\[ V + R_1 I' = 0 \]
Current into the op-amp = 0:
\[ I = I' \]
Hence:
\[ V + R_1 I = 0 \]
Finally:
\[ V = -V_{ref} \frac{R_1}{R} \sum_{i=0}^{3} x_i \, 2^{i-3} = -V_{ref} \frac{R_1}{8R} \, nat(x) \]

Possible to reconstruct analog value from digitized value?
Let fs be the sampling frequency.
Input signals with frequency components > fs/2 cannot be distinguished from signals with frequency components < fs/2.
Example: signal: 5.6 Hz; sampling: 9 Hz
[Figure: the 5.6 Hz signal and its samples taken at 9 Hz, amplitude range -1.5 to 1.5]

Frequency spectrum of sampled signal
Let Xc be the frequency spectrum of the continuous signal, with cutoff frequency ΩN = 2π fN.
Let Ωs = 2π fs be the sampling frequency.
Let Xs be the frequency spectrum of the sampled signal.
Xs consists of multiple copies of Xc, separated by Ωs. These copies cannot be distinguished in the sampled signal.
Formally: Xs is the convolution of Xc with S, where S is the frequency spectrum of the clock.

Aliasing
If Ωs < 2 ΩN, the copies of the spectrum will overlap (we no longer know the original frequencies).
There is no problem for signal reconstruction if this is avoided.
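The 5.6 Hz / 9 Hz example can be checked numerically. The sketch below is illustrative only: it assumes a cosine-shaped input (the slide does not state the phase) and uses function names chosen here. It computes the frequency below fs/2 onto which the 5.6 Hz tone folds and confirms that both tones produce the same samples.

```python
import math

F_SIGNAL = 5.6  # Hz, from the slide
F_SAMPLE = 9.0  # Hz, from the slide


def alias_frequency(f, fs):
    """Frequency in [0, fs/2] that a tone at f appears as after sampling at fs."""
    f_mod = f % fs
    return min(f_mod, fs - f_mod)


def sample_cosine(f, fs, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n/fs (cosine phase is an assumption)."""
    return [math.cos(2.0 * math.pi * f * n / fs) for n in range(n_samples)]


f_alias = alias_frequency(F_SIGNAL, F_SAMPLE)
original = sample_cosine(F_SIGNAL, F_SAMPLE, 18)
aliased = sample_cosine(f_alias, F_SAMPLE, 18)
max_diff = max(abs(a - b) for a, b in zip(original, aliased))

print(f"alias frequency: {f_alias:.1f} Hz")
print(f"largest sample difference: {max_diff:.2e}")
```

Since 5.6 Hz lies above fs/2 = 4.5 Hz, the tone folds down to 9 - 5.6 = 3.4 Hz, and the sample values of the two tones differ only by floating-point rounding. For a sine-shaped input the alias would additionally appear with inverted sign.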

Nyquist theorem
• The analog input to a sample-and-hold circuit can be precisely reconstructed from its output, provided that sampling proceeds at a rate of at least double the highest frequency found in the input voltage. [Nyquist 1928, Shannon, 1949]
[Figure: signal chain S/H, A/D-converter, D/A-converter, interpolation; is the output equal to the input?]
The theorem does not capture the effect of value quantization: quantization noise prevents precise reconstruction.

Actuators and output
There is a huge variety of actuators and output devices; it is impossible to present all of them.
Microsystem motors as examples (© MCNC):
[Figure: microsystem motors]

Actuators and output (2)
[Figure]
Courtesy and ©: E. Obermeier, MAT, TU Berlin

Stepper Motor
A stepper motor rotates a fixed number of degrees when given a "step" signal. In contrast, a DC motor just rotates when power is applied.
Rotation is achieved by applying a specific voltage sequence to the coils.
A controller greatly simplifies this.
http://www.cise.ufl.edu/~prabhat/Teaching/cis6930-f04/comp4.pdf

Summary
We have covered:
  sensors (briefly)
  sample-and-hold and A/D converter circuits
  communication
  information processing hardware
    • ASICs (briefly)
    • processors
    • FPGAs
    • memory
  D/A-converters
  actuators (briefly)
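As an illustration of the stepper motor slide's remark that rotation is achieved by applying a specific voltage sequence to the coils, the following sketch generates a common two-phase-on full-step sequence. The motor type (4-coil unipolar), the coil order, the step angle, and all names are assumptions made for this example; the slides do not specify a particular motor or controller.

```python
# Two-phase-on full-step sequence for a hypothetical 4-coil unipolar stepper.
# Coil order (A, B, C, D) and STEP_ANGLE_DEG are illustrative assumptions.
FULL_STEP_SEQUENCE = [
    (1, 1, 0, 0),
    (0, 1, 1, 0),
    (0, 0, 1, 1),
    (1, 0, 0, 1),
]
STEP_ANGLE_DEG = 1.8


def step_patterns(start_index, steps):
    """Return the coil patterns to apply, in order, and the final index.

    Positive steps walk the sequence forward; negative steps walk it
    backward, which reverses the direction of rotation.
    """
    direction = 1 if steps >= 0 else -1
    index = start_index
    patterns = []
    for _ in range(abs(steps)):
        index = (index + direction) % len(FULL_STEP_SEQUENCE)
        patterns.append(FULL_STEP_SEQUENCE[index])
    return patterns, index


forward, position = step_patterns(0, 3)
for pattern in forward:
    print(pattern)
print(f"rotated {len(forward) * STEP_ANGLE_DEG:.1f} degrees")
```

A dedicated controller hides exactly this bookkeeping: the software issues a step and a direction, and the controller keeps track of the sequence position and drives the coils.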