Computer Engineering of Wave Machines for Seismic Modeling and Seismic Migration R. Phillip Bording February 17, 2004 Husky Energy Chair in Oil and Gas Research Memorial University of Newfoundland 0 Max Address M U N - February 17, 2005 - Phil Bording 1 Session 1 History of Design Tyco Brahe Napier Charles Babbage – mechanical design John Atanasoff – Storage – spinning capacitor - Konrad Zuse - Floating Point Mauchley and Ekert von-Neumann Harvard memory – code memory - data Princeton memory code and data Session 2 Current Design Issues Scaling laws Moore’s Law Transistors – VLSI Memory – Technology Division of Design The memory Challenge The processor Challenge The ILLIAC – PEPE IBM 7094 IBM 360/44 IBM 360/95 Array Processors the software of array processor calls Programming Models vectors shared memory distributed memory Lamda Rules M U N - February 17, 2005 - Phil Bording 4 Division of design Company A Memory Memory Weak Link ALU ALU One Company Company B M U N - February 17, 2005 - Phil Bording 5 Moore’s Laws Every 18 months the density of transistors on a VLSI chip doubles The investments of $ doubles with every new VLSI plant M U N - February 17, 2005 - Phil Bording 6 Illiac 8 X 8 Processors Nearest Neighbor Connections M U N - February 17, 2005 - Phil Bording 7 Parallel Ensemble Processing Elements - PEPE Radar Processing Computer Associative Computing Data Outputs P0 .... Pn-3 Pn-2 Pn-1 Pn Data Inputs M U N - February 17, 2005 - Phil Bording 8 IBM Machines Early 1960’s 7094, 36 bit arithmetic 1600 and 1400 processors completely different Middle 1960’s New Machine – IBM 360 36 bit words, but memory parity was added 8 bit byte + 1 bit parity Uniform business machine architectures 32 and 64 bit floating point Not any industry standard for format of floating point M U N - February 17, 2005 - Phil Bording 9 Array Processors IBM and CDC designed DMA processors – Direct Memory Access Frees the main processor to compute Allows separate simple processors to do the i/o The idea translated into attached processors for arithmetic processing M U N - February 17, 2005 - Phil Bording 10 Array Processors Arrays of data are moved to a local very high speed memory – fast registers Arithmetic is performed by special instructions passed to array processor CPU Array Processor M U N - February 17, 2005 - Phil Bording 11 Software Design Issues Vector Programming Cache Programming Message Passing Programming NUMA Programming Grid Programming ALL of these memory operations have a Fixed Cost Code Performance Improvements M U N - February 17, 2005 - Phil Bording are dominated by fixed costs 12 Hardware Design Issues 10 Years equals 100 Fold Speedup Memory Latency – cost of getting the first word is a constant Wires have failed to scale Bigger cache memories are slower Code Performance Improvements are dominated by fixed costs M U N - February 17, 2005 - Phil Bording 13 Linear Address Space 0 Max Address Address Pointer Latency is the time to access the first word Bandwidth is the rate of accessing successive words M U N - February 17, 2005 - Phil Bording 14 von Neumann Architecture Princeton Address Pointer Arithmetic Logic Unit (ALU) Data/Instructions Memory Pc = Pc + 1 Program Counter Featuring Deterministic Execution M U N - February 17, 2005 - Phil Bording 15 Cache Memory Architecture Main Memory is large and slow. Cache is much smaller and much faster. Control logic control keeps the main memory coherent. C A C H E C O N T R O L Memory Cache Memory Address Pointer Featuring Non-Deterministic Execution M U N - February 17, 2005 - Phil Bording 16 Cache Memory - Three Levels Architecture Memory MultiGigabytes 2 Gigahertz Clock Large and Slow 160 X Cache Control Logic 2X 8X L1 Cache Memory L2 Cache Memory 32 Kilobytes 16X L3 Cache Memory 128 Kilobytes 16 Megabytes Featuring Really Non-Deterministic M U N - February 17, 2005 - Phil Execution Bording Address Pointer 17 Programming Models for Parallel Computing M U N - February 17, 2005 - Phil Bording 18 Distributed Computing Message Passing Interface Program Address Spaces 0 Max 0 Max 0 Max 0 Max Multiple Address Pointers M U N - February 17, 2005 - Phil Bording 19 Distributed Computing with Message Passing Program Address Spaces Messages Left and Right Multiple Address Pointers M U N - February 17, 2005 - Phil Bording 20 M U N - February 17, 2005 - Phil Bording 21 Multi-Threading OpenMP Programming Model Global Program Address Space Local 0 Local Local Local n-1 n 2n-1 2n 3n-1 3n 4n-1 Multiple Address Pointers M U N - February 17, 2005 - Phil Bording Address and Cache Bus with Conflict Resolution 22 Uniqueness of Store Multi-Threading Program Address Space 0 Duplicate Pointers Multiple Address Pointers to the same Location – Conflict on storing a result So whoMisU Nmanaging the multiple pointers? - February 17, 2005 - Phil Bording responsibility. It is the programmers 23 Multiple Bank Memory Systems Memory Banks Bank 0 Starting Address Mod 4 1 +1 +N 2 3 +2 +2N +3 +3N Vector Programming Model M U N - February 17, 2005 - Phil Bording 24
© Copyright 2025 Paperzz