ECE 8813a: Design & Analysis of Multiprocessor Interconnection Networks
Sudhakar Yalamanchili
School of Electrical and Computer Engineering
© Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated)

Course Content: Goals
• Coverage of basic concepts for high performance multiprocessor and many core interconnection networks
    o Primarily link & data layer communication protocols
    o Router architectures
• Understand established and emerging microarchitecture concepts and implementations
• Formal Analysis: deadlock & livelock
• Optimization
    o Topology, power, latency, bandwidth, wiring, pin-out

Course Outline
• Case Studies
• Optimization: technology, congestion, reliability
• Operation through multiple switches: Topologies, Routing, and Optimization
    o Direct, indirect, regular, irregular
    o Formal models and analysis for deadlock and livelock freedom
• Operation through a single switch: Router micro-architectures
    o Buffering, arbitration, scheduling, datapath
• Operation of a single link: switching and flow control

Course Administration
• Instructor: Professor Sudhakar Yalamanchili
    o Class webpage for contact information: www.ece.gatech.edu/users/sudha/academic/class/Networks/Spring2012
    o Class material drawn from
        o “Interconnection Networks: An Engineering Approach,” J. Duato, S. Yalamanchili and L. Ni, Morgan Kaufmann (pubs.), 2003
        o “Principles and Practices of Interconnection Networks,” W. J. Dally and B. Towles, Morgan Kaufmann (pubs.)
        o Journal and conference publications
    o Publicly available simulation infrastructure
• Note: This is a 2-3-3 class!

Course Administration (cont.)
• Midterm – 20% (February 22nd)
• Assignments – 40%
    o Paper review
    o Simulation exercise
• Research Project/Final Exam – 40%
    o Paper in conference format
    o Presentation TBD
• The last few weeks of the course will cover recent journal and conference papers
    o Depending on timing/class size, one assignment will be a paper presentation

Project Deliverable Structure
• Project Proposal: March 12th
• Revised Proposal: March 28th
• Interim Report: April 11th
• Project Final Delivery: April 27th
• Project Examination: May 2nd (3rd period)
• Formats for each deliverable will be provided in advance

Planned Assignment Schedule
• Anticipate 4-5 assignments
• Programming assignments every two weeks starting January 16th

Course Outline
1. Flow Control
2. Switching Techniques
3. Topologies
4. Deadlock and Livelock Freedom
5. Router Architectures
6. Routing Algorithms
7. Network Optimization
8. Systems Impact of Networks
9. Case Studies

Technology Trends…
[Figure: bandwidth per router node (Gb/s, log scale from 0.1 to 10000) versus year (1985 to 2010). Systems plotted: Torus Routing Chip, Intel iPSC/2, J-Machine, CM-5, Intel Paragon XP, Cray T3D, MIT Alewife, IBM Vulcan, Cray T3E, SGI Origin 2000, AlphaServer GS320, IBM SP Switch2, Quadrics QsNet, Cray X1, Velio 3003, IBM HPS, SGI Altix 3000, Cray XT3, YARC, BlackWidow]
Source: W. J. Dally, “Enabling Technology for On-Chip Networks,” NOCS-1, May 2007

Where is the Demand?
[Figure labels: Area, power; Throughput; Performance cost; Cables, connectors, transceivers; latency, power]

Blurring the Boundary
• Use heterogeneous multi-core chips for embedded devices
    o IBM Cell gaming
    o Intel IXP network processors
    o AMD Fusion
• Use large numbers of multicore processors to build supercomputers
    o NVIDIA Fermi
        o Tsubame 2.0, Keeneland, Titan, Blue Waters
    o IBM Blue Gene/P
• Interconnection networks are central all across the spectrum!

Intel Sandy Bridge
• Cache coherent shared memory
• Ring interconnect
From geeks3D.com

NVIDIA Fermi GF 100
• 4 Global Processing Clusters (GPCs) containing 4 SMs each
• Each SM has 32 ALUs, 4 SFUs, and 16 LS units
• Each ALU has access to 1024 32-bit registers (total of 128 kB per SM)
• Each SM has its own Shared Memory/L1 cache (64 kB total)
• Unified L2 cache (768 kB)
• Six 64-bit Memory Controllers (total 384 bits wide)
[Figure labels: ALU; Streaming multiprocessor (SM)]

Intel TeraOp Die
• 2D Mesh
• Really a test chip
• Aggressive speed – multi-GHz links
From rj3sp.blogspot.com

On-Chip Networks
• Why are they different?
    o Abundant bandwidth
    o Power
    o Wire length distribution
• Different functions
    o Operand networks
    o Cache memory
    o Message passing

Topologies
[Figure panels: Binary Hypercube; Tori; Multistage Interconnection; Fat Tree. Node labels shown: 0000, 0001, 1110, 1111]

Router Microarchitecture: Example
High Radix Router Architecture (Cray Inc.)
S. Scott, D. Abts, J. Kim, W. J. Dally, “The BlackWidow High-Radix Clos Network,” Proceedings of ICS 2006

Cray XT3
• 3D Torus interconnect
• HyperTransport + Proprietary link/switch technology
• 45.6 GB/s switching capacity per switch
From craysupercomputers.com

Blue Gene/L
From nersc.gov
From pisces.edu

Blue Gene/L Networks
• 3D torus
• Dual PPC node processor ASIC
• Multiple networks to satisfy distinct communication requirements
From http://www.research.ibm.com/journal/rd/492/gara.html

Some Industry Standards
• On-Chip
    o Open Core Protocol
        o Really an interface standard
    o AXI
    o Opportunities for secret sauce/customization
• PCI Express
• AMD HyperTransport and Intel QuickPath
    o I/F moved on-chip
• InfiniBand and 10G Ethernet

Let's Get Started…
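As a quick sanity check on the NVIDIA Fermi GF 100 slide, the totals it quotes (128 kB of registers per SM, a 384-bit memory interface) can be rederived from the per-unit figures. The sketch below is illustrative only: the constant and function names are hypothetical, the inputs are taken from the slide, and "kB" is assumed to mean 1024 bytes.

```python
# Per-unit figures as quoted on the Fermi GF 100 slide.
# All names here are hypothetical, chosen for illustration.
GPCS = 4                    # Global Processing Clusters
SMS_PER_GPC = 4             # streaming multiprocessors per GPC
ALUS_PER_SM = 32
REGISTERS_PER_ALU = 1024    # 32-bit registers reachable by each ALU
REGISTER_BITS = 32
MEMORY_CONTROLLERS = 6
CONTROLLER_WIDTH_BITS = 64


def fermi_totals():
    """Derive chip-level totals from the per-unit figures above."""
    total_sms = GPCS * SMS_PER_GPC
    total_alus = total_sms * ALUS_PER_SM
    # Register file per SM in bytes: ALUs x registers x bytes per register.
    regfile_bytes = ALUS_PER_SM * REGISTERS_PER_ALU * (REGISTER_BITS // 8)
    return {
        "sms": total_sms,
        "alus": total_alus,
        # Assumes 1 kB = 1024 bytes, which matches the slide's 128 kB figure.
        "register_file_kb_per_sm": regfile_bytes // 1024,
        "memory_bus_bits": MEMORY_CONTROLLERS * CONTROLLER_WIDTH_BITS,
    }


if __name__ == "__main__":
    for name, value in fermi_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the derived register file size and memory bus width agree with the slide; the SM and ALU totals are not stated on the slide and simply follow from multiplying its per-unit counts.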