1 MODELING AND EVALUATION OF CHIP-TO-CHIP SCALE SILICON PHOTONIC NETWORKS
Robert Hendry, Dessislava Nikolova, Sébastien Rumley, Keren Bergman
Columbia University
HOTI 2014

2 Chip-to-chip optical networks
• Projected chip I/O bandwidth: tens of Tb/s
• Chip I/O bandwidth limited by pin count, data rate
• Promising solution: silicon photonics
  • Dense bandwidth via WDM
  • High data rates
  • Energy-distance independence in fiber

3 Outline
• Silicon photonic chip-to-chip networks
• Characterizing loss and WDM capacity
• Modeling power
• Determining network performance
• Conclusions

4 Microring-based silicon photonic links
• Microrings
  • Modulation
  • Switching
  • Filtering
• Other optical devices:
  • Lasers, couplers, integrated photodetectors
[Figures: microring modulator (Cornell); demultiplexing filter (Kotura/Oracle); photodetectors]

5 Chip-to-chip optical networks
• “Chip-to-chip”: low radix, high bandwidth
• Chose two architectures to represent extremes of design space
[Figures: full mesh architecture; switched architecture]

6 Full mesh
• One link at each source for each destination
[Figure: PNI 0 Tx with links to PNI 1, PNI 2, and PNI 3; PNI 1 Rx with links from PNI 0, PNI 2, and PNI 3]

7 Switched architecture
• One input and output link per PNI
[Figure: PNI 0 Tx, optical switch fabric, PNI 1 Rx; 2x2 switch shown in “through” state and “drop” state]

8 Comparing topologies
• Laser power is the largest contributor to overall power in the network
  • 4.5%, X. Zheng, et al. “Efficient WDM laser sources towards terabytes/s silicon photonic interconnects.” Journal of Lightwave Technology, vol. 31, no. 15, 2013.
• Assume lasers are always on
  • Laser stabilization time on the order of microseconds
  • Context: small packets, short inter-arrivals
• Energy efficiency closely related to utilization of laser sources
• Full mesh expectation:
  • No contention, lower queuing latency
  • More lasers, higher power, poor efficiency when load is low
• Switched architecture expectation:
  • Contention, higher queuing latency
  • Resource sharing improves utilization and therefore efficiency

9 Shared input/output waveguides
• Another way to share laser sources
• Sacrificing performance for better utilization
[Figure: full mesh vs. full mesh with 2-way sharing; PNI Tx and Rx blocks on shared waveguides]

10 Shared input/output waveguides
• Another way to share laser sources
[Figure: Benes vs. Benes with 2-way sharing; PNI Tx blocks, 2x2 switches, PNI Rx blocks]

11 Design space
• Topology
  • Benes
  • Full mesh
• Sharing
  • No sharing, or two-way sharing
• Network radix
  • 4, 8, or 16
• Goal: to find optimal topologies for given bisectional bandwidth requirements
• Ex: “Benes-4T-2S”, “FM-16T-1S”

13 Determining worst-case loss
• Combine losses of all devices along worst-case path of light
  • Waveguide: 0.92 dB/cm
  • Filter: 0.6 – 2.8 dB
  • Coupler: 1 dB
  • Switch through/drop: 0.07 – 3.3 dB
  • Modulators: 5.4 – 7.85 dB
[Figure: worst-case path from PNI 0 Tx through the optical switch fabric to PNI 1 Rx]

14 Complexity vs. link capacity
• 10 Gb/s per wavelength channel, OOK modulation
• Intermodulation crosstalk limits WDM capacity to 125 wavelengths
  • Assuming 50 nm spectrum
  • K. Padmaraju, et al. “Intermodulation Crosstalk Characteristics of WDM Silicon Microring Modulators.” IEEE Photonics Technology Letters, vol. 26, no. 14, 2014.
[Figure: optical power budget bounded by 100 mW (20 dBm, non-linear effects) and 6.3 µW (-22 dBm, below receiver sensitivity); labels: channel power, total loss, required input power]

15 Peak Bisectional Bandwidth
• Loss → maximum wavelengths per link
• Maximum wavelengths × 10 Gb/s × number of links in bisection → peak bisectional bandwidth
• T = number of PNIs, S = number of Tx/Rx per link
[Figure: peak bisectional bandwidth (scale 1 Tb/s, 10 Tb/s, 100 Tb/s) for Benes-4T-1S, Benes-8T-1S, Benes-16T-1S, Benes-4T-2S, Benes-8T-2S, Benes-16T-2S, FM-4T-1S, FM-8T-1S, FM-16T-1S, FM-4T-2S, FM-8T-2S, FM-16T-2S. Annotations: “More devices (switches), more loss, less bandwidth”; “Simpler links, less loss, more bandwidth”]

16 Peak Bisectional Bandwidth
[Figure: same chart as slide 15. Annotations: “More links, but also higher radix switch, so bandwidth grows slowly, or not at all”; “More links, more bisectional bandwidth”]

18 Power modeling
• Microring tuning, trimming
  • Thermal fluctuations
  • Imperfect fabrication
• Laser power
  • A function of loss and number of wavelengths used
• Static dissipation in photodetectors
• Dynamic modulation, switching power
• Not modeling network interfaces

Device      Type/Origin           Power/Device (mW)
Modulator   Thermal               0.875
Modulator   Driver circuitry      1.35
Modulator   Dissipation in ring   0.1
Switch      Thermal               3.5
Filter      Thermal               0.875
Detector    Static                3.95
Laser       Static                1250

20 Impact of layout on network performance
• Poisson arrivals, uniform random destination
• Fixed message size (256 B)
• Assume we have an arbitration scheme that can reach 100% utilization across the chip-scale network
• Models indicate queuing and head-to-tail latency
[Figure: two panels (Benes configurations; FM configurations), average network latency (ns) vs. offered bandwidth (Gb/s), log-log axes, one curve per configuration 4T/8T/16T with 1S/2S]

21 Impact of layout on energy per bit
• The best configuration in terms of energy per bit depends on offered load
• However, these figures hide latency
[Figure: two panels (Benes configurations; FM configurations), energy per bit (pJ) vs. offered bandwidth (Gb/s), log-log axes. Annotation: “Steady decrease in energy per bit because most power is ‘static.’ More load means more utilization.”]

23 Pareto optimality of topologies
• Can move to a more power-consuming topology to improve latency, or vice versa
[Figure: energy per bit (pJ) vs. average network latency (ns) in four panels, for loads of 0.4 Tb/s, 1 Tb/s, 4 Tb/s, and 40 Tb/s; Benes and full-mesh points for 4T-1S, 4T-2S, 8T-1S, 8T-2S, 8T-4S, 16T-1S, 16T-2S, 16T-4S. Annotation: “Architectures with optimal trade-off at given load”]

24 Pareto optimality of topologies
• Lightly loaded networks inevitably suffer from higher energy per bit
[Figure: same chart as slide 23]

25 Conclusions
• Developed methodology for navigating design space
• Using cross-layer analysis, we characterized an upper bound on the energy efficiency of silicon photonic networks at the chip-to-chip scale
• Trend: For (relatively small scale) silicon photonic networks, the mechanisms that accommodate low loads (i.e., resource sharing) degrade energy efficiency
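
The chain of reasoning in the slides (worst-case loss, then maximum wavelengths per link, then peak bisectional bandwidth, then energy per bit) can be illustrated with a minimal Python sketch. It is not the authors' model. It assumes that the 100 mW (20 dBm) ceiling applies to the total power on a link, that this power is split equally among the wavelengths, and that each wavelength must still arrive above the -22 dBm receiver sensitivity after the worst-case loss. The function names and the sample loss and power values are hypothetical.

```python
import math

# Values quoted in the slides.
P_MAX_DBM = 20.0          # 100 mW ceiling set by non-linear effects
SENSITIVITY_DBM = -22.0   # 6.3 uW receiver sensitivity
RATE_GBPS = 10.0          # 10 Gb/s per wavelength channel (OOK)
MAX_WDM = 125             # intermodulation crosstalk limit

def max_wavelengths(loss_db):
    """Largest wavelength count that fits the power budget.

    ASSUMPTION (not from the slides): total link power is capped at
    P_MAX_DBM and shared equally, so N channels cost 10*log10(N) dB.
    """
    margin_db = P_MAX_DBM - SENSITIVITY_DBM - loss_db
    if margin_db < 0:
        return 0  # even a single channel would fall below sensitivity
    return min(math.floor(10 ** (margin_db / 10)), MAX_WDM)

def peak_bisection_gbps(loss_db, bisection_links):
    """Maximum wavelengths x 10 Gb/s x number of links in the bisection."""
    return max_wavelengths(loss_db) * RATE_GBPS * bisection_links

def energy_per_bit_pj(total_power_mw, delivered_gbps):
    """Energy per bit; 1 mW per Gb/s equals 1 pJ per bit."""
    return total_power_mw / delivered_gbps

if __name__ == "__main__":
    # Hypothetical worst-case losses, chosen only for illustration.
    for loss in (10.0, 25.0, 45.0):
        print(loss, "dB ->", max_wavelengths(loss), "wavelengths")
    # Mostly static power: the same 1000 mW at two different loads.
    print(energy_per_bit_pj(1000.0, 100.0), "pJ/bit at 100 Gb/s")
    print(energy_per_bit_pj(1000.0, 1000.0), "pJ/bit at 1000 Gb/s")
```

Under these assumptions, a hypothetical 25 dB worst-case loss leaves room for 50 wavelengths per link, while a 10 dB loss is capped at 125 by crosstalk rather than by power. The last two calls show why energy per bit falls steadily with load when most of the power is static.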