ESE680-002 (ESE534): Computer Organization
Day 16: March 14, 2007
Interconnect 4: Switching
Penn ESE680-002 Spring2007 -- DeHon

Previously
• Used Rent’s Rule characterization to understand wire growth: $IO = c N^p$
• Top bisections will be $\Omega(N^p)$
• 2D wiring area $\Omega(N^p) \times \Omega(N^p) = \Omega(N^{2p})$

We Know
• How we avoid $O(N^2)$ wire growth for “typical” designs
• How to characterize locality
• How we might exploit that locality to reduce wire growth
• Wire growth implied by a characterized design

Today
• Switching
  – Implications
  – Options

Switching: How can we use the locality captured by Rent’s Rule to reduce switching requirements? (How much?)

Observation
• Locality that saved us wiring also saves us switching: $IO = c N^p$

Consider
• Crossbar case to exploit wiring: split into two halves, connect with limited wires
  – $N/2 \times N/2$ crossbar in each half
  – $N/2 \times (N/2)^p$ connect to bisection wires
  – $2(N^2/4) + 2(N/2)^{p+1} < N^2$

Recurse
• Repeat at each level – form tree

Result
• If we use a crossbar at each tree node
  – $O(N^{2p})$ wiring area
    • for $p > 0.5$, direct from bisection
  – $O(N^{2p})$ switches
    • top switch box is $O(N^{2p})$
      – $2\,W_{top} \times W_{bot} + (W_{bot})^2$
      – $2\,(N^p \times (N/2)^p) + (N/2)^{2p}$
      – $N^{2p}(2/2^p + 1/2^{2p})$
    • switches at one level down are
      – $2\,(N/2)^{2p}(2/2^p + 1/2^{2p})$
      – $2\,(1/2^p)^2 \left(N^{2p}(2/2^p + 1/2^{2p})\right)$
      – $2\,(1/2^p)^2 \times$ previous level
      – $(2/2^{2p}) = 2^{(1-2p)}$
      – coefficient < 1 for $p > 0.5$

Result
• If we use a crossbar at each tree node
  – $O(N^{2p})$ switches
    • top switch box is $O(N^{2p})$
    • switches at one level down are $2^{(1-2p)} \times$ previous level
    • Total switches (dropping the constant factor of the top switch box): $N^{2p}\left(1 + 2^{(1-2p)} + 2^{2(1-2p)} + 2^{3(1-2p)} + \dots\right)$
      – get a geometric series; sums to $O(1)$
      – $N^{2p}\left(1/(1 - 2^{(1-2p)})\right) = \left(2^{(2p-1)}/(2^{(2p-1)} - 1)\right) N^{2p}$

Good News
• Good news
  – asymptotically optimal
  – Even without switches, area is $O(N^{2p})$
    • so adding $O(N^{2p})$ switches does not change the asymptotic area

Bad News
• Switch area >> wire crossing area
  – Consider $8\lambda$ wire pitch: crossing is $64\lambda^2$
  – Typical (passive) switch: $2500\lambda^2$
  – Passive only: 40× area difference
    • worse once we rebuffer or latch signals
• …and switches are limited to the substrate
  – whereas we can use additional metal layers for wiring area

Additional Structure
• This motivates us to look beyond crossbars
  – can we depopulate crossbars on up-down connections without loss of functionality?

Can we do better?
• Crossbar too powerful?
  – Does the specific down channel matter?
• What do we want to do?
  – Connect to any channel on lower level
  – Choose a subset of wires from upper level
    • order not important

N choose K
• Exploit freedom to depopulate switchbox
• Can do with:
  – $K(N-K+1)$ switches
  – vs. $KN$
  – Save ~$K^2$

N-choose-M
• Up-down connections
  – only require concentration
    • choose M things out of N
      – i.e.
        order of subset irrelevant
• Consequence:
  – can save a constant factor of ~$2^p/(2^p-1)$
    • $(N/2)^p \times N^p$ vs. $(N^p - (N/2)^p + 1)(N/2)^p$
    • $p = 2/3$: $2^p/(2^p-1) \approx 2.7$
• Similarly, Left-Right
  – order not important reduces switches

Multistage Switching
• We can route any permutation with fewer switches than a crossbar
• If we allow switching in stages
  – Trade increase in switches in path
  – For decrease in total switches

Butterfly
• Log stages
• Resolve one bit per stage
[Figure: 16-endpoint butterfly, endpoints 0000 through 1111, stages labeled bit3, bit2, bit1, bit0]

What can a Butterfly Route?
• 0000 → 0001
• 1000 → 0010
[Figure: butterfly with inputs 0000 and 1000 labeled]

Butterfly Routing
• Cannot route all permutations
  – Get internal blocking
• What is required for a non-blocking network?

Decomposition

Decomposed Routing
• Pick a link to route.
• Route to destination over red network
• At destination,
  – What can we say about the link which shares the final stage switch with this one?
  – What can we do with that link?
• Route that link
  – What constraint does this impose?
  – So what do we do?
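
One way to work through the Decomposed Routing questions is the usual looping argument, sketched below in Python under some assumptions: endpoints 2k and 2k+1 are taken to share a 2×2 switch on both the input and the output side, and the two half-size subnetworks are called 0 ("red") and 1. The function names `decompose` and `is_valid_split` are hypothetical, chosen only for illustration. The sketch routes a link over red, forces the link sharing its final-stage switch onto the other subnetwork, then forces that link's first-stage neighbor back onto red, and repeats until the loop closes.

```python
def decompose(perm):
    """Split a permutation across the two half-size subnetworks.

    perm[i] is the output that input i must reach (len(perm) must be even).
    Returns color[i] in {0, 1}: 0 = "red" subnetwork, 1 = the other one.
    """
    n = len(perm)
    inv = [0] * n                       # inv[o] = input that targets output o
    for i, o in enumerate(perm):
        inv[o] = i
    color = [None] * n
    for start in range(n):
        i = start
        while color[i] is None:
            color[i] = 0                # route this link over the red network
            partner = inv[perm[i] ^ 1]  # link sharing the final-stage switch
            color[partner] = 1          # ...so it must use the other network
            i = partner ^ 1             # link sharing partner's first-stage switch
    return color


def is_valid_split(perm, color):
    """True if no first- or last-stage 2x2 switch sends both links the same way."""
    n = len(perm)
    inv = [0] * n
    for i, o in enumerate(perm):
        inv[o] = i
    inputs_ok = all(color[k] != color[k ^ 1] for k in range(n))
    outputs_ok = all(color[inv[o]] != color[inv[o ^ 1]] for o in range(n))
    return inputs_ok and outputs_ok


perm = [3, 7, 4, 0, 2, 6, 1, 5]         # example permutation (made up)
split = decompose(perm)
print(split, is_valid_split(perm, split))
```

Each link is visited once, so one level of the split takes time linear in N; each half-size subnetwork can then be handled the same way on its own links.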
Decomposition
• Switches: $(N/2) \times 2 \times 4 + 2(N/2)^2 < N^2$ (for $N > 8$)

Recurse
If it works once, try it again…

Result: Beneš Network
• $2\log_2(N) - 1$ stages (switches in path)
• Each stage made of $N/2$ $2 \times 2$ switchpoints
  – (4 switches each)
• ~$4N\log_2(N)$ total switches
• Compute route in $O(N \log(N))$ time

Beneš Network Wiring
• Bisection: $N$
• Wiring: $O(N^2)$ area (fixed wire layers)

Beneš Switching
• Beneš reduced switches
  – $N^2$ to $N \log(N)$
  – using multistage network
• Replace crossbars in tree with Beneš switching networks

Beneš Switching
• Implication of Beneš Switching
  – still require $O(W^2)$ wiring per tree node
    • or a total of $O(N^{2p})$ wiring
  – now $O(W \log(W))$ switches per tree node
    • converges to $O(N)$ total switches!
  – $O(\log^2(N))$ switches in path across network
    • strictly speaking, dominated by wire delay ~$O(N^p)$
    • but constants make this of little practical interest except for very large networks

Better yet…
• Believe we do not need Beneš on the up paths
• Single switch on up path
• Beneš for crossover
• Switches in path: $\log(N)$ up + $\log(N)$ down + $2\log(N)$ crossover = $4\log(N)$ = $O(\log(N))$

Linear Switch Population
• Can further reduce switches
  – connect each lower channel to $O(1)$ channels in each tree node
  – end up with $O(W)$ switches per tree node

Linear Switch (p=0.5)

Linear Population and Beneš
• Top-level crossover of p=1 is Beneš switching

Beneš Compare
• Can permute stage switches so local shuffles on outside and big shuffle in middle

Linear Consequences: Good News
• Linear Switches
  – $O(\log(N))$
    switches in path
  – $O(N^{2p})$ wire area
  – $O(N)$ switches
  – More practical than Beneš crossover case

Linear Consequences: Bad News
• Lacks guarantee that we can use all wires
  – as shown, at least mapping ratio > 1
  – likely there are cases where even a constant does not suffice
    • expect no worse than logarithmic
• Finding routes is harder
  – no longer linear time, deterministic
  – open as to exactly how hard

Mapping Ratio
• Mapping ratio says
  – if I have W channels
    • may only be able to use W/MR wires
      – for a particular design’s connection pattern
    • to accommodate any design
      – for all channels: physical wires ≥ MR × logical

Mapping Ratio
• Example:
  – Shows MR=3/2
  – For Linear Population, 1:1 switchbox

Area Comparison
[Figure: area comparison, both at p=0.67, N=1024; panels: M-choose-N (perfect map) and Linear (MR=2)]

Area Comparison
• Since
  – switch >> wire
• may be able to tolerate MR>1
• reduces switches
  – net area savings
• Empirical:
  – Never seen greater than 1.5
[Figure: M-choose-N (perfect map) and Linear (MR=2)]

Expanders

Expander Theory
• (a,b)-expansion
  – Any group of size k=aN will expand to connect to a group of size bk=baN in each logical direction
[Arora, Leighton, Maggs SIAM Journal of Comp. v25n3p600 1996]

Expander Idea
• IF we can achieve expansion
  – Can guarantee non-blocking at each stage
• i.e.
  – Guarantee use less than aN
  – Guarantee connections to more stuff in next level
  – Since baN>aN available in next level
    • Guaranteed to be an available switch

Dilated Switches
• Have multiple outputs per logical direction
  – Dilation: number of outputs per direction
  – E.g.
    radix 2 switch w/ 4 outputs
    • 2 per direction
    • Dilation 2
[Figure: dilated switch with outputs labeled Up (0) and Down (1)]

Dilated Switches allow Expansion
• On Right
  – Any pair of nodes connects to 3 switches
• Strictly speaking must have d>2 for expansion

Random Wiring
• Random, dilated wiring for butterfly can achieve [Upfal/STOC 1989, Leighton/Leiserson/Kravets MIT/LCS/RSS 8 1990]
  – For tree… $2^{2p}$ (?)

Constraints
• Total load should not exceed fraction a of the net
  – L = mapping ratio (light loading factor)
  – aLW = number into each subtree
  – $aLW \ge W/2^p$
  – $L \ge 1/(2^p a)$
• Cannot expand past the size of subtree
  – $baN \le N/2^p$
  – $ba \le 1/2^p$

Extra Switches
• Extra switch factor: d×L
• Try: b=2, a=1/10
  – d=8
  – dL≈40 (p=1)
• Try:
  – b=1.01, a=1/4: d=6, L≈2, dL≈12 (p=1)
  – b=1.01, a=1/4: d=6, L≈2.8, dL≈17 (p=0.5)

Putting it Together
• Base, linear-population trees have $O(N)$ switches
• Make larger by a factor of L (linear factor)
• Dilated versions have a factor of d more switches
• Randomly wired expander
  – Can have $O(N)$ switches
  – Guarantee routes
  – Constants < 100 (looks like < 20)
  – Open: how tight can we make it?

Big Ideas [MSB Ideas]
• In addition to wires, must have switches
  – Have significant area and delay
• Rent’s Rule locality reduces
  – both wiring and switching requirements
• Naïve switches match wires at $O(N^{2p})$
  – switch area >> wire area
  – prevent benefit from multiple layers of metal

Big Ideas [MSB Ideas]
• Can achieve $O(N)$ switches
  – plausibly $O(N)$ area with sufficient metal layers
• Switchbox depopulation
  – save considerably on area (delay)
  – will waste wires
  – May still come out ahead (evidence to date)
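
The numbers on the Extra Switches slide can be checked against the two conditions on the Constraints slide. The Python sketch below assumes those conditions read aLW ≥ W/2^p (so the smallest usable L is 1/(2^p a)) and ba ≤ 1/2^p; the relational symbols are missing from the extracted slide text, so that reading is an assumption. The function names are hypothetical.

```python
def min_light_load(a, p):
    """Smallest L satisfying a*L*W >= W / 2**p, i.e. L = 1 / (2**p * a)."""
    return 1.0 / (2 ** p * a)


def expansion_fits(a, b, p):
    """Second constraint: an expanded group must fit in a subtree, b*a <= 1 / 2**p."""
    return b * a <= 1.0 / 2 ** p


def extra_switch_factor(d, a, p):
    """Extra switches relative to the base linear-population tree: d * L."""
    return d * min_light_load(a, p)


# (b, a, d, p) combinations tried on the Extra Switches slide.
for b, a, d, p in [(2, 1 / 10, 8, 1), (1.01, 1 / 4, 6, 1), (1.01, 1 / 4, 6, 0.5)]:
    L = min_light_load(a, p)
    print(f"b={b} a={a} d={d} p={p}: fits={expansion_fits(a, b, p)} "
          f"L={L:.2f} dL={extra_switch_factor(d, a, p):.1f}")
```

Under that reading, the three cases give extra-switch factors of about 40, 12, and 17, matching the slide, and each satisfies the subtree-size constraint.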