THE DEFORESTATION OF L2 JamesMcCauley,MingjieZhao,EthanJ.Jackson,Barath Raghavan, SylviaRatnasamy,ScottShenker UCBerkeley,UESTC,andICSI The Talk • What isAXE? • Why lookatthis? • How doesitwork? • Really?Thisactuallyworks? 2 The What • AnredesignofL2 toreplaceEthernetand SpanningTreeProtocol(anditsvariants) • Targetsare“normal”enterprisenetworks, machinerooms,smallprivateDCs – Not theGoogles,Microsofts,Rackspaces – Notnetworkswithincrediblyhighlyutilization – Not managedbyafull-timeteamofexperts 3 The What: Goals • Plug-and-play – Ifnot,mightaswelljustuseL3 • Usealllinksforshortestpaths – NumberoneshortcomingofSTP • Fastrecoveryfromfailure – NumbertwoshortcomingofSTP? 4 The What: Goals • Plug-and-play – Ifnot,mightaswelljustuseL3 • Usealllinksforshortestpaths – NumberoneshortcomingofSTP • Fast Packet-timescalerecoveryfromfailure – NumbertwoshortcomingofSTP? 5 The What: Assumptions • Failuredetectioncanbefast – Nottraditionallythebottleneck • Controlplane“hellos”weresufficient – Needinterrupt-drivenLFS,BFD,etc. • There’samarketforflood-and-learnL2 – Flooding/learnhassecurityimplications – Noheavyunidirectionaltraffic • Nomulti-accesslinks – Everythingispoint-to-point 6 The Why: Is L2 still a problem? • Stillmanylargely-unmanaged,small/medL2networks! – TwoinourbuildinginBerkeley! • Therehavebeenafewinterestingdevelopments… – SPB,TRILL,SEATTLE,etc. – Providevarioustradeoffs • AXEattemptstostrikeadifferentbalance – Focusontwokeyproblems – Keepingthingsassimpleaspossible(nocontrolplane) 7 The Why: Context STP No STP (Tree) TRILL/SPB Plug-and-play ✓ ✓ ✓ IP (L3) Custom AXE ✓ Shortest Paths Fast Recovery ✓ ✓ ✓ ✓ ✓ No Control Plane ✓ ? ✓ ✓ 8 The How: Extend Ethernet • Basicflood/learnEthernet – Whenyouseeapacket:learn – Whenyoudon’tknowwhattodo:flood • ButAXEdoesnotneedatreetodealwithloops – Meansfloodingworksforhandlingfailurestoo • (becausealternatepathsareimmediatelyavailable) – Meansthatflood/learnfindsshortpaths • (becauseyouhaven’tremovedlinks) 9 The How: Treeless flooding • Howdoyougetaroundtheloopproblem? • Duplicate-packet-detection • Multiplewaysofdoingit • Ourfocus:hash-baseddeduplication filter – Inshort:hashtablewhereyoureplaceuponcollision – Straightforward – Amenabletohardware/P4implementation 10 The How: What changes? • Learningismoresubtle – Sourceaddressseenonmultipleports – Packetsmayevenbegoingbackwards! • Respondingtofailuresismoresubtle – Meanswehavetounlearn(outdated)state 11 The How: Extend Header • Extendthepacketheaderbetween switches – Nonce(per-switchsequencenumber) • Usedforpacketdeduplication – Hopcount • Influenceslearning,also protectsfromloops – Floodedflag:F • Trackswhetherapacketisbeingflooded – Learnableflag:L • Trackswhetherpacketcanbelearnedfrom 12 The How: Separate queues • Switcheshavefloodqueueandnormalqueue – TheFloodedflagintheheaderdetermineswhich – Floodqueuehashigherpriorityandisshorter – Normalqueuesized…normally • Intuition: – Deliveringfloodsquicklystops floodingquickly – Deduplication onlyappliestofloods,keeping fewerfloodsinflightmakesdedup easier 13 The How: Overview • Extendpacketheader – Nonce,HopCount,Flooded/Learnableflags • Learning/UnlearningPhase – MaylearnportandHCtosrc – Mayunlearnpathtodst iftroublewasobserved • OutputPhase – Ifpacketisaduplicate:drop – Ifunknown-dst/path-failed/already-flooding:flood – Otherwiseforwardaccordingtotable 14 The How: Example • AsendsapackettoB(L:True) – DestinationBunknown;packetfloodedfromfirsthop(F:True) • AllswitcheslearnhowtoreachA • Bsendsto A(L:True) – DirectpathfollowingtableentriestoA(F:False) • Switchesalongpathlearnhowtoreach B • • Linkfails A sendsanotherpackettoB (L:True) – Followsalongpath…(F:False) – ..untilithitsfailure(L:False F:True) – Switchfloodspacketoutall ports(evenbackwards) • • Floodedpacketreaches B (Successfuldelivery!) • AnotherduplicateoffloodedpacketreachesA’sfirsthop;unlearnB Asendsanother packettoB (L:True) – DestinationBunknown;packetfloodedfromfirsthop(F:True) • AllswitcheslearnhowtoreachAagain 15 Really? Preliminaries • Howmuchfloodingdofailurescause? Campus 0.00% 0.02% 0.04% 0.06% 0.08% 0.10% 0.12% 0.14% Cluster 0.16% 0.18% • Howbigdoesthededuplication filterneedtobe? – Lessthan1,000entriesinoursimulations • Doesitrecoverfromoverload? – Yes * 16 Really? Overview • Thinkingbacktothatmatrix… – Wewantplugandplay – Wetosupportshortestpathsusingalllinks – Wedon’twanttohaveacontrolplane – Packet-timescalerecoveryfromfailures 17 Really? Failure benchmark • Omniscient,randomized,shortest-pathrouting • Failure→Adjustabledelay→Fixroutes • Delayofzeroisoptimalrouting /anupperbound • Nonzerodelaymeanttoroughlysimulate… – OSPF,IS-IS,TRILL,SPB,etc. – ..withoutneedingtomodeleachoneindetail • Randomshortest-costtreerootedateachdestination • Note:wedon’tcompareourselvestoSTPatall 18 Really? Failure recovery - UDP • Sendtrafficonnetwork withhighfailurerate 160ms 0.27% 80ms 0.19% 40ms 0.13% 20ms • Metricisunnecessary deliveryfailures – packetsthatweren’t deliveredeventhough optimalroutingcould havedeliveredthem 0.08% 10ms 0.05% 5ms AXE 0.03% 0.00% 0.00% 0.05% 0.15% 0.20% 0.25% 0.30% Delivery Failures (Campus) 160ms 0.021% 80ms 0.011% 0.008% 40ms 20ms • AXEhasno unnecessary deliveryfailures 0.10% 0.005% 10ms 0.003% 0.002% 5ms AXE 0.000% 0.000% 0.005% 0.010% 0.015% 0.020% 0.025% Delivery Failures (Cluster) 19 Really? Failure recovery - TCP 160ms 0.25% 80ms 0.11% 40ms • Similarsetup,butwithTCP 0.14% 20ms 0.06% 10ms 0.01% 5ms 0% AXE 0% 0.0% 0.1% 0.1% 0.2% 0.2% 0.3% 0.3% Delayed FCTs (Campus) 160ms 0.036% 80ms 0.020% 40ms 0.015% 20ms 0.010% 10ms 0.007% 5ms AXE • Metricisnumberofflows withsignificantlyworseFCT thanoptimalrouting • AXEhasno significantly worseFCTs 0.002% 0% 0.00% 0.01% 0.01% 0.02% 0.02% 0.03% 0.03% 0.04% 0.04% Delayed FCTs (Cluster) 20 The End: Not Mentioned Here • MulticastAXE – Onanychange(failure;join),flood+dedup andprune – Floodedpacketshavealldataneededtobuildtree • AXEwithHedera – UseAXEformice&recovery – CentralizedSDNroutingforelephantflows • P4implementation – AXEisexpressibleinP4 – Performanceonrealhardwareisopenquestion 21 THE END 22
© Copyright 2026 Paperzz