Legnaro Event Builder Prototypes
Luciano Berti, Gaetano Maron
INFN – Laboratori Nazionali di Legnaro
G. Maron, CPT Week, CERN, 23 April 2001

GE Event Builder
Components
Hardware:
• switch: FoundryNet FastIron
• NIC: SysKonnect SK9821
• PC: Supermicro PIII (i840)
Software: vxWorks based
Configuration: 15 x 15
Test conditions:
• No command or event aggregation (each packet transports a command or data frame relative to a single event)
• Full data transfer from/to PC memory
• Recovery from packet loss
• Fixed fragment sizes, varied from 400 to 4000 bytes

Event builder layout
[Figure: RUs 1-15, the EVM and BUs 1-15 connected to switch Slots 1-4]
Note: RU performance problem found with this configuration.
RUs and BUs distributed in all switch slots:
- Part of the traffic is localized within the slot
- Reduces switch backplane utilization

Modified Event Builder layout
[Figure: RUs 1-15 and the EVM on Slots 1-4; BUs 1-15; a separate Fast Ethernet slot carrying the "Request data" commands]
- RU fast control messages over FE (PCI 32/33)
- RU data transfer on GE (PCI 64/66)

The GE Event Builder
[Figure: the GE Event Builder]

EB protocol
[Figure: message sequence between RUs 1..n, a BU and the EVM, with steps labelled allocate, confirm, send, cache]

Concurrent building threads in the same BU
[Figure: RUs 1..n, a BU and the EVM, with BU threads 1, 2 and 3 building events concurrently]

Sequential vs Random reading
[Figure: two panels, "Sequential reading" and "Random reading", each showing RUs 1..n, a BU and the EVM with the steps allocate, confirm, send, cache]
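
The difference between the two reading modes on the slide above can be illustrated with a minimal sketch. It assumes that "reading" refers to the order in which a BU collects the fragments of one event from RUs 1..n; the function names and the in-memory `fragments` layout are hypothetical and are not taken from the vxWorks software used in the tests.

```python
import random


def read_order(n_rus, mode, rng=None):
    """Return the order in which a BU visits RUs 1..n_rus for one event."""
    order = list(range(1, n_rus + 1))
    if mode == "random":
        # Random reading: visit the same RUs in a shuffled order.
        (rng or random).shuffle(order)
    elif mode != "sequential":
        raise ValueError(f"unknown mode: {mode}")
    return order


def build_event(event_id, fragments, order):
    """Collect one fragment per RU, in the given order, into a full event.

    `fragments` is a hypothetical stand-in for the RU memories:
    fragments[ru][event_id] -> bytes.
    """
    return {ru: fragments[ru][event_id] for ru in order}


# 15 RUs, one 2000-byte fragment each for event 7 (illustrative values).
n = 15
fragments = {ru: {7: bytes(2000)} for ru in range(1, n + 1)}
seq_event = build_event(7, fragments, read_order(n, "sequential"))
rnd_event = build_event(7, fragments, read_order(n, "random", random.Random(1)))
```

Both modes deliver the same complete event; only the order of the requests differs.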

"Sliding Window"
[Figure: RUs 1..n, a BU and the EVM with the steps allocate, confirm, send, cache]
• Multiple sends to the RUs
• Reduces the total rebuilding time
• Fewer events held in the BUs
• Not yet tested

Sequential - random reading comparison
[Figure: two panels, "Random reading" and "Sequential reading"]
• No difference in performance
• But more allocated events are needed on the BUs
• All the measurements were made with random reading

Recovery from packet loss
BU - EVM communication:
- The BU starts a timer and sends "Req. EvtId" to the EVM
- If the EvtId arrives, the timer is cancelled
- On timeout, the timer is restarted and "Req. EvtId" is sent again (retry)
BU - RU communication:
- The BU starts a timer and sends "Req. Data" to the RU
- If the EvtData arrives, the timer is cancelled
- On timeout, the timer is restarted and "Req. Data" is sent again (retry)
Timeouts: 80 - 160 ms

EVB 15x15 performance - Throughput
[Figure: throughput per node (MB/s) vs fragment size (bytes, 0-4000), 15 x 15]
• Throughput up to 116 MB/s, i.e. 93% of the link speed
• No packet loss observed (as expected)

EVB Scaling
[Figure: throughput per node (MB/s) vs N (0-16), for fragment sizes of 4000, 2000 and 400 bytes]

EVB Performance – Event Rate
[Figure: fragment rate per node (kHz) vs fragment size (bytes), 15 x 15]
Nominal fragment size 2 kbytes: event rate = 52 kHz

Conic Event Builder
[Figure: conic EVB and symmetric EVB side by side, each with RUs, Event Manager, Builder Network and FUs; the symmetric EVB also shows a BU]
• Faster ports at the RUs
• Slower ports at the BUs

Conic Event Builder: Layout
[Figure: RUs 1-4 and the EVM on GE Slot 1; FUs 1-40 on FE Slot 1 and FE Slot 2; "Request Data Command" path]
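
The figures quoted on the throughput, event rate and conic slides above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes 1 MB = 10^6 bytes, a nominal fragment of exactly 2000 bytes, and that in the 4 x 40 conic layout the RUs sit on GE ports and the FUs on FE ports, as the layout figure suggests. The function names are illustrative.

```python
GE_LINK_MB_S = 1000 / 8  # Gigabit Ethernet: 1000 Mb/s = 125 MB/s
FE_LINK_MB_S = 100 / 8   # Fast Ethernet: 100 Mb/s = 12.5 MB/s


def link_fraction(throughput_mb_s, link_mb_s=GE_LINK_MB_S):
    """Fraction of the raw link speed used by a given throughput."""
    return throughput_mb_s / link_mb_s


def throughput_from_rate(fragment_rate_khz, fragment_bytes):
    """Per-node throughput in MB/s implied by a fragment rate and size."""
    return fragment_rate_khz * 1e3 * fragment_bytes / 1e6


def conic_balance(n_ru, n_fu):
    """Aggregate RU-side (GE) and FU-side (FE) port bandwidth in MB/s."""
    return n_ru * GE_LINK_MB_S, n_fu * FE_LINK_MB_S


fraction = link_fraction(116)              # 116 MB/s on a GE link
implied = throughput_from_rate(52, 2000)   # 52 kHz of 2000-byte fragments
ru_side, fu_side = conic_balance(4, 40)    # 4 x 40 conic configuration
```

Under these assumptions, 116 MB/s is about 93% of a GE link, 52 kHz at 2000 bytes corresponds to 104 MB/s per node, and the 4 GE ports of the RUs carry the same aggregate bandwidth as the 40 FE ports of the FUs.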

EVB throughput – Conic vs Symmetric
[Figure: throughput per node (MB/s) vs fragment size (bytes, 0-4000), for symmetric 15x15 and conic 4x40]
Conic EVB: no performance degradation with respect to the symmetric EVB

EVB Conic – Scaling
[Figure: throughput per node (MB/s) vs N (0-5) for the 1 x 10, 2 x 20 and 4 x 40 configurations, with fragment sizes of 4000, 2000 and 400 bytes]

Conic: RU/FU Throughput ratio
[Figure: RU/FU throughput ratio for the 1xn, 2xn and 4xn configurations]

To be done and tested
• Variable size events
• EB performance with the newly implemented "window" mechanism
• Latency measurements
• Fault generation with the newly implemented Random Error Generator, to check the error recovery procedure

Multistage Event Builder
• All our results have been obtained with a single-switch event builder configuration
• We propose to extend our tests to a multistage Ethernet switch topology and to study the behavior of this configuration

Plain Topology
[Figure: RUs on one side of the switches, BUs on the other]
• In the Event Builder application, data flows in only one direction
• The inter-switch Gigabit Ethernet links are full-duplex
• Result: half of the available inter-switch bandwidth is wasted

Full Mesh Topology
[Figure: RUs and BUs attached to every switch]
• RUs and BUs are distributed over all the switches
• Inter-switch links are used in both directions
• Same number of ports as the plain topology
• Twice the bandwidth of the plain topology on the inter-switch links
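
The contrast between the plain and full mesh topologies can be made concrete with a toy model. The sketch below assumes only two switches, uniform traffic (every RU sends an equal share of its data to every BU) and ignores control messages; the `placement` structure and the function name are hypothetical. Loads are expressed in units of one RU's output rate.

```python
def interswitch_load(placement):
    """Data load on each directed inter-switch link under uniform traffic.

    placement maps a switch name to (number of RUs, number of BUs).
    Returns {(src_switch, dst_switch): load in units of one RU's rate}.
    """
    total_bus = sum(bus for _, bus in placement.values())
    load = {}
    for src, (rus, _) in placement.items():
        for dst, (_, bus) in placement.items():
            if src != dst:
                # Each RU on src sends the fraction bus/total_bus to dst.
                load[(src, dst)] = rus * bus / total_bus
    return load


# Plain topology: all RUs on switch A, all BUs on switch B.
plain = interswitch_load({"A": (8, 0), "B": (0, 8)})
# Mesh-like placement: RUs and BUs split evenly over both switches.
mesh = interswitch_load({"A": (4, 4), "B": (4, 4)})
```

In the plain case all data crosses the link in one direction and the reverse direction of the full-duplex link stays idle; in the mixed placement both directions carry equal load and part of the traffic never leaves its switch.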

Plain and mesh topology limits
• Each pair of switches is connected by a single link
• This is a bottleneck if the traffic is not uniformly random
• The network is blocking for certain traffic patterns

Traffic with patterns
• If the traffic has patterns (for example, this could happen if event building is performed in steps), it could make sense to introduce an artificial mechanism that randomises the traffic
• This mechanism exists and is called Universal Routing

Universal Routing Reference
• Discovered by L.G. Valiant in 1980
• See: M.D. May, P.W. Thompson, P.H. Welch, "Networks, Routers & Transputers", available at: http://www.pact.srf.ac.uk/macrame/papers/bluebook.html
• Those papers describe Universal Routing applied to Transputer networks, which are based on wormhole routing
• We adapted the same concept to a packet-switched network such as Gigabit Ethernet

Universal Routing with GigaEthernet
• Based on the Clos topology
• Multiple paths are available between each pair of switches
• Every packet is sent to a randomly chosen intermediate switch
• The intermediate switch sends the packet to the final destination
• Full bandwidth between each pair of switches and uniform buffer utilization

Universal Routing
[Figure: Clos network with RUs and BUs using half duplex links, and the equivalent folded network using full duplex links]
• Transformation of the Clos topology into a folded Clos
• The resulting number of ports is the same as in the plain topology

Large (500x500) multistage GE network (1)
• 25 switches with 60 x 1 Gb ports
• 20 switches with 25 x 1 Gb ports
[Figure: each of the 25 switches connects 20 RUs and 20 BUs (40 ports) and has 20 ports towards the 20 switches of the second stage, which have 25 ports each]
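
Using the switch counts of the 500x500 network (1) above, the Universal Routing idea can be sketched as follows: a packet leaving an edge switch is first sent to a randomly chosen second-stage switch, which then forwards it to the destination edge switch. This is an illustration only; the names `edge_ports` and `route` are hypothetical, switches are simply numbered from 0, and keeping traffic local when source and destination share an edge switch is an assumption.

```python
import random

N_EDGE = 25          # edge switches (slide: 25 switches with 60 ports)
N_CENTER = 20        # second-stage switches (slide: 20 switches with 25 ports)
HOSTS_PER_EDGE = 20  # 20 RUs and 20 BUs per edge switch


def edge_ports():
    """Ports needed on one edge switch: RUs + BUs + one uplink per center."""
    return 2 * HOSTS_PER_EDGE + N_CENTER


def route(src_edge, dst_edge, rng):
    """Two-phase route: source edge -> random center -> destination edge."""
    if src_edge == dst_edge:
        return [("edge", src_edge)]  # assumed to stay inside the edge switch
    center = rng.randrange(N_CENTER)  # random intermediate switch
    return [("edge", src_edge), ("center", center), ("edge", dst_edge)]


rng = random.Random(0)
path = route(3, 17, rng)
# Even a fixed source/destination pattern is spread over all center switches.
used_centers = {route(3, 17, rng)[1][1] for _ in range(2000)}
```

The random choice of the intermediate switch is what spreads a patterned traffic flow over all available paths between a pair of edge switches.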

Large (500x500) multistage GE network (2)
• 25 switches with 40 x 1 Gb ports + 2 x 10 Gb uplinks
• 2 switches with 25 x 10 Gb ports
[Figure: each of the 25 switches connects 20 RUs and 20 BUs (40 ports) and has 2 x 10G ports towards the 2 second-stage switches, which have 25 x 10G ports each]

Proposal for a multistage event builder demonstrator
• Multistage event builders can be emulated using the much cheaper Fast Ethernet connections and switches. GE speed is not needed for these topological investigations
• The proposal is to have prototypes for:
- Full Mesh topology
- Folded Clos topology with (and without) the Universal Routing mechanism

Full Mesh 64x64 Event Builder Prototype
[Figure: full mesh of switches (labels 1, 3, 5, 7), each with 8 RUs and 8 BUs]
Missing components
- 1 host node = 4 RUs / 4 BUs or a mix of them
- 32 hosts
- 128 FE NICs (56 + 72)
- 8 switches with 24 FE ports

Folded Clos 64x64 Event Builder Prototype
[Figure: switches 1-4, each with 16 RUs and 16 BUs, labelled "32 Ports" and "16 Ports", connected to a FastIron with 3 modules of 24 FE ports]
64x64, missing components
- 1 host node = 4 RUs / 4 BUs or a mix of them
- 32 hosts
- 128 FE NICs (56 + 72)
- 4 switches with 48 FE ports
- 2 FastIron modules with 24 FE ports + 1 FastIron module with 24 FE ports
48x48
- 1 host node = 3 RUs / 3 BUs or a mix of them
- 32 hosts
- 96 FE NICs (56 + 40)
- 4 switches with 36 FE ports
- 2 FastIron modules with 24 FE ports

Folded Clos 80x80 Event Builder Prototype
[Figure: switches 1-4, each with 20 RUs and 20 BUs, labelled "40 FE Ports" and "2 GE Ports", connected to a FastIron with an 8 GE ports module (1000 BaseT or 1000 BaseSX)]
Missing components
- 1 host node = 4 RUs / 4 BUs or a mix of them
- 40 hosts (32 + 8)
- 160 FE NICs (56 + 104)
- 4 switches with 48 FE ports + 2 GE links
- 1 FastIron module with 8 GE (Base SX) ports
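
The host and NIC counts listed for the prototypes above follow from simple arithmetic, sketched below. The sketch assumes one FE NIC per RU or BU and reads the first number in each "(56 + ...)" pair as NICs already available; that reading is an assumption, as is the function name `prototype_needs`.

```python
import math


def prototype_needs(n, nodes_per_host, nics_available=56):
    """Hosts and NICs for an n x n event builder (n RUs + n BUs).

    Assumes one FE NIC per RU/BU and `nodes_per_host` RUs/BUs per host PC.
    `nics_available` (56) is an assumed reading of the "(56 + ...)" entries.
    """
    nodes = 2 * n
    return {
        "hosts": math.ceil(nodes / nodes_per_host),
        "nics": nodes,
        "nics_missing": nodes - nics_available,
    }


mesh_64 = prototype_needs(64, nodes_per_host=4)
clos_48 = prototype_needs(48, nodes_per_host=3)
clos_80 = prototype_needs(80, nodes_per_host=4)
```

With these assumptions the sketch reproduces the slide values: 32 hosts and 128 NICs (72 missing) for 64x64, 32 hosts and 96 NICs (40 missing) for 48x48, and 40 hosts and 160 NICs (104 missing) for 80x80.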

8x80 Conic Event Builder Prototype
[Figure: RU1-RU8 connected to a FastIron with 2 modules of 8 GE ports (1000 BaseT or 1000 BaseSX); switches 1-4, each with 20 FUs, labelled "20 FE Ports" and "2 GE Ports"]
Missing components
- 1 host node = 4 FUs
- 20 hosts
- 80 FE NICs (56 + 24)
- 4 switches with 24 (48) FE ports + 2 GE uplinks
- 2 FastIron modules with 8 GE (Base SX) ports

Material for the event builder multistage prototypes
Mesh 64x64
- 72 FE NICs
- 8 switches with 24 FE ports
Folded Clos 80x80
- 8 PCs
- 104 FE NICs
- 4 switches with 48 FE ports and 2 GE uplinks (1000 BaseT)
If the 1000 BaseT uplinks are not available:
1) Folded Clos 64x64:
- 72 FE NICs
- 4 switches with 48 FE ports
- 1 FastIron module with 24 FE ports
2) Folded Clos 48x48:
- 40 FE NICs
- 4 switches with more than 36 FE ports