Data Center Network Architectures
Jellyfish vs. Fat-tree

Weiye Hu
ELEN6909 Network Algorithms and Dynamics
May 13th 2016


Data Centers

Many applications must exchange information with remote nodes to proceed with their local computation.
-- MapReduce
-- Applications running on cluster-based file systems
-- Web search engines
-- Parallel applications
The principal bottleneck in large-scale data centers is often inter-node communication bandwidth.


Typical Architecture Today

[Figure: three-tier tree topology with core, aggregation, and edge layers]


What do we want?

A host can communicate with any other host at high bandwidth (e.g., the full bandwidth of its local network interface).
How can inter-node bandwidth be improved?


Solutions

Use specialized hardware and communication protocols
Use high-performance switches
-- Very expensive
-- The aggregate bandwidth may still be very low
Use new network topologies that meet the following goals:
-- Scalable interconnection bandwidth
-- Economies of scale
-- Backward compatibility


Fat-tree & Jellyfish

Fat-tree (tree-like topology)
Jellyfish (random graph)


Fat-tree


Fat-tree Topology

[Figure: fat-tree with core, aggregation, and edge layers; k-ary tree (here k = 4)]
Built from k-port switches
k pods, each containing two layers of k/2 switches
Each switch in the lower layer is connected to k/2 hosts
k^2/4 core switches
Each core switch has one port connected to each of the k pods
Supports k^3/4 hosts (k pods x k/2 lower-layer switches x k/2 hosts each), so the size is determined by k


Why can a fat-tree deliver high bandwidth?

[Figure: bandwidth (BW) at successive levels of the tree]
traditional topology: 1, 2, 4, 8, 16
fat-tree: 16, 16, 16, 16, 16


Fat-tree vs. Traditional Topology

Advantages
A fat-tree achieves lower cost while delivering high bandwidth.
e.g., to support 27,648 hosts at full bandwidth:
-- Fat-tree (k = 48): costs $8.64M
-- Traditional solution: costs $37M
A fat-tree requires no changes to end hosts.
-- It is fully TCP/IP compatible
A fat-tree requires only minor modifications to the switches.

[Figure: traditional topology compared with fat-tree]


Incremental Expansion

What is incremental expansion?
Adding servers and capacity incrementally to the data center.
What do we want?
The topology allows construction of arbitrary-sized networks.
New components can be added to the existing topology without replacing or upgrading existing components.


Does fat-tree allow incremental expansion?

Does fat-tree allow construction of arbitrary-sized networks?
No! The number of hosts a fat-tree can support is determined by k, i.e., the port count of the switches it uses.
e.g., when k = 48, the fat-tree can only be built at size 27,648 (k^3/4 = 48^3/4 = 27,648).
Does fat-tree allow adding new components without replacing or upgrading existing components?
No! Even a small incremental expansion would require replacing every existing switch to maintain the fat-tree structure.


Structure hinders incremental expansion

Fat-tree
The entire structure is completely determined by the port count of the available switches.
Adding servers while preserving the structural properties would require replacing a large number of network elements.
Other topologies have similar problems.
How to solve this problem? Use a random graph!


Jellyfish


Jellyfish Topology

Jellyfish: a degree-bounded random graph
Each switch uses some ports to connect to other switches and the remaining ports for end hosts.
Switches are not required to have the same port count.
If there are N switches in total, and all switches are identical:
-- k-port switches
-- r ports for other switches
-- (k - r) ports for hosts
then the network can support N * (k - r) hosts.


Jellyfish Topology

How to construct a Jellyfish?
Pick a random pair of switches that both have free ports and are not already neighbors, and join them with a link.
Repeat until no further links can be added.
How to add a new switch (i.e., incremental expansion)?
Randomly remove an existing link (x, y).
Add links (p1, x) and (p2, y), where p1 and p2 are two ports on the new switch.


Jellyfish Properties

Flexible.
-- An incrementally constructed Jellyfish has the same capacity as a Jellyfish built from scratch.
Large throughput.
-- Achieves 91% of the theoretical optimal throughput
-- Leaves little room for improvement
Low mean path length.
Highly failure resilient.
-- Maintains its structure in the face of link or node failures
-- A random graph with a few failures is just another random graph of slightly smaller size


Jellyfish vs. Fat-tree


Jellyfish vs. Fat-tree: Throughput

Jellyfish can support 27% more servers at full capacity than a fat-tree with the same switching equipment (i.e., at the same cost).


Jellyfish vs. Fat-tree: Path Length

With 686 servers, >99.5% of source-destination pairs in Jellyfish can be reached in <6 hops, while the corresponding number is only 7.5% in the fat-tree.


Jellyfish vs. Fat-tree: Failure Resilience

Both topologies are highly resilient to failures.
However, as the percentage of failed links increases, the throughput per server decreases more gracefully for Jellyfish than for a same-equipment fat-tree.


Jellyfish vs. Fat-tree: Flexibility

Arbitrary-sized networks:
-- Jellyfish: allows construction of arbitrary-sized networks √
-- Fat-tree: the size is constrained by the port count of the switches ×
Incremental expandability:
-- Jellyfish: can be expanded easily while maintaining its desirable properties (high bandwidth and short paths at low cost) √
-- Fat-tree: structure hinders incremental expansion ×


Routing & Congestion Control

                     Jellyfish                Fat-tree
Routing              k-shortest-paths         ECMP (equal-cost multipath routing)
Congestion control   MPTCP (multipath TCP)    MPTCP (multipath TCP)


References

[1] Singla A, Hong C Y, Popa L, et al. Jellyfish: Networking data centers randomly. 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12), 2012: 225-238.
[2] Al-Fares M, Loukissas A, Vahdat A. A scalable, commodity data center network architecture. ACM SIGCOMM Computer Communication Review, 2008, 38(4): 63-74.


Thanks!
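

Appendix: Sizing and Construction Sketch

The fat-tree sizing rules and the Jellyfish construction and expansion steps from the slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function names, the adjacency-dictionary representation, the `rng` parameter, and the assumption that all switches are identical are illustrative choices. Repeating the link-splicing step in `add_switch` until the new switch has fewer than two free switch-facing ports is also an assumption; the slides describe only a single splice.

```python
import random


def fat_tree_size(k):
    """Component counts for a fat-tree built from k-port switches (k even)."""
    if k < 2 or k % 2:
        raise ValueError("k must be a positive even port count")
    return {
        "pods": k,
        "edge_switches": k * (k // 2),         # k pods x k/2 lower-layer switches
        "aggregation_switches": k * (k // 2),  # k pods x k/2 upper-layer switches
        "core_switches": k * k // 4,
        "hosts": k ** 3 // 4,                  # k pods x k/2 switches x k/2 hosts
    }


def jellyfish_hosts(n, k, r):
    """Hosts supported by n identical k-port switches with r switch-facing ports."""
    return n * (k - r)


def build_jellyfish(n, r, rng):
    """Randomly link n switches; returns {switch: set of neighbor switches}."""
    adj = {s: set() for s in range(n)}
    while True:
        # Switches that still have a free switch-facing port.
        free = [s for s in adj if len(adj[s]) < r]
        # Pairs of such switches that are not already neighbors.
        candidates = [(a, b) for i, a in enumerate(free)
                      for b in free[i + 1:] if b not in adj[a]]
        if not candidates:
            return adj  # no further links can be added
        a, b = rng.choice(candidates)
        adj[a].add(b)
        adj[b].add(a)


def add_switch(adj, r, rng):
    """Add one switch by splicing it into randomly chosen existing links."""
    new = len(adj)  # assumes switches are numbered 0..n-1
    adj[new] = set()
    while r - len(adj[new]) >= 2:
        # Existing links (x, y) whose endpoints are not yet linked to the new switch.
        links = [(x, y) for x in adj for y in adj[x]
                 if x < y and new not in (x, y)
                 and x not in adj[new] and y not in adj[new]]
        if not links:
            break
        x, y = rng.choice(links)
        # Remove (x, y), then add (new, x) and (new, y); x and y keep their degree.
        adj[x].remove(y)
        adj[y].remove(x)
        adj[x].add(new)
        adj[new].add(x)
        adj[y].add(new)
        adj[new].add(y)
    return new
```

For example, `fat_tree_size(48)["hosts"]` gives 27,648, matching the k = 48 case on the slides, while `build_jellyfish(20, 4, random.Random(0))` followed by `add_switch(adj, 4, rng)` grows a 20-switch network to 21 switches without changing the degree of any existing switch.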