Being Cautiously Aggressive

BESS: A Virtual Switch Tailored for NFV
Sangjin Han, Aurojit Panda, Brian Kim, Keon Jang, Joshua Reich
Sai Edupuganti, Christian Maciocco, Sylvia Ratnasamy, Scott Shenker
Why Another Virtual Switch?
• Does Open vSwitch (OVS) meet all the requirements for NFV?
1. Performance
– OVS (~1Mpps), OVS-DPDK (~15Mpps)
– cf. Vanilla DPDK (~59Mpps/core)
– Packet I/O is only half of the problem
2. Flexibility
– Custom actions?
– Stateful packet processing?
3. Extensibility
– Must enable NFV controller evolution
– Easily add support for new/niche protocols
Alternative Approach with BESS
• Modular pipeline as a dataflow graph
• Each module can run arbitrary code
– Not limited by Match/Action semantics
– Independently extensible & optimizable
• Everything is programmable, not just flow tables.
• You pay only for what you use.
– No performance cost for unused features
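To illustrate "each module can run arbitrary code", here is a minimal sketch of a stateful module that a pure Match/Action table cannot express directly. The class name `FlowCounter`, the `process_batch` method, and the dict-based packet representation are all hypothetical, chosen for illustration; they are not the BESS module API.

```python
from collections import Counter


class FlowCounter:
    """Hypothetical module: counts packets per flow, then forwards them."""

    def __init__(self):
        # State that persists across batches (the "stateful" part).
        self.counts = Counter()

    def process_batch(self, batch):
        for pkt in batch:
            flow = (pkt["src"], pkt["dst"])
            self.counts[flow] += 1
        # Packets are forwarded unchanged.
        return batch


counter = FlowCounter()
counter.process_batch([
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
])
counter.process_batch([{"src": "10.0.0.3", "dst": "10.0.0.2"}])
```

A pipeline that never instantiates such a module never executes its code, which is the sense of "you pay only for what you use".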
BESS: Berkeley Extensible Software Switch
• BESS is a programmable platform for vSwitch dataplane
• Clean-slate internal architecture with NFV in mind
– Highly extensible & customizable
– Readily deployable with backward compatibility
– … all with extreme performance:
• Sub-microsecond latency
• Line-rate 40Gbps with min-sized packets on two cores
• (> 2x faster than other virtual switches)
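For reference, the packet rate implied by "line-rate 40Gbps with min-sized packets" can be worked out from standard Ethernet framing. The sketch below assumes 64-byte minimum frames plus 20 bytes of per-frame wire overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap); the function name is illustrative.

```python
# Per-frame overhead on the wire: preamble (7) + SFD (1) + inter-frame gap (12).
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes=64):
    """Maximum packets per second on a link for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


pps_10g = line_rate_pps(10e9)
pps_40g = line_rate_pps(40e9)
print(f"10G: {pps_10g / 1e6:.2f} Mpps, 40G: {pps_40g / 1e6:.2f} Mpps")
```

With these assumptions, 40 Gbps of minimum-sized frames is roughly 59.5 Mpps, which lines up with the ~59Mpps figure quoted earlier for vanilla DPDK.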
BESS in E2
[Figure: the E2 controller takes a high-level NFV policy (e.g., a VNF connection / forwarding graph) and issues per-server datapath configuration to BESS and flow table configuration to the HW switch(es). On each server, VNFs attach to BESS, which sits above the HW NICs.]
BESS Architecture Overview
[Figure, built up over four slides: the BESS daemon, running in user space on the host, with the applications attached to it (socket apps, native apps, apps inside VMs, apps inside containers) and, in the final step, a controller. Other labels in the figure: Linux VMM, zero-copy, DPDK vhost, DPDK PMD, legacy.]
Modular Datapath Pipeline
• External ports are interconnected with “modules” in a dataflow graph (like the Click modular router).
– You can compose modules to implement your own datapath.
– Developing a new module is easy.
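The composition idea can be sketched in a few lines. Everything below is a toy model under stated assumptions: the `Module`, `MinLengthFilter`, and `PortOut` classes and their `connect`/`push` methods are hypothetical names, not BESS classes, and packets are plain `bytes` objects.

```python
class Module:
    """Hypothetical base class: one node in the dataflow graph."""

    def __init__(self, name):
        self.name = name
        self.downstream = None

    def connect(self, other):
        """Add an edge self -> other; returns other so calls can be chained."""
        self.downstream = other
        return other

    def process(self, batch):
        """Default behaviour: pass the batch through unchanged."""
        return batch

    def push(self, batch):
        """Run this module, then hand surviving packets to the next one."""
        out = self.process(batch)
        if out and self.downstream is not None:
            self.downstream.push(out)


class MinLengthFilter(Module):
    """Drops packets shorter than min_len bytes."""

    def __init__(self, name, min_len):
        super().__init__(name)
        self.min_len = min_len

    def process(self, batch):
        return [p for p in batch if len(p) >= self.min_len]


class PortOut(Module):
    """Stand-in for an output port: records what it would transmit."""

    def __init__(self, name):
        super().__init__(name)
        self.sent = []

    def process(self, batch):
        self.sent.extend(batch)
        return batch


# Wire up: port_in -> filter -> port_out
port_in = Module("port_in")
port_out = PortOut("port_out")
port_in.connect(MinLengthFilter("filter", min_len=64)).connect(port_out)

port_in.push([b"\x00" * 64, b"\x00" * 10, b"\x00" * 128])
```

Adding a new behaviour means writing one more subclass with its own `process` method and connecting it into the graph; the other modules are untouched.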
Resource-Aware CPU Scheduler
• BESS allows flexible scheduling policies for the data path.
– In terms of CPU utilization and bandwidth. Examples:
• “Run on core 0” / “Run on core 1”
• “Higher priority” / “Lower priority”
• “Limit by 1 Mpps / 5 Gbps”
• “Should not consume more than 30% of CPU”
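As an illustration of one such policy, the sketch below models the “Limit by 1 Mpps” example with a token bucket. This is a generic rate-limiting technique used here for explanation, not a description of how the BESS scheduler is implemented; the class name, the burst size of 32 packets, and the integer-microsecond clock are assumptions.

```python
class PacketRateLimiter:
    """Token bucket: admits at most rate_pps packets per second on average."""

    def __init__(self, rate_pps, burst):
        self.rate_pps = rate_pps
        self.burst = burst          # maximum tokens that can accumulate
        self.tokens = float(burst)  # start with a full bucket
        self.last_us = 0            # time of last refill, in microseconds

    def admit(self, now_us, npkts):
        """Return how many of npkts may be sent at time now_us."""
        elapsed_us = now_us - self.last_us
        refill = elapsed_us * self.rate_pps / 1_000_000
        self.tokens = min(self.burst, self.tokens + refill)
        self.last_us = now_us
        allowed = min(npkts, int(self.tokens))
        self.tokens -= allowed
        return allowed


limiter = PacketRateLimiter(rate_pps=1_000_000, burst=32)
first = limiter.admit(now_us=0, npkts=64)    # bucket is full: 32 admitted
second = limiter.admit(now_us=10, npkts=64)  # 10 us at 1 Mpps refills 10 tokens
```

A bandwidth limit such as 5 Gbps would follow the same pattern with tokens counted in bits rather than packets.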
Control Interface
• JSON-like structured messages between BESS and controller
• 3 ways to control the BESS datapath
– Python/C APIs
– Scriptable configuration language
– Cisco IOS-like CLI
• Everything is run-time configurable!
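To give a feel for what "JSON-like structured messages" could look like, here is a purely hypothetical sketch of a controller building requests for the steps a control session covers (port initialization, module initialization, dataflow setup, module configuration). The command names, argument names, and message layout are invented for illustration and are not the actual BESS wire format or API.

```python
import json


def make_request(command, **args):
    """Build one hypothetical control message as a JSON string."""
    return json.dumps({"cmd": command, "args": args}, sort_keys=True)


# Hypothetical session: names and fields below are assumptions.
session = [
    make_request("create_port", name="port0", driver="pmd"),
    make_request("create_module", name="fwd0", mclass="L2Forward"),
    make_request("connect", src="port0", dst="fwd0"),
    make_request("configure_module", name="fwd0",
                 entries=[{"addr": "00:11:22:33:44:55", "gate": 1}]),
]

for msg in session:
    print(msg)
```

Because every step is just another message, the same mechanism serves both initial setup and later run-time changes such as a flow table update.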
[Figure, built up over three slides: a controller/operator session with BESS, showing port initialization, module initialization, dataflow setup, module configuration (e.g., flow table update), and port stats monitoring.]
Performance?
Minimum Framework Overhead
• Packet buffer allocation/deallocation
– ~10 CPU cycles per packet
• CPU scheduling
– ~50 CPU cycles per round
– Scales well with thousands of traffic classes
• Dynamic per-packet metadata attributes
– Zero instruction overhead for access
– Optimal CPU cache-line usage
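A back-of-the-envelope calculation puts these numbers in context. It assumes a 2.6GHz core (the clock of the measurement machine cited in the data sources), a target of ~59Mpps on one core, and that one scheduling round handles one batch of 32 packets. The batch size and the one-batch-per-round reading are assumptions, not figures from the slides.

```python
CLOCK_HZ = 2.6e9          # core clock, from the cited measurement machine
TARGET_PPS = 59e6         # ~59 Mpps/core, the vanilla DPDK figure
BUFFER_CYCLES = 10        # ~10 cycles per packet for buffer alloc/dealloc
SCHED_CYCLES = 50         # ~50 cycles per scheduling round
BATCH_SIZE = 32           # ASSUMPTION: packets handled per round

# Cycles available per packet at the target rate on one core.
budget_per_packet = CLOCK_HZ / TARGET_PPS

# Framework overhead per packet, amortizing scheduling over the batch.
overhead_per_packet = BUFFER_CYCLES + SCHED_CYCLES / BATCH_SIZE

print(f"budget:   {budget_per_packet:.1f} cycles/packet")
print(f"overhead: {overhead_per_packet:.2f} cycles/packet")
```

The point of the exercise is that per-round costs shrink quickly when amortized over a batch, while per-packet costs do not, so the per-packet terms are the ones worth minimizing.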
Performance Evaluation
• OPNFV VSPERF usage models
1. Phy-Phy Performance
[Figure: throughput at 64B packets (Phy-vSwitch-Phy), OVS-DPDK vs. BESS, with 1, 2, and 4 cores. Y-axis: throughput (packets/second), 0 to 60,000,000, with the 40G line rate marked. Test setup: soft switch (OvS-DPDK or BESS) on Fedora 21 / KVM with VMs (Fedora 21), ports 1-4 on a NIC (4 x 10GE - 40 Gigabit Ethernet).]
Source: Irene Liew, Anbu Murugesan – Intel IOTG/NPG
1. Phy-Phy Performance
Data sources:
- BESS, Vanilla DPDK, VPP: measured on a 2.6GHz Xeon E5-2650 v2 machine
- OVS, Linux L2/L3: Emmerich et al. “Performance Characteristics of Virtual Switching”, CloudNet 2014
- OVS-DPDK: Intel ONP 2.1 Performance Test Report
- mSwitch: (link bottlenecked w/ large batch sizes @ 3.2GHz) Honda et al. “mSwitch: A Highly-Scalable, Modular Software Switch”,
2. Phy-NF-Phy Performance
[Figure: OVS-DPDK and BESS performance comparison for 64B packet size (Phy-vSwitch-VM), with 1, 2, and 4 cores. Y-axis: throughput (packets/second), 0 to 16,000,000. Annotation: VM bottleneck. Test setup: a VM (Fedora 21) attached to the soft switch (OvS-DPDK or BESS) on Fedora 21 / KVM, ports 1-4 on a NIC (4 x 10GE - 40 Gigabit Ethernet).]
Source: Irene Liew, Anbu Murugesan – Intel IOTG/NPG
3. Phy-NF-NF-Phy Performance
• BESS outperforms OVS-DPDK by a factor of 4-5x*
[Figure: test setup with two VMs (Fedora 21) attached to the soft switch (OvS-DPDK or BESS) on Fedora 21 / KVM, ports 1-4 on a NIC (4 x 10GE - 40 Gigabit Ethernet).]
* Source: Intel ONP 2.1 Performance Test Report
Multi-Core/Thread Scalability
Round-Trip Latency
[Figure: round-trip time between two NFs across two HW NICs. NFs directly on DPDK: RTT 8.22us. NFs attached through BESS on DPDK: RTT 8.82us.]
• Increase of 0.60us (0.15us per BESS traverse, i.e., four traversals per round trip)
Summary
• BESS is an ideal vSwitch platform for NFV
– High performance
• Sub-microsecond latency/jitter
• Small packet 40Gbps throughput with only 1-2 cores
– Full flexibility and extensibility
• Available on GitHub:
https://github.com/netsys/bess
– Under BSD3 License
– ~30k lines in C and Python, supporting
• Linux 3.x / 4.x (x86_64), DPDK 16.04
• QEMU/KVM virtual machines, Docker/LXC containers