P4 Applications
Jennifer Rexford
Fall 2016 (TTh 3:00-4:20 in CS 105)
COS 561: Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall16/cos561/
P4 Code Example
Simple Router
https://github.com/p4lang/tutorials/blob/master/SIGCOMM_2016/heavy_hitter/p4src/heavy_hitter.p4
Simple Router
[Figure: a router drawn as a processor and a switching fabric]
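Simple Router
[Figure: a packet with smac and dmac fields arrives on ingress port 0; the processor determines nhop_ipv4; the switching fabric delivers the packet, with smac and dmac rewritten, to egress port 3]
IP Prefix      Next-hop
1.2.3.0/24     2
5.6.0.0/16     3
7.0.0.0/8      4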
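The lookup against the table above can be sketched in Python. This is a behavioral model rather than P4; the function name `lpm_lookup` and the linear scan are assumptions made for illustration (hardware typically uses a TCAM or a trie).

```python
import ipaddress

# Forwarding table from the slide: IP prefix -> next hop.
ROUTES = {
    "1.2.3.0/24": 2,
    "5.6.0.0/16": 3,
    "7.0.0.0/8": 4,
}

def lpm_lookup(dst, routes=ROUTES):
    """Return the next hop of the longest matching prefix, or None on a miss."""
    addr = ipaddress.ip_address(dst)
    best_len, best_hop = -1, None
    for prefix, hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix seen so far.
        if addr in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop

print(lpm_lookup("5.6.7.8"))   # 3
print(lpm_lookup("9.9.9.9"))   # None (no matching prefix)
```

When prefixes overlap, for example 10.0.0.0/8 and 10.1.0.0/16, the more specific /16 entry wins for addresses it covers.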
Simple Router
• Parse the packet
– Ethernet and IPv4 headers
• IP forwarding
– Longest-prefix match on destination IP address
– … to determine the “next hop” port for the packet
• Update Ethernet frame
– Set the source and destination MAC addresses
– … to correspond to the output link
• Updates to the IPv4 header
– Verify and update the IP header checksum
– Decrement the IP “time to live” field
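The IPv4 header updates listed above can be sketched in Python as follows. The function names are hypothetical, the header is assumed to be the fixed 20 bytes without options, and dropping a packet whose TTL would reach zero is an assumption that the slide does not state.

```python
import struct

def ipv4_checksum(header):
    """One's-complement sum of the header's 16-bit words, complemented."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                     # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_ipv4(header):
    """Verify the checksum, decrement TTL, recompute the checksum.

    Returns the updated header, or None if the packet should be dropped.
    """
    hdr = bytearray(header)
    if ipv4_checksum(bytes(hdr)) != 0:     # a valid header sums to zero
        return None
    if hdr[8] <= 1:                        # byte 8 is TTL; assumed drop rule
        return None
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"               # zero the checksum field first
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

Recomputing the whole checksum keeps the sketch short; routers commonly apply an incremental update instead, since only the TTL changed.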
Headers: Ethernet and IPv4
header_type ethernet_t {
    fields {
        dstAddr : 48;
        srcAddr : 48;
        etherType : 16;
    }
}
header_type ipv4_t {
    fields {
        version : 4;
        ihl : 4;
        …
        ttl : 8;
        protocol : 8;
        hdrChecksum : 16;
        srcAddr : 32;
        dstAddr : 32;
    }
}
Parser Definition: Ethernet and IPv4
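parser start {
    return parse_ethernet;
}
parser parse_ethernet {
    extract(ethernet);
    // Branch on the EtherType of the header just extracted
    return select(latest.etherType) {
        0x0800 : parse_ipv4;    // 0x0800 is IPv4
        default: ingress;       // anything else skips IPv4 parsing
    }
}
parser parse_ipv4 {
    extract(ipv4);
    return ingress;
}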
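To make the parse graph concrete, here is a rough Python model of the same states (start, parse_ethernet, parse_ipv4, ingress). It is not P4; the function name `parse` and the dictionary layout are assumptions, and the packet is assumed to be long enough and to carry no IPv4 options.

```python
import struct

ETHERTYPE_IPV4 = 0x0800

def parse(packet):
    """Extract Ethernet, then IPv4 only if etherType says so."""
    headers = {}
    # parse_ethernet: dstAddr (48 bits), srcAddr (48 bits), etherType (16 bits)
    dst, src, ether_type = struct.unpack("!6s6sH", packet[:14])
    headers["ethernet"] = {"dstAddr": dst, "srcAddr": src,
                           "etherType": ether_type}
    if ether_type != ETHERTYPE_IPV4:
        return headers                      # default: go to ingress
    # parse_ipv4: fixed 20-byte header
    (ver_ihl, _tos, _length, _ident, _frag, ttl, protocol, checksum,
     src_ip, dst_ip) = struct.unpack("!BBHHHBBH4s4s", packet[14:34])
    headers["ipv4"] = {"version": ver_ihl >> 4, "ihl": ver_ihl & 0xF,
                       "ttl": ttl, "protocol": protocol,
                       "hdrChecksum": checksum,
                       "srcAddr": src_ip, "dstAddr": dst_ip}
    return headers
```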
Table Definition: IP Look-up
table ipv4_lpm {
    reads {
        ipv4.dstAddr : lpm;
    }
    actions {
        set_nhop;   // set nhop IP address, set egress port, decrement TTL
        _drop;
    }
    size: 1024;
}
Control Flow
control ingress {
    apply(ipv4_lpm);
    apply(forward);       // set dmac
}
control egress {
    apply(send_frame);    // set smac
}
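The control flow can be modeled in Python as three table lookups in sequence. This is only a sketch: every table entry below is hypothetical, and the match keys for forward (next-hop IP) and send_frame (egress port) are assumptions, since the slides show only the actions.

```python
import ipaddress

# Hypothetical entries; on a real switch the control plane installs them.
IPV4_LPM = {"1.2.3.0/24": ("10.0.2.1", 2),      # prefix -> (nhop IP, port)
            "5.6.0.0/16": ("10.0.3.1", 3)}
FORWARD = {"10.0.2.1": "00:00:00:00:02:01",     # nhop IP -> dmac
           "10.0.3.1": "00:00:00:00:03:01"}
SEND_FRAME = {2: "00:aa:bb:00:00:02",           # egress port -> smac
              3: "00:aa:bb:00:00:03"}

def ingress(pkt, meta):
    """apply(ipv4_lpm); apply(forward). Returns False to drop."""
    addr = ipaddress.ip_address(pkt["ipv4.dstAddr"])
    hits = [(ipaddress.ip_network(p), v) for p, v in IPV4_LPM.items()
            if addr in ipaddress.ip_network(p)]
    if not hits:
        return False                            # _drop on a table miss
    _, (nhop, port) = max(hits, key=lambda h: h[0].prefixlen)
    meta["nhop_ipv4"] = nhop                    # set_nhop: nhop IP address
    meta["egress_port"] = port                  # set_nhop: egress port
    pkt["ipv4.ttl"] -= 1                        # set_nhop: decrement TTL
    dmac = FORWARD.get(nhop)
    if dmac is None:
        return False
    pkt["ethernet.dstAddr"] = dmac              # forward: set dmac
    return True

def egress(pkt, meta):
    """apply(send_frame)."""
    pkt["ethernet.srcAddr"] = SEND_FRAME[meta["egress_port"]]  # set smac

def process(pkt):
    """Run ingress then egress; return the egress port or None if dropped."""
    meta = {}
    if not ingress(pkt, meta):
        return None
    egress(pkt, meta)
    return meta["egress_port"]
```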
Prototyping New Functionality in P4
HULA Load-Sensitive Routing
Load Balancing Today
Equal Cost Multi-Path (ECMP) – hashing
[Figure: topology with spine switches, leaf switches, and servers]
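ECMP's hashing step can be illustrated with a short Python sketch. The CRC32 hash, the 5-tuple encoding, and the name `ecmp_next_hop` are assumptions for illustration; real switches use vendor-specific hash functions.

```python
import zlib

def ecmp_next_hop(five_tuple, next_hops):
    """Pick one of the equal-cost next hops from a hash of the flow's 5-tuple."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# Hypothetical flow: (src IP, dst IP, protocol, src port, dst port)
flow = ("10.0.0.1", "10.0.1.9", 6, 43512, 80)
spines = ["spine1", "spine2", "spine3", "spine4"]
print(ecmp_next_hop(flow, spines))
```

Every packet of a flow hashes to the same spine, which avoids reordering, but the choice ignores load: two large flows that collide on one path stay there.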
Alternatives Proposed
Central Controller
– Slow reaction time
[Figure: a central controller and HyperV hosts]
Congestion-Aware Fabric
Congestion-aware Load Balancing
CONGA – Cisco
Designed for 2-tier topologies
[Figure: HyperV hosts]
Programmable Data Planes
• Advanced switch architectures (P4 model)
– Programmable packet headers
– Stateful packet processing
• Applications
– In-band Network Telemetry (INT)
– HULA load balancer
• Examples
– Barefoot RMT, Intel FlexPipe, etc.
Programmable Switches: Capabilities
[Figure: a P4 program is compiled onto the switch pipeline: ingress parser, match-action stages (each labeled Memory, M, A, with an entry m1 a1), queue buffer, further match-action stages, egress deparser]
Programmable Switches: Capabilities
[Figure: a P4 program is compiled onto the switch pipeline (ingress parser, match-action stages labeled Memory, M, A with entries m1 a1, queue buffer, egress deparser), annotated with three capabilities: programmable parsing, stateful memory, and switch metadata]
Hop-by-hop Utilization-aware Load-balancing Architecture (HULA)
1. HULA probes propagate path utilization
– Congestion-aware switches
2. Each switch remembers the best next hop
– Scalable and topology-oblivious
3. Split elephant flows into mice flows (flowlets)
– Fine-grained load balancing
1. Probes carry path utilization
[Figure: ToR, aggregate, and spine tiers; a probe originates at a ToR and is replicated at the switches above it]
1. Probes carry path utilization
P4 primitives
• New header format
• Programmable parsing
• Switch metadata
[Figure: ToR, aggregate, and spine tiers; a probe originates at a ToR and is replicated at the switches above it]
1. Probes carry path utilization
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4. Three probes, each carrying ToR ID = 10, are shown with Max_util = 80%, 60%, and 50%; the 50% probe is on the path through S4.]
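How a probe accumulates path utilization can be sketched in Python. The field names follow the slide (ToR ID, Max_util); the `Probe` class, the `traverse` function, and the link utilizations are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Probe:
    tor_id: int        # ToR that originated the probe
    max_util: float    # highest link utilization seen so far on the path

def traverse(probe, link_util):
    """Return the probe after crossing a link with the given utilization."""
    # The path is only as good as its most loaded link, so keep the maximum.
    return Probe(probe.tor_id, max(probe.max_util, link_util))

# A probe from ToR 10 crosses three links (hypothetical utilizations).
probe = Probe(tor_id=10, max_util=0.0)
for util in (0.30, 0.50, 0.20):
    probe = traverse(probe, util)
print(probe)   # Probe(tor_id=10, max_util=0.5)
```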
2. Switch identifies best downstream path
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4; a probe carrying ToR ID = 10 and Max_util = 50%]
Best hop table
Dst      Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
2. Switch identifies best downstream path
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4; a new probe carrying ToR ID = 10 and Max_util = 40% overwrites the ToR 10 entry]
Best hop table
Dst      Best hop    Path util
ToR 10   S4 → S3     50% → 40%
ToR 1    S2          10%
…        …
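The table update shown above can be sketched in Python. The function name `on_probe` is hypothetical, and the rule that a probe from the current best hop always refreshes the entry (so stale utilization does not linger) is an assumption that the slides do not show.

```python
# Best hop table from the slide: destination ToR -> (best hop, path util).
best_hop = {"ToR 10": ("S4", 0.50), "ToR 1": ("S2", 0.10)}

def on_probe(table, dst, from_hop, max_util):
    """Update the entry for dst when a probe arrives from neighbor from_hop."""
    current = table.get(dst)
    if (current is None
            or max_util < current[1]      # strictly better path found
            or from_hop == current[0]):   # assumed: refresh the current best hop
        table[dst] = (from_hop, max_util)

# The probe from the slide: ToR ID = 10, Max_util = 40%, arriving via S3.
on_probe(best_hop, "ToR 10", "S3", 0.40)
print(best_hop["ToR 10"])   # ('S3', 0.4)
```

Each switch keeps one entry per destination ToR rather than one per path, which is what keeps the state small.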
3. Switches load balance flowlets
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4; data packets are forwarded using the best hop table]
Best hop table
Dest     Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
3. Switches load balance flowlets
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4; a hash of the data packet's flow selects an entry in the flowlet table]
Flowlet table
Dest     Timestamp   Next hop
ToR 10   1           S4
…        …           S2
…        …
Best hop table
Dest     Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
3. Switches load balance flowlets
P4 primitives
• RW access to stateful memory
• Comparison/arithmetic operators
[Figure: topology with ToR 10, ToR 1, and switches S1 to S4; data packets consult the flowlet table and the best hop table]
Flowlet table
Dest     Timestamp   Next hop
ToR 10   1           S4
…        …           S2
…        …
Best hop table
Dest     Best hop   Path util
ToR 10   S4         50%
ToR 1    S2         10%
…        …
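A Python sketch of how the two tables might interact is given below. The gap threshold, table size, CRC32 hash, and all names are assumptions for illustration; hash collisions between flows are ignored.

```python
import zlib

FLOWLET_GAP = 0.0005    # hypothetical inter-packet gap threshold, in seconds
TABLE_SIZE = 1024       # hypothetical number of flowlet table slots

flowlet_table = {}      # slot -> {"dest", "timestamp", "next_hop"}
best_hop = {"ToR 10": "S4", "ToR 1": "S2"}   # from the best hop table

def next_hop(flow, dest, now):
    """Choose the next hop for a data packet of `flow` arriving at time `now`."""
    slot = zlib.crc32(repr(flow).encode()) % TABLE_SIZE   # hash the flow
    entry = flowlet_table.get(slot)
    if entry is None or now - entry["timestamp"] > FLOWLET_GAP:
        hop = best_hop[dest]        # new flowlet: take the current best hop
    else:
        hop = entry["next_hop"]     # same flowlet: stay on the same path
    # Read-modify-write of stateful memory on every packet.
    flowlet_table[slot] = {"dest": dest, "timestamp": now, "next_hop": hop}
    return hop
```

A flow only changes path after a pause longer than the gap, so packets already in flight on the old path are unlikely to be overtaken by packets sent on the new one.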
Discussion
• Load-sensitive routing
• Other P4 applications
• Abstractions for P4 programming