White Paper
Segment Routing on Cisco Nexus
9500, 9300, 9200, 3200, and 3100
Platform Switches
Authors
Ambrish Mehta, Cisco Systems Inc.
Haider Salman, Cisco Systems Inc.
© 2017 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public Information.
Contents
What You Will Learn
Challenges in Data Center Networks
What Is Segment Routing?
General MPLS Operations: PUSH, SWAP, and POP
Control Plane
MPLS Label Exchange with Peers
MPLS Label Allocation
Prefix and Node Segment Identifiers
MPLS-Enabled Data Center Fabric
Traffic Steering Using Segment Routing
Over-the-Top Service with Multihop BGP
Layer 3 EVPN
Orchestration
Conclusion
For More Information
What You Will Learn
This document provides a closer look at the segment routing available on Cisco Nexus 9500, 9300, 9200, 3200,
and 3100 platform data center switches. Its goal is to help network architects and engineers understand segment
routing technology and how it can be used to achieve application traffic engineering and various deployment
scenarios. It assumes that the reader has a high-level understanding of Cisco Nexus 9500, 9300, 9200, 3200, and
3100 platform data center switches and of routing and Multiprotocol Label Switching (MPLS) concepts.
Challenges in Data Center Networks
Today’s data center network needs to be agile to meet the increasing demands of the new workloads constantly
being brought online. Many deployments are being implemented with Equal-Cost Multipath (ECMP) to make use of
available link capacity while adding redundancy in the end-to-end network. Although ECMP is generally a good
approach, it presents several challenges:
● The application that originates the traffic has no control over the data-forwarding path through the network.
● The hop-by-hop flow-forwarding decision to choose the next hop makes it prone to hotspots in the event of
link failure. Also, the application is not aware of this hotspot and will continue to send data traffic as if the
hotspot did not exist.
● ECMP uses a hop-by-hop decision calculated by hashing at every node in the network, and the decision is
per flow. Large “elephant” flows can affect the performance of short-lived “mouse” flows.
● Troubleshooting a data-plane drop with ECMP poses a unique set of challenges.
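The per-flow behavior described in the list above can be illustrated with a minimal Python sketch. The hash (CRC32 over the 5-tuple), the flow values, and the next-hop names are assumptions chosen for illustration; switch hardware uses its own hash function.

```python
import zlib

def ecmp_next_hop(flow, next_hops):
    """Pick a next hop by hashing the flow 5-tuple (illustrative hash, not the switch's)."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

next_hops = ["B", "C", "D"]  # hypothetical equal-cost next hops
flow = ("10.1.1.10", "172.0.1.20", 6, 49152, 443)  # src, dst, protocol, sport, dport

# Every packet of the same flow hashes to the same next hop, regardless of flow size.
choices = {ecmp_next_hop(flow, next_hops) for _ in range(1000)}
print(choices)
```

Because the choice depends only on the flow identifier, an elephant flow stays pinned to one link, and any mouse flow that hashes to the same link shares its congestion.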
This document discusses segment routing as a technology, its benefits, and the deployment options that are
available.
What Is Segment Routing?
Segment routing uses a source routing model. A node steers a packet through an ordered list of instructions, called
segments. A segment can represent any instruction in a topology or service. A segment can have local semantics
for a segment routing node, or it can be global within a segment routing domain. The segment routing architecture
can be directly applied to the MPLS data plane with little to no change on the forwarding plane.
Before exploring segment routing and its components in detail, you should be familiar with general MPLS
terminology and operations.
General MPLS Operations: PUSH, SWAP, and POP
In a PUSH operation, a new label is affixed to the IP packet or to the MPLS label stack of the packet. Typically, the
ingress router (except in some traffic-engineering scenarios) performs this operation.
In a SWAP operation, the incoming label is replaced (swapped) with an outgoing label, and the packet is forwarded to
the next hop as determined by the incoming label.
In a POP operation, the label is removed from the packet, which may reveal an inner label beneath it. If the popped
label was the last label on the label stack, the packet exits the MPLS domain. This process typically occurs at the
egress label switching router (LSR).
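The three operations can be sketched as functions on a label stack. This is a minimal illustration in Python, assuming the stack is a list whose first element is the top label; the label values are examples only.

```python
def push(stack, label):
    """PUSH: place a new label on top of the stack (index 0 is the top)."""
    return [label] + stack

def swap(stack, out_label):
    """SWAP: replace the top (incoming) label with the outgoing label."""
    if not stack:
        raise ValueError("cannot swap on an unlabeled packet")
    return [out_label] + stack[1:]

def pop(stack):
    """POP: remove the top label, exposing any inner label."""
    if not stack:
        raise ValueError("cannot pop an unlabeled packet")
    return stack[1:]

packet = []                  # plain IP packet, no labels
packet = push(packet, 126)   # ingress router pushes a label
packet = swap(packet, 116)   # transit LSR swaps it
packet = pop(packet)         # last label removed; the packet leaves the MPLS domain
print(packet)
```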
Control Plane
Segment routing relies on two critical control-plane elements: MPLS label exchange with peers and MPLS label
allocation.
MPLS Label Exchange with Peers
To support segment routing on the Cisco Nexus 9500, 9300, 9200, 3200, and 3100 platforms, Cisco® NX-OS
Software has been enhanced to support a new Border Gateway Protocol (BGP) address family (AF): the IPv4
labeled-unicast (LU) address family (LUAF). The BGP LUAF capability between LSRs facilitates label exchange
through BGP update messages. RFC 3107 specifies the way in which the label mapping information is carried in a
BGP update message (Figure 1). Label distribution is carried in the BGP update message by using the BGP-4
Multiprotocol Extensions attribute (RFC 2283). The label is encoded in the Network Layer Reachability Information
(NLRI) field of the attribute, and the Subsequent Address Family Identifier (SAFI) field is used to indicate that the
NLRI includes a label (SAFI value of 4).
Figure 1. MPLS Label Exchange Through BGP Update
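As a rough illustration of the encoding described in RFC 3107, the following Python sketch builds the <length, label, prefix> triple for a single label. The function names are hypothetical, and the sketch covers only the NLRI bytes, not the rest of the BGP update message.

```python
import ipaddress

def encode_label(label, bottom_of_stack=True):
    """Encode a label in 3 octets: value in the high-order 20 bits,
    bottom-of-stack flag in the low-order bit (remaining bits left at zero)."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    value = (label << 4) | (1 if bottom_of_stack else 0)
    return value.to_bytes(3, "big")

def encode_labeled_nlri(prefix, label):
    """Build <length, label, prefix>; length counts the bits of label plus prefix."""
    net = ipaddress.ip_network(prefix)
    prefix_octets = (net.prefixlen + 7) // 8   # prefix is padded to an octet boundary
    length_bits = 24 + net.prefixlen
    return bytes([length_bits]) + encode_label(label) + net.network_address.packed[:prefix_octets]

nlri = encode_labeled_nlri("172.0.1.0/24", 16010)
print(nlri.hex())
```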
MPLS Label Allocation
Every MPLS-enabled switch needs to allocate an MPLS label so that it can be associated with IP prefixes. The
MPLS label can be any value between 16 and 471804. The switch can be configured with minimum and maximum
values to define the label range.
A(config)# mpls label range ?
<16-471804> Minimum label value…
Two approaches to label allocation are available:
● Allocate the MPLS label from the entire label range. In this case, label allocation relies on selection of an
available label from the configured MPLS label range on the switch. This label is then associated with the IP
prefix.
To control the prefix to which a label should be allocated, use the following command in the BGP
configuration. A route map can be used for more precise control.
...
router bgp <#>
…
address-family ipv4 unicast
…
allocate-label route-map <name>
…
Figure 2. MPLS Label Allocation Using Dynamic Range
As shown in Figure 2, label details for prefix 172.0.11.0/24 are exchanged between nodes A, B, and C.
Node C learns the 172.0.11.0/24 IP prefix from an upstream switch as a general IP prefix. On node C, using
the allocate-label command shown in the preceding code, you can allocate an MPLS label. Because node
C originates this label, it allocates the Implicit-NULL label and sends it to downstream neighbor node B via
BGP update. Node B performs a similar label allocation process and sends the label using BGP update to
node A. Because node B is the penultimate hop, it programs the Out label as POP, and the In label is
dynamically allocated. Node A programs the Out label using the label value received in the BGP update
from node B. Node A allocates the local MPLS label as well, which is the In label.
The following output shows dynamically allocated labels on one of the LSRs. Prefix 170.0.11.0/24 is
received on LSR A through BGP update from BGP neighbor B with label value 116, which is programmed
as the Out label. Also, LSR A allocates the MPLS label, which is programmed as the In label. The In label is
allocated from the available label space from the overall MPLS label range (configured using the MPLS
label range command at the command-line interface [CLI]). This label is locally significant.
A# show mpls switching 170.0.11.0/24
Legend:
(P)=Protected, (F)=FRR active, (*)=more labels in stack.
IPV4:
In-Label   Out-Label   FEC name        Out-Interface   Next-Hop
126        116         172.0.11.0/24   Eth1/1          10.0.20.0
...
A# show bgp ipv4 labeled-unicast 170.0.11.0/24
BGP routing table information for VRF default, address family IPv4 Label Unicast
BGP routing table entry for 170.0.11.0/24, version 145097
Paths: (1 available, best #1)
Flags: (0x20c001a) on xmit-list, is in urib, is best urib route, is in HW, has label
label af: version 147466, (0x100002) on xmit-list
local label: 126
Advertised path-id 1, Label AF advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop, in rib
AS-Path: 100 , path sourced external to AS
10.0.20.0 (metric 0) from 10.0.20.0 (1.1.1.203)
Origin incomplete, MED 0, localpref 100, weight 0
Received label 116
...
Although this approach to MPLS label allocation can do the job of associating an MPLS label with an IP
prefix, for a large data center network with hundreds of devices, managing and troubleshooting any MPLS
forwarding problem can be challenging. Also, with almost every node allocating a different label for the
same prefix, use of a controller or some similar mechanism for label management is almost impossible.
To address these challenges, another approach is available.
● The segment routing global block (SRGB) provides an alternative approach to MPLS label allocation.
NX-OS supports SRGB.
The SRGB is a subset of the overall MPLS label range defined on the switch. The default SRGB range is
16000 to 23999; however, this range can be changed through the global-block CLI command under
segment-routing mpls in configuration mode.
...
A#config terminal
A(config)# segment-routing mpls
A(config-segment-routing-mpls)# global-block 16000 23999
...
The SRGB should be configured on every node in the network. For simplicity and ease of operation, all
nodes should be configured with the same SRGB values.
Another parameter that plays a role in segment routing along with SRGB is label-index, which is also
carried as part of the BGP update message. The label-index parameter should be associated with the
prefix on the originating switch as part of the BGP configuration.
C#router bgp <#>
…
address-family ipv4 unicast
network 170.0.1.0/24 route-map ADD_2000
…
route-map ADD_2000 permit 10
set label-index 2000
…
Figure 3. MPLS Label Allocation Using SRGB
As shown in Figure 3 and highlighted in the following output, a label index of 2000 is received as part of the
BGP update message between BGP peers. Adding the label index to the starting value of the SRGB
calculates the local label. In this case, the calculation is 16000 + 2000 = 18000; hence, a local label of
18000 is allocated. Also, if the SRGB is the same across all nodes in the network, then the received label
and local label values will be the same. In this case, nodes A and B have the same label.
A# show bgp ipv4 labeled-unicast 170.0.1.0/24
BGP routing table information for VRF default, address family IPv4 Label Unicast
BGP routing table entry for 170.0.1.0/24, version 145087
Paths: (1 available, best #1)
Flags: (0x20c001a) on xmit-list, is in urib, is best urib route, is in HW, has label
label af: version 147456, (0x100002) on xmit-list
local label: 18000
Advertised path-id 1, Label AF advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop, in rib
AS-Path: 100 , path sourced external to AS
10.0.20.0 (metric 0) from 10.0.20.0 (1.1.1.203)
Origin IGP, MED not set, localpref 100, weight 0
Received label 18000
Prefix-SID Attribute: Length: 10
Label Index TLV: Length 7, Flags 0x0 Label Index 2000
A# show mpls switching 172.0.1.0/24
Legend:
(P)=Protected, (F)=FRR active, (*)=more labels in stack.
IPV4:
In-Label   Out-Label   FEC name        Out-Interface   Next-Hop
18000      18000       172.0.1.0/24    Eth1/1          10.0.20.0
With the label index being carried in the BGP update message and all nodes having the same SRGB configuration,
the prefix will have the same In label and Out label across the entire network. This approach makes provisioning
using any outside entity (such as a controller) much easier, and the configuration is simple to troubleshoot as well.
Thus, although the use of different SRGB configurations on different nodes in the network is supported, this
approach is not recommended because of the complexity in provisioning and troubleshooting.
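The label derivation described above reduces to simple arithmetic. The following Python sketch, with a hypothetical function name, adds the label index to the start of the SRGB and checks that the result stays inside the block. The second SRGB (20000 to 27999) is an assumed value used only to show what happens when nodes disagree.

```python
def srgb_label(srgb_start, srgb_end, label_index):
    """Local label = SRGB start + label index, provided it stays inside the SRGB."""
    label = srgb_start + label_index
    if not srgb_start <= label <= srgb_end:
        raise ValueError(
            f"label index {label_index} falls outside SRGB {srgb_start}-{srgb_end}"
        )
    return label

# Default SRGB from this document: 16000 to 23999
print(srgb_label(16000, 23999, 2000))   # 18000
print(srgb_label(16000, 23999, 10))     # 16010

# A node with a different (hypothetical) SRGB derives a different label for the
# same index, which is why a uniform SRGB simplifies operations.
print(srgb_label(20000, 27999, 2000))   # 22000
```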
Prefix and Node Segment Identifiers
A prefix segment identifier (SID) is an MPLS label attached to an IP prefix. Typically, the prefix SID is advertised by
top-of-the-rack (ToR) switches using network statements (as shown in the following configuration). The prefix SID
represents a subnet on which hosts are provisioned behind the ToR switch. With the same SRGB configuration
used on all nodes in the network, the MPLS label associated with this prefix will be the same as well.
…
route-map ADD_2000 permit 10
set label-index 2000
…
router bgp 100
…
address-family ipv4 unicast
network 172.0.1.0/24 route-map ADD_2000
…
A node SID is an MPLS label attached to an IP prefix associated with a given node in the network. For example, a
loopback interface configured on a node is advertised in BGP through a network statement as a prefix SID. In
terms of control-plane propagation and the forwarding plane, the node SID is the same as the prefix SID. With the
same SRGB configuration used on all nodes in the network, the MPLS label associated with this prefix will be
the same as well. The node SID is used primarily to identify a given node in the network with an MPLS label. The node
SID is used in several use cases, which are explored in detail later in this document.
MPLS-Enabled Data Center Fabric
Figure 4 shows an MPLS-enabled data center fabric with a three-layer data center network.
● Node E is a ToR switch in BGP AS 100, and it is advertising the 172.0.1.0/24 subnet as part of MPLS.
● Nodes B, C, and D are leaf switches in BGP AS 200.
● Node A is a spine switch in BGP AS 300.
Figure 4. MPLS-Enabled Data Center Fabric
All nodes establish BGP peering with the LUAF capability negotiated.
The following sample configuration between node E and node D establishes BGP neighbors and exchanges LUAF
capabilities. On node E, prefix 172.0.1.0/24 is advertised through the network statement, and a label index of 10 is
set through a route map.
!Node E
interface ethernet1/1
ip address 10.0.15.1/31
...
router bgp 100
...
address-family ipv4 unicast
network 172.0.1.0/24 route-map ADD_LABEL_INDEX
...
template peer AF-LABEL
address-family ipv4 labeled-unicast
...
neighbor 10.0.15.0
inherit peer AF-LABEL
remote-as 200
update-source ethernet1/1
...
route-map ADD_LABEL_INDEX permit 10
set label-index 10

!Node D
interface ethernet1/1
ip address 10.0.15.0/31
...
router bgp 200
...
template peer AF-LABEL
address-family ipv4 labeled-unicast
...
neighbor 10.0.15.1
inherit peer AF-LABEL
remote-as 100
update-source ethernet1/1
...
D# show bgp ipv4 labeled-unicast summary
BGP summary information for VRF default, address family IPv4 Label Unicast
…
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.15.1       4   100     544     502    32937    0    0 07:08:49 1
...
D# show bgp ipv4 labeled-unicast 172.0.1.0/24
BGP routing table information for VRF default, address family IPv4 Label Unicast
BGP routing table entry for 172.0.1.0/24, version 74148
Paths: (1 available, best #1)
Flags: (0x20c001a) on xmit-list, is in urib, is best urib route, is in HW, has label
label af: version 74170, (0x100002) on xmit-list
local label: 16010 < SRGB base value of 16000 + Label Index of 10 is making it 16010
Advertised path-id 1, Label AF advertised path-id 1
Path type: external, path is valid, received and used, is best path, no labeled nexthop, in rib
AS-Path: 100 , path sourced external to AS
10.0.15.0 (metric 0) from 10.0.15.0 (1.1.1.1)
Origin IGP, MED not set, localpref 100, weight 0
Received label 3 < This means that Node D is Penultimate Hop for prefix 172.0.1.0/24
Prefix-SID Attribute: Length: 10
Label Index TLV: Length 7, Flags 0x0 Label Index 10 < This is the label index we received
D# show mpls switching 172.0.1.0/24
Legend:
(P)=Protected, (F)=FRR active, (*)=more labels in stack.
IPV4:
In-Label   Out-Label   FEC name        Out-Interface   Next-Hop
16010      Pop Label   172.0.1.0/24    Eth1/1          10.0.15.0
Output from Node A
A# show bgp ipv4 labeled-unicast 172.0.1.0/24
BGP routing table information for VRF default, address family IPv4 Label Unicast
BGP routing table entry for 172.0.1.0/24, version 12477
Paths: (16 available, best #1)
Flags: (0x20c001a) on xmit-list, is in urib, is best urib route, is in HW, has label
label af: version 18842, (0x100002) on xmit-list
local label: 16010 < Again SRGB base value + label index giving us same label.
Advertised path-id 1, Label AF advertised path-id 1
Path type: external, path is valid, received and used, is best path, no labeled nexthop, in rib
AS-Path: 1 100 , path sourced external to AS
10.0.5.36 (metric 0) from 10.0.5.36 (1.1.1.201)
Origin IGP, MED not set, localpref 100, weight 0
Received label 16010 < This was the label allocated by D, which is sent to A.
Prefix-SID Attribute: Length: 10
Label Index TLV: Length 7, Flags 0x0 Label Index 10 < Same label index is carried in BGP update
A-sys08-eor3# show mpls switching 172.0.1.0/24
Legend:
(P)=Protected, (F)=FRR active, (*)=more labels in stack.
IPV4:
In-Label   Out-Label   FEC name        Out-Interface   Next-Hop
16010      16010       172.0.1.0/24    Eth1/5          10.0.5.36
16010      16010       172.0.1.0/24    Eth1/6          10.0.5.38
16010      16010       172.0.1.0/24    Eth1/7          10.0.5.40
A# show ip route 172.0.1.0/24
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>
172.0.1.0/24, ubest/mbest: 3/0 < This is 3-way ECMP via B, C, and D
*via 10.0.5.36, [20/0], 07:25:30, bgp-300, external, tag 200 (mpls)
*via 10.0.5.38, [20/0], 07:25:30, bgp-300, external, tag 200 (mpls)
*via 10.0.5.40, [20/0], 07:25:30, bgp-300, external, tag 200 (mpls)
A#
Traffic Steering Using Segment Routing
Segment routing architecture allows an application to steer a packet flow through any topology and service chain
by using source routing. Segment routing is fundamental to providing end-to-end policy, scalability, functions, and
simplicity.
A typical use case for segment routing is traffic steering across an explicit path as the BGP-LU control plane
establishes segment routing forwarding paths on specific nodes. Label stacks can be allocated at the hosts
through an external controller or other similar mechanism. Stacking labels at the host allows path splicing, as
explained in the following section.
The topology in Figure 5 shows a three-tier data center design with ToR, leaf, and spine switches. For every node,
there is also an associated node SID.
Figure 5. Traffic Steering Using Segment Routing
Switch   Node SID
E        16001
D        16002
C        16003
B        16004
A        16005
For data traffic from ToR switch E to spine switch A, you can see how to steer the traffic with an explicit path
embedded in the application through the label stack.
Next consider the case in which you want data traffic to follow a path from node E to node D to node A. As shown
in Figure 6, two labels are inserted into the MPLS header in the application, with the top label being the node SID
of switch D. When node E performs an MPLS forwarding operation, it will pop the top label (16002) because it is
the penultimate hop for node D. This packet will be sent to node D. When node D performs the MPLS forwarding
function, it sees that the MPLS label is for node A, for which it is the penultimate hop, so it will pop the MPLS label and
send the payload to node A.
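The forwarding walk-through above can be simulated in a few lines of Python. This is a sketch under stated assumptions: the adjacency map is inferred from the three-tier description (ToR E connects to leaves B, C, and D, and each leaf connects to spine A), and the function handles only the case in which each segment is one hop away, so every node acts as the penultimate hop and pops the top label.

```python
NODE_SID = {"E": 16001, "D": 16002, "C": 16003, "B": 16004, "A": 16005}
SID_TO_NODE = {sid: node for node, sid in NODE_SID.items()}

# Assumed adjacencies: ToR E connects to leaves B, C, D; each leaf connects to spine A.
NEIGHBORS = {
    "E": {"B", "C", "D"},
    "B": {"E", "A"},
    "C": {"E", "A"},
    "D": {"E", "A"},
    "A": {"B", "C", "D"},
}

def steer(source, label_stack):
    """Follow a stack of node SIDs, popping each label at the penultimate hop."""
    path, current, stack = [source], source, list(label_stack)
    while stack:
        target = SID_TO_NODE[stack[0]]
        if target not in NEIGHBORS[current]:
            raise ValueError(f"{current} is not adjacent to {target}")
        stack.pop(0)       # current node is the penultimate hop for target: POP
        current = target   # forward the packet to the segment endpoint
        path.append(current)
    return path

print(steer("E", [16002, 16005]))  # ['E', 'D', 'A']
print(steer("E", [16003, 16005]))  # ['E', 'C', 'A']
```

Changing only the top label at the host moves the flow from leaf D to leaf C without any change on the switches.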
Figure 6. Traffic Steering Using Segment Routing Data Path
Similarly, the following label stack sent from the application will steer traffic along a path from node E to node C to
node A (Figure 7).
Figure 7. Traffic Steering Using Segment Routing Data Path
Over-the-Top Service with Multihop BGP
As shown in the previous section, data center nodes have redundancy at every level to protect against link and device
failures. With redundancy in place at various locations in the network, any MPLS prefix, also known as a forwarding
equivalence class (FEC), will be reachable through more than one path and will be seen as ECMP. For each FEC,
a unique MPLS label needs to be pushed or swapped. With this approach, you need to maintain a separate ECMP
object for each FEC, and at some point you will reach the hardware resource limit for ECMP objects.
Previously, this document discussed the way that the control plane and data plane work with an MPLS prefix
(FEC).
Although it is good to propagate the same FEC information in all nodes, in some scenarios the transit nodes don’t
need to know about all the FECs. This requirement depends on hardware resource use, and operationally it can be
achieved in several ways.
To facilitate this approach, you can use an overlay with BGP peering and an MPLS underlay. The MPLS underlay
is built with hop-by-hop BGP-LU neighbors. For the overlay, you would use multihop BGP peering between two
endpoints (typically loopback interfaces of the nodes). These endpoints are learned through the MPLS underlay.
For example, in the topology in Figure 8, node A (spine switch) and node E (ToR switch) are advertising their
loopback interfaces in the MPLS underlay. Using this loopback address, multihop BGP peering is performed
between them. This multihop BGP peering negotiates IPv4 address family unicast capabilities.
Figure 8. Over-the-Top Service with Multihop BGP
With this multihop BGP session in place between nodes E and A, IP prefixes can be exchanged directly between
them as plain IPv4 prefixes with the BGP peering endpoint as the next hop. This peering endpoint is learned
through the MPLS underlay using hop-by-hop BGP neighbors with LUAF. Hence, when this route is recursively
programmed into the hardware, it is programmed as an MPLS route. Transit nodes B, C, and D do not learn these
prefixes. This approach typically is used when MPLS-enabled services to and from the outside need to be used. In
this case, only one MPLS label is used to service multiple prefixes.
!Node E
interface loopback1
ip address 1.1.1.204/32
…
router bgp 100
...
address-family ipv4 unicast
!Advertise loopback1 in MPLS with label-index
network 1.1.1.204/32 route-map ADD_1
...
template peer MULTI-HOP-BGP-V4
ebgp-multihop 3
address-family ipv4 unicast
!Below route policy is needed to avoid underlay
!prefixes going over overlay
route-map AVOID-LOOP out
…
neighbor 1.1.1.200
inherit peer MULTI-HOP-BGP-V4
remote-as 300
update-source loopback1
...
route-map AVOID-LOOP deny 10
match ip address prefix-list MATCH_MPLS_PFX
route-map AVOID-LOOP permit 20
...
!Avoid advertising loopback used for peering over the peering session itself
ip prefix-list MATCH_MPLS_PFX seq 5 permit 1.1.1.200/32
ip prefix-list MATCH_MPLS_PFX seq 10 permit 1.1.1.204/32
!Avoid MPLS underlay routes going over overlay
ip prefix-list MATCH_MPLS_PFX seq 15 permit 172.0.1.0/24
...

!Node A
interface loopback1
ip address 1.1.1.200/32
…
router bgp 300
...
address-family ipv4 unicast
!Advertise loopback1 in MPLS with label-index
network 1.1.1.200/32 route-map ADD_2
...
template peer MULTI-HOP-BGP-V4
ebgp-multihop 3
address-family ipv4 unicast
!Below route policy is needed to avoid underlay
!prefixes going over overlay
route-map AVOID-LOOP out
…
neighbor 1.1.1.204
inherit peer MULTI-HOP-BGP-V4
remote-as 100
update-source loopback1
...
route-map AVOID-LOOP deny 10
match ip address prefix-list MATCH_MPLS_PFX
route-map AVOID-LOOP permit 20
...
!Avoid advertising loopback used for peering over the peering session itself
ip prefix-list MATCH_MPLS_PFX seq 5 permit 1.1.1.200/32
ip prefix-list MATCH_MPLS_PFX seq 10 permit 1.1.1.204/32
!Avoid MPLS underlay routes going over overlay
ip prefix-list MATCH_MPLS_PFX seq 15 permit 172.0.1.0/24
...
The following example shows a route learned on node A from the multihop BGP neighbor of node E.
A# show ip bgp summary
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 1.1.1.200, local AS number 300
BGP table version is 58787, IPv4 Unicast config peers 3, capable peers 1
10150 network entries and 12065 paths using 2308060 bytes of memory
BGP attribute entries [15/2340], BGP AS path entries [1/10]
BGP community entries [0/0], BGP clusterlist entries [0/0]
12032 received paths for inbound soft reconfiguration
12032 identical, 0 modified, 0 filtered received paths using 0 bytes
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.204       4   100     433     388    58787    0    0 06:22:19 10000
A#
A# show ip bgp 71.0.0.0/24
BGP routing table information for VRF default, address family IPv4 Unicast
BGP routing table entry for 71.0.0.0/24, version 48788
Paths: (1 available, best #1)
Flags: (0x08001a) on xmit-list, is in urib, is best urib route, is in HW
label af: version 66266, (0x100002) on xmit-list
Multipath: eBGP iBGP
Advertised path-id 1, Label AF advertised path-id 1
Path type: external, path is valid, received and used, is best path, no labeled nexthop, in rib
AS-Path: 100 64101 , path sourced external to AS
1.1.1.204 (metric 0) from 1.1.1.204 (1.1.1.1)
Origin IGP, MED not set, localpref 100, weight 0
Path-id 1 not advertised to any peer
Label AF advertisement
Path-id 1 not advertised to any peer
A#
A# show ip route 71.0.0.0
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>
71.0.0.0/24, ubest/mbest: 1/0
*via 1.1.1.204, [20/0], 06:07:19, bgp-300, external, tag 100
A-sys08-eor3# show ip route 1.1.1.204
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>
1.1.1.204/32, ubest/mbest: 16/0
*via 10.0.5.36, [20/0], 06:23:28, bgp-300, external, tag 200 (mpls)
*via 10.0.5.38, [20/0], 06:23:28, bgp-300, external, tag 200 (mpls)
...
A#
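The recursive resolution shown in the preceding output can be sketched in Python. The tables below are simplified stand-ins for the routing table: 71.0.0.0/24 and the next hops come from the output above, whereas the second overlay prefix (71.0.1.0/24) and the underlay label value (16001) are hypothetical values chosen for illustration.

```python
import ipaddress

# Overlay routes learned over the multihop BGP session (plain IPv4, next hop = peer loopback).
# 71.0.1.0/24 is a hypothetical second prefix.
OVERLAY = {"71.0.0.0/24": "1.1.1.204", "71.0.1.0/24": "1.1.1.204"}

# Underlay labeled route to the loopback; the label value 16001 is hypothetical.
UNDERLAY = {"1.1.1.204/32": {"label": 16001, "next_hops": ["10.0.5.36", "10.0.5.38"]}}

def longest_match(table, address):
    """Return the most specific prefix in the table that contains the address."""
    addr = ipaddress.ip_address(address)
    matches = [ipaddress.ip_network(p) for p in table if addr in ipaddress.ip_network(p)]
    return str(max(matches, key=lambda n: n.prefixlen)) if matches else None

def resolve(destination):
    """Resolve an overlay destination to the underlay label and next hops."""
    overlay_prefix = longest_match(OVERLAY, destination)
    if overlay_prefix is None:
        return None
    bgp_next_hop = OVERLAY[overlay_prefix]
    underlay_prefix = longest_match(UNDERLAY, bgp_next_hop)
    if underlay_prefix is None:
        return None
    return UNDERLAY[underlay_prefix]

# Both overlay prefixes resolve through the same loopback, so they share one label.
labels = {resolve(d)["label"] for d in ("71.0.0.5", "71.0.1.9")}
print(labels)
```

The transit nodes need an entry only for the loopback FEC, which is the resource saving that this design is after.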
Layer 3 EVPN
Beginning with Cisco NX-OS Release 7.0(3)I6(1), you can configure EVPN over segment routing MPLS.
EVPN is a control plane that has been used in virtualized data centers. It is becoming a common control plane for
Layer 2 and Layer 3 services in the data center because it supports multiple data-plane encapsulations and uses the
same traditional building blocks: route target (RT), route distinguisher (RD), and VRF instances.
The EVPN family introduces next-generation solutions for Ethernet services: BGP takes the role of the control plane
for Ethernet segment and MAC address distribution and learning over MPLS and VXLAN data planes.
EVPN over segment routing MPLS offers the following main benefits:
● Multitenant, scalable, high-performance data center
● Common operational model across the data center and WAN
● Seamless transport with segment routing and an efficient control plane with EVPN
BGP EVPN route types:
Type 1: Ethernet autodiscovery (EAD) route
Type 2: MAC and MAC-IP route advertisements
Type 3: Inclusive multicast route
Type 4: Ethernet segment route
Type 5: IP prefix route
EVPN over segment routing MPLS has two parts: Layer 2 and Layer 3. Cisco NX-OS Release 7.0(3)I6(1) supports
Layer 3 only, which means that the Type-5 (IP prefix) route is used.
The IP prefix (Type-5) routes are:
● Type-5 route with VXLAN encapsulation
RT-5 Route – IP Prefix
RD: L3 RD
IP Length: prefix length
IP address: IP
Label1: L3VNI
Route Target: RT for IP-VRF
Tunnel Type: VxLAN
Router MAC
● Type-5 route with MPLS encapsulation
RT-5 Route – IP Prefix
RD: L3 RD
IP Length: prefix length
IP address: IP
Label1: BGP MPLS Label
Route Target: RT for IP-VRF
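To make the difference between the two encapsulations concrete, the field lists above can be modeled as a small Python data class. This is only a sketch of the listed fields, not a wire encoding; the class name, the field names, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Type5Route:
    """Fields of an EVPN Type-5 (IP prefix) route as listed above; not a wire encoding."""
    rd: str
    ip_prefix: str
    prefix_length: int
    label1: int                        # L3VNI (VXLAN) or BGP MPLS label (MPLS)
    route_target: str
    encapsulation: str                 # "vxlan" or "mpls"
    router_mac: Optional[str] = None   # listed only for VXLAN encapsulation

    def __post_init__(self):
        if self.encapsulation == "vxlan":
            if self.router_mac is None:
                raise ValueError("VXLAN Type-5 route carries a router MAC")
            max_label = 2**24 - 1      # a VNI is 24 bits wide
        elif self.encapsulation == "mpls":
            max_label = 2**20 - 1      # an MPLS label is 20 bits wide
        else:
            raise ValueError("encapsulation must be 'vxlan' or 'mpls'")
        if not 0 <= self.label1 <= max_label:
            raise ValueError("label1 out of range for this encapsulation")

# All values below are hypothetical.
mpls_route = Type5Route("1.1.1.204:1", "71.0.0.0", 24, 492287, "100:1", "mpls")
print(mpls_route)
```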
Layer 3 EVPN over Segment Routing MPLS Configuration
Table 1 outlines a sample configuration for enabling Layer 3 EVPN over segment routing.
Command or Action               Purpose
feature bgp                     Enables BGP feature and BGP configurations
install feature-set mpls        Enables MPLS configuration commands
feature-set mpls                Enables MPLS configuration commands
feature mpls segment-routing    Enables Segment Routing configuration commands
feature mpls evpn               Enables EVPN over MPLS configuration commands
Sample Configuration
IBGP Network with route reflector
In the above topology, a BGP segment routing session over the physical interfaces forms the segment routing
underlay, and a BGP EVPN session runs over the loopback interfaces of the nodes. Route reflectors are deployed for
scaling purposes; optionally, eBGP can be used for overlay peering.
Layer 3 EVPN over SR Functionality Verification
Orchestration
The segment routing configuration can be pushed to the switch using an orchestration tool. Orchestration can be
performed using the Cisco NX-API and the representational state transfer (REST) API. The following configuration
example shows how to enable segment routing using the NX-API.
The first step is to enable the NX-API feature on the switch. If they are not already configured, you also need to
configure the management IP address and a username and password.
A#config t
A(config)#feature nx-api
A(config)#interface mgmt 0
A(config-if)#ip address 172.31.203.123/24
A(config-if)#no shut
A(config-if)#exit
A(config)#username administrator password cisco123
A(config)#end
A#
Next, you push the configuration to the switch.
You can do this from the web interface (Figure 9). Type the commands exactly as you would when configuring the
switch through the CLI, and then click POST to push the configuration to the switch. The interface also generates
pseudocode, which you can copy to the clipboard and use as part of a script.
Figure 9. Configuration via NX-API Using Web Interface
The following Python script was built using the pseudocode generated from the web interface.
import json

import requests

# NX-API JSON-RPC endpoint on the switch management address
url = 'http://172.31.203.123/ins'
switchuser = 'administrator'
switchpassword = 'cisco123'
myheaders = {'content-type': 'application/json-rpc'}

# One JSON-RPC request per CLI command; the commands are sent as a single batch
payload = [
    {
        "jsonrpc": "2.0",
        "method": "cli",
        "params": {
            "cmd": "config t",
            "version": 1
        },
        "id": 1
    },
    {
        "jsonrpc": "2.0",
        "method": "cli",
        "params": {
            "cmd": "segment-routing mpls",
            "version": 1
        },
        "id": 2
    }
]

# POST the batch with basic authentication and decode the JSON reply
response = requests.post(url, data=json.dumps(payload),
                         headers=myheaders,
                         auth=(switchuser, switchpassword)).json()
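As the number of commands grows, writing each JSON-RPC entry by hand becomes error-prone. The following sketch builds the same payload structure from a list of CLI commands. The helper name build_payload is hypothetical and is not part of the NX-API or of the generated pseudocode; the sketch only constructs and prints the payload and sends nothing to a switch.

```python
import json


def build_payload(commands):
    """Return a JSON-RPC 2.0 batch with one "cli" request per command.

    Hypothetical helper: it reproduces the structure of the script above,
    numbering the requests from 1 in the order the commands are given.
    """
    return [
        {
            "jsonrpc": "2.0",
            "method": "cli",
            "params": {"cmd": cmd, "version": 1},
            "id": index,
        }
        for index, cmd in enumerate(commands, start=1)
    ]


# The same two commands as in the script above
payload = build_payload(["config t", "segment-routing mpls"])
print(json.dumps(payload, indent=2))
```

The resulting list could be passed to requests.post in place of the hand-written payload. The same helper could presumably carry the feature commands from Table 1, placed after "config t" in the command list.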
Conclusion
Segment routing provides a flexible forwarding framework to support growing network infrastructure needs. It uses
simple extensions to standardized BGP for the control plane, thereby avoiding the complexity and the performance
and scale limitations of Label Distribution Protocol (LDP) and Resource Reservation Protocol (RSVP).
Segment routing can be added on top of an existing MPLS forwarding infrastructure, and traffic engineering can
be achieved without maintaining additional state in data center switches.
Segment routing addresses WAN, enterprise, and data center needs all at the same time. It thus provides the
opportunity to deliver end-to-end traffic engineering through a single operational model, and it allows
application-based data path enforcement, making it an excellent choice for software-defined networking (SDN).
For More Information
For additional information, see the following resources:
● See the blog about segment routing for the data center at https://blogs.cisco.com/datacenter/application-level-intelligence-in-the-data-center-using-segment-routing.
● For more information about the Cisco Nexus 9000 Series Switches, see the detailed product information at
the product homepage at https://www.cisco.com/c/en/us/products/switches/nexus-9000-series-switches/index.html.
● For more information about the Cisco Nexus 3000 Series Switches, see the detailed product information at
the product homepage at https://www.cisco.com/c/en/us/products/switches/nexus-3000-series-switches/index.html.
● For more information about segment routing, visit https://www.segment-routing.net/.
Printed in USA
C11-737536-01
07/17