draft-ietf-bess-evpn-optimized-ir-04.txt | draft-ietf-bess-evpn-optimized-ir-05.txt | |||
---|---|---|---|---|
skipping to change at page 1, line 16 ¶ | skipping to change at page 1, line 16 ¶ | |||
W. Lin | W. Lin | |||
Juniper | Juniper | |||
M. Katiyar | M. Katiyar | |||
Versa Networks | Versa Networks | |||
A. Sajassi | A. Sajassi | |||
Cisco | Cisco | |||
Expires: April 3, 2019 September 30, 2018 | Expires: April 21, 2019 October 18, 2018 | |||
Optimized Ingress Replication solution for EVPN | Optimized Ingress Replication solution for EVPN | |||
draft-ietf-bess-evpn-optimized-ir-04 | draft-ietf-bess-evpn-optimized-ir-05 | |||
Abstract | Abstract | |||
Network Virtualization Overlay (NVO) networks using EVPN as control | Network Virtualization Overlay (NVO) networks using EVPN as control | |||
plane may use ingress replication (IR) or PIM-based trees to convey | plane may use Ingress Replication (IR) or PIM (Protocol Independent | |||
the overlay BUM traffic. PIM provides an efficient solution to avoid | Multicast) based trees to convey the overlay BUM traffic. PIM | |||
sending multiple copies of the same packet over the same physical | provides an efficient solution to avoid sending multiple copies of | |||
link, however it may not always be deployed in the NVO core network. | the same packet over the same physical link, however it may not | |||
IR avoids the dependency on PIM in the NVO network core. While IR | always be deployed in the NVO core network. IR avoids the dependency | |||
provides a simple multicast transport, some NVO networks with | on PIM in the NVO network core. While IR provides a simple multicast | |||
demanding multicast applications require a more efficient solution | transport, some NVO networks with demanding multicast applications | |||
without PIM in the core. This document describes a solution to | require a more efficient solution without PIM in the core. This | |||
optimize the efficiency of IR in NVO networks. | document describes a solution to optimize the efficiency of IR in NVO | |||
networks. | ||||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. | Drafts. | |||
skipping to change at page 2, line 13 ¶ | skipping to change at page 2, line 14 ¶ | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
This Internet-Draft will expire on April 3, 2019. | This Internet-Draft will expire on April 21, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Solution requirements . . . . . . . . . . . . . . . . . . . . . 4 | 2. Terminology and Conventions . . . . . . . . . . . . . . . . . . 4 | |||
3. EVPN BGP Attributes for optimized-IR . . . . . . . . . . . . . 5 | 3. Solution requirements . . . . . . . . . . . . . . . . . . . . . 5 | |||
4. Non-selective Assisted-Replication (AR) Solution Description . 8 | 4. EVPN BGP Attributes for optimized-IR . . . . . . . . . . . . . 6 | |||
4.1. Non-selective AR-REPLICATOR procedures . . . . . . . . . . 8 | 5. Non-selective Assisted-Replication (AR) Solution Description . 9 | |||
4.2. Non-selective AR-LEAF procedures . . . . . . . . . . . . . 9 | 5.1. Non-selective AR-REPLICATOR procedures . . . . . . . . . . 10 | |||
4.3. RNVE procedures . . . . . . . . . . . . . . . . . . . . . . 11 | 5.2. Non-selective AR-LEAF procedures . . . . . . . . . . . . . 11 | |||
4.4. Forwarding behavior in non-selective AR EVIs . . . . . . . 11 | 5.3. RNVE procedures . . . . . . . . . . . . . . . . . . . . . . 12 | |||
4.4.1. Broadcast and Multicast forwarding behavior . . . . . . 11 | 5.4. Forwarding behavior in non-selective AR EVIs . . . . . . . 13 | |||
4.4.1.1. Non-selective AR-REPLICATOR BM forwarding . . . . . 11 | 5.4.1. Broadcast and Multicast forwarding behavior . . . . . . 13 | |||
4.4.1.2. Non-selective AR-LEAF BM forwarding . . . . . . . . 12 | 5.4.1.1. Non-selective AR-REPLICATOR BM forwarding . . . . . 13 | |||
4.4.1.3. RNVE BM forwarding . . . . . . . . . . . . . . . . 12 | 5.4.1.2. Non-selective AR-LEAF BM forwarding . . . . . . . . 14 | |||
4.4.2. Unknown unicast forwarding behavior . . . . . . . . . . 13 | 5.4.1.3. RNVE BM forwarding . . . . . . . . . . . . . . . . 14 | |||
4.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast | 5.4.2. Unknown unicast forwarding behavior . . . . . . . . . . 14 | |||
forwarding . . . . . . . . . . . . . . . . . . . . 13 | 5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast | |||
4.4.2.2. RNVE Unknown unicast forwarding . . . . . . . . . . 13 | forwarding . . . . . . . . . . . . . . . . . . . . 15 | |||
5. Selective Assisted-Replication (AR) Solution Description . . . 13 | 5.4.2.2. RNVE Unknown unicast forwarding . . . . . . . . . . 15 | |||
5.1. Selective AR-REPLICATOR procedures . . . . . . . . . . . . 14 | ||||
5.2. Selective AR-LEAF procedures . . . . . . . . . . . . . . . 15 | 6. Selective Assisted-Replication (AR) Solution Description . . . 15 | |||
5.3. Forwarding behavior in selective AR EVIs . . . . . . . . . 16 | 6.1. Selective AR-REPLICATOR procedures . . . . . . . . . . . . 15 | |||
5.3.1. Selective AR-REPLICATOR BM forwarding . . . . . . . . . 16 | 6.2. Selective AR-LEAF procedures . . . . . . . . . . . . . . . 17 | |||
5.3.2. Selective AR-LEAF BM forwarding . . . . . . . . . . . . 17 | 6.3. Forwarding behavior in selective AR EVIs . . . . . . . . . 18 | |||
6. Pruned-Flood-Lists (PFL) . . . . . . . . . . . . . . . . . . . 18 | 6.3.1. Selective AR-REPLICATOR BM forwarding . . . . . . . . . 18 | |||
6.1. A PFL example . . . . . . . . . . . . . . . . . . . . . . . 18 | 6.3.2. Selective AR-LEAF BM forwarding . . . . . . . . . . . . 19 | |||
7. AR Procedures for single-IP AR-REPLICATORS . . . . . . . . . . 19 | 7. Pruned-Flood-Lists (PFL) . . . . . . . . . . . . . . . . . . . 20 | |||
8. AR Procedures and EVPN All-Active Multi-homing Split-Horizon . 20 | 7.1. A PFL example . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
8.1. Ethernet Segments on AR-LEAF nodes . . . . . . . . . . . . 20 | 8. AR Procedures for single-IP AR-REPLICATORS . . . . . . . . . . 21 | |||
8.2. Ethernet Segments on AR-REPLICATOR nodes . . . . . . . . . 21 | 9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon . 22 | |||
9. Benefits of the optimized-IR solution . . . . . . . . . . . . . 21 | 9.1. Ethernet Segments on AR-LEAF nodes . . . . . . . . . . . . 22 | |||
10. Conventions used in this document . . . . . . . . . . . . . . 21 | 9.2. Ethernet Segments on AR-REPLICATOR nodes . . . . . . . . . 23 | |||
11. Security Considerations . . . . . . . . . . . . . . . . . . . 22 | 10. Benefits of the optimized-IR solution . . . . . . . . . . . . 23 | |||
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | |||
13. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 22 | 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 | |||
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
14.1 Normative References . . . . . . . . . . . . . . . . . . . 23 | 13.1 Normative References . . . . . . . . . . . . . . . . . . . 24 | |||
14.2 Informative References . . . . . . . . . . . . . . . . . . 24 | 13.2 Informative References . . . . . . . . . . . . . . . . . . 25 | |||
15.0 Contributors . . . . . . . . . . . . . . . . . . . . . . . 24 | 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
16. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 24 | 15. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
17. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 24 | 16. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 25 | |||
1. Introduction | 1. Introduction | |||
EVPN may be used as the control plane for a Network Virtualization | Ethernet Virtual Private Networks (EVPN) may be used as the control | |||
Overlay (NVO) network. Network Virtualization Edge (NVE) devices and | plane for a Network Virtualization Overlay (NVO) network. Network | |||
PEs that are part of the same EVI use Ingress Replication (IR) or | Virtualization Edge (NVE) devices and Provider Edges (PEs) that are | |||
part of the same EVPN Instance (EVI) use Ingress Replication (IR) or | ||||
PIM-based trees to transport the tenant's BUM traffic. In NVO | PIM-based trees to transport the tenant's BUM traffic. In NVO | |||
networks where PIM-based trees cannot be used, IR is the only | networks where PIM-based trees cannot be used, IR is the only option. | |||
alternative. Examples of these situations are NVO networks where the | Examples of these situations are NVO networks where the core nodes | |||
core nodes don't support PIM or the network operator does not want to | don't support PIM or the network operator does not want to run PIM in | |||
run PIM in the core. | the core. | |||
In some use-cases, the amount of replication for BUM (Broadcast, | In some use-cases, the amount of replication for BUM (Broadcast, | |||
Unknown unicast and Multicast traffic) is kept under control on the | Unknown unicast and Multicast traffic) is kept under control on the | |||
NVEs due to the following fairly common assumptions: | NVEs due to the following fairly common assumptions: | |||
a) Broadcast is greatly reduced due to the proxy-ARP and proxy-ND | a) Broadcast is greatly reduced due to the proxy ARP (Address | |||
Resolution Protocol) and proxy ND (Neighbor Discovery) | ||||
capabilities supported by EVPN on the NVEs. Some NVEs can even | capabilities supported by EVPN on the NVEs. Some NVEs can even | |||
provide DHCP-server functions for the attached Tenant Systems (TS) | provide Dynamic Host Configuration Protocol (DHCP) server functions | |||
reducing the broadcast even further. | for the attached Tenant Systems (TS) reducing the broadcast even | |||
further. | ||||
b) Unknown unicast traffic is greatly reduced in virtualized NVO | b) Unknown unicast traffic is greatly reduced in virtualized NVO | |||
networks where all the MAC and IP addresses are learnt in the | networks where all the MAC and IP addresses are learnt in the | |||
control plane. | control plane. | |||
c) Multicast applications are not used. | c) Multicast applications are not used. | |||
If the above assumptions are true for a given NVO network, then IR | If the above assumptions are true for a given NVO network, then IR | |||
provides a simple solution for multi-destination traffic. However, | provides a simple solution for multi-destination traffic. However, | |||
the statement c) above is not always true and multicast applications | the statement c) above is not always true and multicast applications | |||
are required in many use-cases. | are required in many use-cases. | |||
When the multicast sources are attached to NVEs residing in | When the multicast sources are attached to NVEs residing in | |||
hypervisors or low-performance-replication TORs, the ingress | hypervisors or low-performance-replication TORs (Top Of the Rack | |||
replication of a large amount of multicast traffic to a significant | switches), the ingress replication of a large amount of multicast | |||
number of remote NVEs/PEs can seriously degrade the performance of | traffic to a significant number of remote NVEs/PEs can seriously | |||
the NVE and impact the application. | degrade the performance of the NVE and impact the application. | |||
This document describes a solution that makes use of two IR | This document describes a solution that makes use of two IR | |||
optimizations: | optimizations: | |||
i) Assisted-Replication (AR) | i) Assisted-Replication (AR) | |||
ii) Pruned-Flood-Lists (PFL) | ii) Pruned-Flood-Lists (PFL) | |||
Both optimizations may be used together or independently so that the | Both optimizations may be used together or independently so that the | |||
performance and efficiency of the network to transport multicast can | performance and efficiency of the network to transport multicast can | |||
be improved. Both solutions require some extensions to [RFC7432] that | be improved. Both solutions require some extensions to [RFC7432] that | |||
are described in section 3. | are described in section 4. | |||
Section 2 lists the requirements of the combined optimized-IR | Section 3 lists the requirements of the combined optimized-IR | |||
solution, whereas sections 4 and 5 describe the Assisted-Replication | solution, whereas sections 5 and 6 describe the Assisted-Replication | |||
(AR) solution, and section 6 the Pruned-Flood-Lists (PFL) solution. | (AR) solution, and section 7 the Pruned-Flood-Lists (PFL) solution. | |||
2. Solution requirements | 2. Terminology and Conventions | |||
The IR optimization solution (optimized-IR hereafter) MUST meet the | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
following requirements: | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
a) The solution MUST provide an IR optimization for BM (Broadcast and | The following terminology is used throughout the document: | |||
AC: Attachment Circuit | ||||
Regular-IR: Refers to Regular Ingress Replication, where the source | ||||
NVE/PE sends a copy to each remote NVE/PE part of the EVI. | ||||
AR-IP: IP address owned by the AR-REPLICATOR and used to | ||||
differentiate the ingress traffic that must follow the AR | ||||
procedures. | ||||
IR-IP: IP address used for Ingress Replication as in [RFC7432]. | ||||
AR-VNI: VNI advertised by the AR-REPLICATOR along with the | ||||
Replicator-AR route. It is used to identify the ingress | ||||
packets that must follow AR procedures ONLY in the Single-IP | ||||
AR-REPLICATOR case. | ||||
IR-VNI: VNI advertised along with the RT-3 for IR. | ||||
AR forwarding mode: for an AR-LEAF, it means sending an AC BM packet | ||||
to a single AR-REPLICATOR with tunnel destination IP AR-IP. | ||||
For an AR-REPLICATOR, it means sending a BM packet to a | ||||
selective number or all the overlay tunnels when the packet | ||||
was previously received from an overlay tunnel. | ||||
IR forwarding mode: it refers to the Ingress Replication behavior | ||||
explained in [RFC7432]. It means sending an AC BM packet copy | ||||
to each remote PE/NVE in the EVI and sending an overlay BM | ||||
packet only to the ACs and not other overlay tunnels. | ||||
PTA: PMSI Tunnel Attribute | ||||
RT-3: EVPN Route Type 3, Inclusive Multicast Ethernet Tag route | ||||
RT-11: EVPN Route Type 11, Leaf Auto-Discovery (AD) route | ||||
VXLAN: Virtual Extensible LAN | ||||
GRE: Generic Routing Encapsulation | ||||
NVGRE: Network Virtualization using Generic Routing Encapsulation | ||||
GENEVE: Generic Network Virtualization Encapsulation | ||||
NVO: Network Virtualization Overlay | ||||
NVE: Network Virtualization Edge | ||||
VNI: VXLAN Network Identifier | ||||
EVI: EVPN Instance. An EVPN instance spanning the Provider Edge (PE) | ||||
devices participating in that EVPN | ||||
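The difference between the "IR forwarding mode" and "AR forwarding mode" definitions above, as seen by the node that receives a BM packet on an AC, can be sketched as follows. This is an illustrative sketch only: the function name, the mode strings and the documentation addresses are assumptions made for the example and do not appear in the draft.

```python
def ingress_tunnel_destinations(mode, remote_ir_ips, ar_ip=None):
    """Return the tunnel destination IPs used for one BM packet received on an AC.

    mode          -- "IR" or "AR" (hypothetical labels for the two forwarding modes)
    remote_ir_ips -- IR-IPs of the remote NVE/PEs that are part of the EVI
    ar_ip         -- AR-IP of the single AR-REPLICATOR chosen by an AR-LEAF
    """
    if mode == "IR":
        # Regular-IR: one copy to each remote NVE/PE in the EVI.
        return list(remote_ir_ips)
    if mode == "AR":
        if ar_ip is None:
            raise ValueError("AR forwarding mode needs the AR-IP of a replicator")
        # AR-LEAF: a single copy, addressed to the AR-IP of one AR-REPLICATOR.
        return [ar_ip]
    raise ValueError("unknown forwarding mode: %r" % (mode,))


# Example EVI with four remote nodes (addresses are placeholders).
remotes = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
ir_copies = ingress_tunnel_destinations("IR", remotes)
ar_copies = ingress_tunnel_destinations("AR", remotes, ar_ip="192.0.2.100")
```

In this example the ingress node sends four copies in IR forwarding mode and one copy in AR forwarding mode; the remaining replication is then performed by the AR-REPLICATOR.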
3. Solution requirements | ||||
The IR optimization solution specified in this document (optimized-IR | ||||
hereafter) meets the following requirements: | ||||
a) The solution provides an IR optimization for BM (Broadcast and | ||||
Multicast) traffic, while preserving the packet order for unicast | Multicast) traffic, while preserving the packet order for unicast | |||
applications, i.e. known and unknown unicast traffic SHALL follow | applications, i.e., known and unknown unicast traffic should | |||
the same path. | follow the same path. | |||
b) The solution MUST be compatible with [RFC7432] and [RFC8365] and | b) The solution is compatible with [RFC7432] and [RFC8365] and has no | |||
have no impact on the EVPN procedures for BM traffic. In | impact on the EVPN procedures for BM traffic. In particular, the | |||
particular, the solution SHOULD support the following EVPN | solution supports the following EVPN functions: | |||
functions: | ||||
o All-active multi-homing, including the split-horizon and | o All-active multi-homing, including the split-horizon and | |||
Designated Forwarder (DF) functions. | Designated Forwarder (DF) functions. | |||
o Single-active multi-homing, including the DF function. | o Single-active multi-homing, including the DF function. | |||
o Handling of multi-destination traffic and processing of | o Handling of multi-destination traffic and processing of | |||
broadcast and multicast as per [RFC7432]. | broadcast and multicast as per [RFC7432]. | |||
c) The solution MUST be backwards compatible with existing NVEs using | c) The solution is backwards compatible with existing NVEs using a | |||
a non-optimized version of IR. A given EVI can have NVEs/PEs | non-optimized version of IR. A given EVI can have NVEs/PEs | |||
supporting regular-IR and optimized-IR. | supporting regular-IR and optimized-IR. | |||
d) The solution MUST be independent of the NVO specific data plane | d) The solution is independent of the NVO specific data plane | |||
encapsulation and the virtual identifiers being used, e.g.: VXLAN | encapsulation and the virtual identifiers being used, e.g.: VXLAN | |||
VNIs, NVGRE VSIDs or MPLS labels. | VNIs, NVGRE VSIDs or MPLS labels, as long as the tunnel is IP- | |||
based. | ||||
3. EVPN BGP Attributes for optimized-IR | 4. EVPN BGP Attributes for optimized-IR | |||
This solution proposes some changes to the [RFC7432] Inclusive | This solution extends the [RFC7432] Inclusive Multicast Ethernet Tag | |||
Multicast Ethernet Tag routes and attributes so that an NVE/PE can | routes and attributes so that an NVE/PE can signal its optimized-IR | |||
signal its optimized-IR capabilities. | capabilities. | |||
The Inclusive Multicast Ethernet Tag route (RT-3) and its PMSI Tunnel | The Inclusive Multicast Ethernet Tag route (RT-3) and its PMSI Tunnel | |||
Attribute's (PTA) general format used in [RFC7432] are shown below: | Attribute's (PTA) general format used in [RFC7432] are shown below: | |||
+---------------------------------+ | +---------------------------------+ | |||
| RD (8 octets) | | | RD (8 octets) | | |||
+---------------------------------+ | +---------------------------------+ | |||
| Ethernet Tag ID (4 octets) | | | Ethernet Tag ID (4 octets) | | |||
+---------------------------------+ | +---------------------------------+ | |||
| IP Address Length (1 octet) | | | IP Address Length (1 octet) | | |||
skipping to change at page 5, line 50 ¶ | skipping to change at page 7, line 30 ¶ | |||
+---------------------------------+ | +---------------------------------+ | |||
| MPLS Label (3 octets) | | | MPLS Label (3 octets) | | |||
+---------------------------------+ | +---------------------------------+ | |||
| Tunnel Identifier (variable) | | | Tunnel Identifier (variable) | | |||
+---------------------------------+ | +---------------------------------+ | |||
The Flags field is defined as follows: | The Flags field is defined as follows: | |||
0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | |||
+-+-+-+-+-+--+-+-+ | +-+-+-+-+-+--+-+-+ | |||
|rsved| T |BM|U|L| | |rsvd | T |BM|U|L| | |||
+-+-+-+-+-+--+-+-+ | +-+-+-+-+-+--+-+-+ | |||
Where a new type field (for AR) and two new flags (for PFL signaling) | Where a new type field (for AR) and two new flags (for PFL signaling) | |||
are defined: | are defined: | |||
- T is the AR Type field (2 bits) that defines the AR role of the | - T is the AR Type field (2 bits) that defines the AR role of the | |||
advertising router: | advertising router: | |||
+ 00 (decimal 0) = RNVE (non-AR support) | + 00 (decimal 0) = RNVE (non-AR support) | |||
skipping to change at page 8, line 5 ¶ | skipping to change at page 9, line 32 ¶ | |||
flags. Note that these BM/U flags may be used to optimize the | flags. Note that these BM/U flags may be used to optimize the | |||
delivery of multi-destination traffic and its use SHOULD be an | delivery of multi-destination traffic and its use SHOULD be an | |||
administrative choice, and independent of the AR role. | administrative choice, and independent of the AR role. | |||
Non-optimized-IR nodes will be unaware of the new PMSI attribute flag | Non-optimized-IR nodes will be unaware of the new PMSI attribute flag | |||
definition as well as the new Tunnel Type (AR), i.e. they will ignore | definition as well as the new Tunnel Type (AR), i.e. they will ignore | |||
the information contained in the flags field for any RT-3 and will | the information contained in the flags field for any RT-3 and will | |||
ignore the RT-3 routes with an unknown Tunnel Type (type AR in this | ignore the RT-3 routes with an unknown Tunnel Type (type AR in this | |||
case). | case). | |||
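As an illustration of the Flags field layout shown above, the following sketch splits one flags octet into its fields. It assumes the usual IETF diagram convention that bit 0 is the most significant bit; the class and function names are hypothetical. Only the T value 00 (RNVE) is named in the comments, and the sketch returns the raw 2-bit value without interpreting any other code point.

```python
from typing import NamedTuple


class PtaFlags(NamedTuple):
    reserved: int  # bits 0-2 (rsvd)
    ar_type: int   # bits 3-4 (T), raw 2-bit value; 0 means RNVE (non-AR support)
    bm: bool       # bit 5 (BM flag, PFL signaling)
    u: bool        # bit 6 (U flag, PFL signaling)
    l: bool        # bit 7 (L flag)


def parse_pta_flags(octet: int) -> PtaFlags:
    """Split the PTA Flags octet, assuming bit 0 in the diagram is the MSB."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("flags must fit in one octet")
    return PtaFlags(
        reserved=(octet >> 5) & 0b111,
        ar_type=(octet >> 3) & 0b11,
        bm=bool((octet >> 2) & 1),
        u=bool((octet >> 1) & 1),
        l=bool(octet & 1),
    )


flags = parse_pta_flags(0b00000110)  # T = 00, BM = 1, U = 1, L = 0
```

A node that does not support optimized-IR would simply not perform this decoding, which matches the statement above that such nodes ignore the information in the flags field.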
4. Non-selective Assisted-Replication (AR) Solution Description | 5. Non-selective Assisted-Replication (AR) Solution Description | |||
The following figure illustrates an example NVO network where the | The following figure illustrates an example NVO network where the | |||
non-selective AR function is enabled. Three different roles are | non-selective AR function is enabled. Three different roles are | |||
defined for a given EVI: AR-REPLICATOR, AR-LEAF and RNVE (Regular | defined for a given EVI: AR-REPLICATOR, AR-LEAF and RNVE (Regular | |||
NVE). The solution is called "non-selective" because the chosen AR- | NVE). The solution is called "non-selective" because the chosen AR- | |||
REPLICATOR for a given flow MUST replicate the multicast traffic to | REPLICATOR for a given flow MUST replicate the multicast traffic to | |||
'all' the NVE/PEs in the EVI except for the source NVE/PE. | 'all' the NVE/PEs in the EVI except for the source NVE/PE. | |||
( ) | ( ) | |||
(_ WAN _) | (_ WAN _) | |||
skipping to change at page 8, line 41 ¶ | skipping to change at page 10, line 32 ¶ | |||
Hypervisor| TOR | NVE2 |Hypervisor | Hypervisor| TOR | NVE2 |Hypervisor | |||
+---------+-+ +-----+-----+ +-+---------+ | +---------+-+ +-----+-----+ +-+---------+ | |||
| (EVI-1) | | (EVI-1) | | (EVI-1) | | | (EVI-1) | | (EVI-1) | | (EVI-1) | | |||
| LEAF | | RNVE | | LEAF | | | LEAF | | RNVE | | LEAF | | |||
+--+-----+--+ +--+-----+--+ +--+-----+--+ | +--+-----+--+ +--+-----+--+ +--+-----+--+ | |||
| | | | | | | | | | | | | | |||
VM11 VM12 TS3 TS4 VM31 VM32 | VM11 VM12 TS3 TS4 VM31 VM32 | |||
Figure 1 Optimized-IR scenario | Figure 1 Optimized-IR scenario | |||
4.1. Non-selective AR-REPLICATOR procedures | 5.1. Non-selective AR-REPLICATOR procedures | |||
An AR-REPLICATOR is defined as an NVE/PE capable of replicating | An AR-REPLICATOR is defined as an NVE/PE capable of replicating | |||
ingress BM (Broadcast and Multicast) traffic received on an overlay | ingress BM (Broadcast and Multicast) traffic received on an overlay | |||
tunnel to other overlay tunnels and local Attachment Circuits (ACs). | tunnel to other overlay tunnels and local Attachment Circuits (ACs). | |||
The AR-REPLICATOR signals its role in the control plane and | The AR-REPLICATOR signals its role in the control plane and | |||
understands where the other roles (AR-LEAF nodes, RNVEs and other AR- | understands where the other roles (AR-LEAF nodes, RNVEs and other AR- | |||
REPLICATORs) are located. A given AR-enabled EVI service may have | REPLICATORs) are located. A given AR-enabled EVI service may have | |||
zero, one or more AR-REPLICATORs. In our example in figure 1, PE1 and | zero, one or more AR-REPLICATORs. In our example in figure 1, PE1 and | |||
PE2 are defined as AR-REPLICATORs. The following considerations apply | PE2 are defined as AR-REPLICATORs. The following considerations apply | |||
to the AR-REPLICATOR role: | to the AR-REPLICATOR role: | |||
skipping to change at page 9, line 36 ¶ | skipping to change at page 11, line 27 ¶ | |||
o If the destination IP is the AR-REPLICATOR AR-IP Address the | o If the destination IP is the AR-REPLICATOR AR-IP Address the | |||
node MUST replicate the packet to local ACs and overlay | node MUST replicate the packet to local ACs and overlay | |||
tunnels (excluding the overlay tunnel to the source of the | tunnels (excluding the overlay tunnel to the source of the | |||
packet). When replicating to remote AR-REPLICATORs the tunnel | packet). When replicating to remote AR-REPLICATORs the tunnel | |||
destination IP will be an IR-IP. That will be an indication | destination IP will be an IR-IP. That will be an indication | |||
for the remote AR-REPLICATOR that it MUST NOT replicate to | for the remote AR-REPLICATOR that it MUST NOT replicate to | |||
overlay tunnels. The tunnel source IP used by the AR- | overlay tunnels. The tunnel source IP used by the AR- | |||
REPLICATOR MUST be its IR-IP. | REPLICATOR MUST be its IR-IP. | |||
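The destination-IP check described above can be sketched as follows. This is a simplified model, not an implementation of the draft: the class and method names are hypothetical, and the sketch assumes that the tunnel source IP of a received packet matches the IR-IP under which the AR-REPLICATOR knows the sending node. The IR-IP branch follows the "IR forwarding mode" definition (deliver an overlay BM packet to the ACs only).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ArReplicator:
    ar_ip: str                 # AR-IP owned by this AR-REPLICATOR
    ir_ip: str                 # IR-IP of this AR-REPLICATOR
    local_acs: List[str]       # local Attachment Circuits in the EVI
    remote_ir_ips: List[str]   # IR-IPs of the other NVE/PEs in the EVI

    def handle_overlay_bm(self, src_ip: str, dst_ip: str) -> Tuple[List[str], List[Tuple[str, str]]]:
        """Return (ACs to flood, [(tunnel source IP, tunnel destination IP), ...])."""
        if dst_ip == self.ar_ip:
            # AR procedures: local ACs plus every overlay tunnel except the
            # one back to the source. Copies are sourced from our IR-IP and
            # sent to the remote IR-IP, so a remote AR-REPLICATOR will not
            # replicate them to overlay tunnels again.
            tunnels = [(self.ir_ip, r) for r in self.remote_ir_ips if r != src_ip]
            return list(self.local_acs), tunnels
        if dst_ip == self.ir_ip:
            # Regular IR behavior: local ACs only, no overlay replication.
            return list(self.local_acs), []
        raise ValueError("packet not addressed to this node")


pe1 = ArReplicator(
    ar_ip="192.0.2.100",
    ir_ip="192.0.2.10",
    local_acs=["ac1"],
    remote_ir_ips=["192.0.2.1", "192.0.2.2", "192.0.2.20"],
)
acs_ar, tunnels_ar = pe1.handle_overlay_bm(src_ip="192.0.2.1", dst_ip="192.0.2.100")
acs_ir, tunnels_ir = pe1.handle_overlay_bm(src_ip="192.0.2.20", dst_ip="192.0.2.10")
```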
4.2. Non-selective AR-LEAF procedures | 5.2. Non-selective AR-LEAF procedures | |||
AR-LEAF is defined as an NVE/PE that - given its poor replication | AR-LEAF is defined as an NVE/PE that - given its poor replication | |||
performance - sends all the BM traffic to an AR-REPLICATOR that can | performance - sends all the BM traffic to an AR-REPLICATOR that can | |||
replicate the traffic further on its behalf. It MAY signal its AR- | replicate the traffic further on its behalf. It MAY signal its AR- | |||
LEAF capability in the control plane and understands where the other | LEAF capability in the control plane and understands where the other | |||
roles are located (AR-REPLICATOR and RNVEs). A given service can have | roles are located (AR-REPLICATOR and RNVEs). A given service can have | |||
zero, one or more AR-LEAF nodes. Figure 1 shows NVE1 and NVE3 (both | zero, one or more AR-LEAF nodes. Figure 1 shows NVE1 and NVE3 (both | |||
residing in hypervisors) acting as AR-LEAF. The following | residing in hypervisors) acting as AR-LEAF. The following | |||
considerations apply to the AR-LEAF role: | considerations apply to the AR-LEAF role: | |||
skipping to change at page 11, line 8 ¶ | skipping to change at page 12, line 49 ¶ | |||
NOT replicate these control plane packets to other overlay | NOT replicate these control plane packets to other overlay | |||
tunnels since they will use the regular IR-IP Address. | tunnels since they will use the regular IR-IP Address. | |||
e) The use of an AR-REPLICATOR-activation-timer (in seconds) on the | e) The use of an AR-REPLICATOR-activation-timer (in seconds) on the | |||
AR-LEAF nodes is RECOMMENDED. Upon receiving a new Replicator-AR | AR-LEAF nodes is RECOMMENDED. Upon receiving a new Replicator-AR | |||
route where the AR-REPLICATOR is selected, the AR-LEAF will run a | route where the AR-REPLICATOR is selected, the AR-LEAF will run a | |||
timer before programming the new AR-REPLICATOR. This will give the | timer before programming the new AR-REPLICATOR. This will give the | |||
AR-REPLICATOR some time to program the AR-LEAF nodes before the | AR-REPLICATOR some time to program the AR-LEAF nodes before the | |||
AR-LEAF sends BM traffic. | AR-LEAF sends BM traffic. | |||
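The AR-REPLICATOR-activation-timer in item e) can be sketched as a small state holder driven by an explicit clock value. The class name, method names and the 3-second value are assumptions made for the example; how the AR-LEAF forwards BM traffic while the timer is running is outside the scope of this sketch.

```python
class ArLeafReplicatorSelection:
    """Delays programming of a newly selected AR-REPLICATOR on an AR-LEAF."""

    def __init__(self, activation_timer_s: float):
        self.activation_timer_s = activation_timer_s  # timer value in seconds
        self.programmed = None   # AR-IP currently programmed for BM traffic
        self._pending = None     # (AR-IP, time at which it may be programmed)

    def on_replicator_selected(self, ar_ip: str, now: float) -> None:
        # A new Replicator-AR route was received and its originator selected:
        # start the timer instead of programming the replicator immediately.
        self._pending = (ar_ip, now + self.activation_timer_s)

    def tick(self, now: float):
        # Program the pending AR-REPLICATOR once the timer has expired.
        if self._pending is not None and now >= self._pending[1]:
            self.programmed = self._pending[0]
            self._pending = None
        return self.programmed


leaf = ArLeafReplicatorSelection(activation_timer_s=3.0)
leaf.on_replicator_selected("192.0.2.100", now=10.0)
before_expiry = leaf.tick(now=12.9)
after_expiry = leaf.tick(now=13.0)
```

Passing the current time in as a parameter keeps the sketch deterministic; a real implementation would use whatever timer facility the platform provides.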
4.3. RNVE procedures | 5.3. RNVE procedures | |||
RNVE (Regular Network Virtualization Edge node) is defined as an | RNVE (Regular Network Virtualization Edge node) is defined as an | |||
NVE/PE without AR-REPLICATOR or AR-LEAF capabilities that does IR as | NVE/PE without AR-REPLICATOR or AR-LEAF capabilities that does IR as | |||
described in [RFC7432]. The RNVE does not signal any AR role and is | described in [RFC7432]. The RNVE does not signal any AR role and is | |||
unaware of the AR-REPLICATOR/LEAF roles in the EVI. The RNVE will | unaware of the AR-REPLICATOR/LEAF roles in the EVI. The RNVE will | |||
ignore the Flags in the Regular-IR routes and will ignore the | ignore the Flags in the Regular-IR routes and will ignore the | |||
Replicator-AR routes (due to an unknown tunnel type in the PTA) and | Replicator-AR routes (due to an unknown tunnel type in the PTA) and | |||
the Leaf-AD routes (due to the IP-address-specific route-target). | the Leaf-AD routes (due to the IP-address-specific route-target). | |||
This role provides EVPN with the backwards compatibility required in | This role provides EVPN with the backwards compatibility required in | |||
optimized-IR EVIs. Figure 1 shows NVE2 as RNVE. | optimized-IR EVIs. Figure 1 shows NVE2 as RNVE. | |||
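The backwards-compatibility behavior described above can be illustrated with a simple route filter. The dictionary layout, the tunnel type labels "IR" and "AR", and the route-target strings are all hypothetical and chosen only for the example; the sketch models which routes an RNVE would end up using to build its flooding list, not how a BGP implementation processes them.

```python
def rnve_uses_route(route: dict, imported_route_targets: set) -> bool:
    """Return True if an RNVE would use this route for its flooding list."""
    # Leaf-AD routes carry an IP-address-specific route-target that the
    # RNVE does not import, so they are never considered.
    if not set(route["route_targets"]) & imported_route_targets:
        return False
    # Only RT-3 (Inclusive Multicast Ethernet Tag) routes are relevant here.
    if route["route_type"] != 3:
        return False
    # Replicator-AR routes carry a tunnel type the RNVE does not know.
    # The Flags field is not inspected at all, so it is effectively ignored.
    return route["tunnel_type"] == "IR"


evi_rts = {"target:65000:1"}
regular_ir = {"route_type": 3, "tunnel_type": "IR", "route_targets": ["target:65000:1"]}
replicator_ar = {"route_type": 3, "tunnel_type": "AR", "route_targets": ["target:65000:1"]}
leaf_ad = {"route_type": 11, "tunnel_type": "AR", "route_targets": ["ip-target:192.0.2.100"]}

accepted = [rnve_uses_route(r, evi_rts) for r in (regular_ir, replicator_ar, leaf_ad)]
```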
4.4. Forwarding behavior in non-selective AR EVIs | 5.4. Forwarding behavior in non-selective AR EVIs | |||
In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may | In AR EVIs, BM (Broadcast and Multicast) traffic between two NVEs may | |||
follow a different path than unicast traffic. This solution proposes | follow a different path than unicast traffic. This solution | |||
the replication of BM through the AR-REPLICATOR node, whereas | recommends the replication of BM through the AR-REPLICATOR node, | |||
unknown/known unicast will be delivered directly from the source node | whereas unknown/known unicast will be delivered directly from the | |||
to the destination node without being replicated by any intermediate | source node to the destination node without being replicated by any | |||
node. Unknown unicast SHALL follow the same path as known unicast | intermediate node. Unknown unicast SHALL follow the same path as | |||
traffic in order to avoid packet reordering for unicast applications | known unicast traffic in order to avoid packet reordering for unicast | |||
and simplify the control and data plane procedures. Section 4.4.1. | applications and simplify the control and data plane procedures. | |||
describes the expected forwarding behavior for BM traffic in nodes | Section 4.4.1. describes the expected forwarding behavior for BM | |||
acting as AR-REPLICATOR, AR-LEAF and RNVE. Section 4.4.2. describes | traffic in nodes acting as AR-REPLICATOR, AR-LEAF and RNVE. Section | |||
the forwarding behavior for unknown unicast traffic. | 4.4.2. describes the forwarding behavior for unknown unicast traffic. | |||
Note that known unicast forwarding is not impacted by this solution. | Note that known unicast forwarding is not impacted by this solution. | |||
4.4.1. Broadcast and Multicast forwarding behavior | 5.4.1. Broadcast and Multicast forwarding behavior | |||
The expected behavior per role is described in this section. | The expected behavior per role is described in this section. | |||
4.4.1.1. Non-selective AR-REPLICATOR BM forwarding | 5.4.1.1. Non-selective AR-REPLICATOR BM forwarding | |||
The AR-REPLICATORs will build a flooding list composed of ACs and | The AR-REPLICATORs will build a flooding list composed of ACs and | |||
overlay tunnels to remote nodes in the EVI. Some of those overlay | overlay tunnels to remote nodes in the EVI. Some of those overlay | |||
tunnels MAY be flagged as non-BM receivers based on the BM flag | tunnels MAY be flagged as non-BM receivers based on the BM flag | |||
received from the remote nodes in the EVI. | received from the remote nodes in the EVI. | |||
o When an AR-REPLICATOR receives a BM packet on an AC, it will | o When an AR-REPLICATOR receives a BM packet on an AC, it will | |||
forward the BM packet to its flooding list (including local ACs and | forward the BM packet to its flooding list (including local ACs and | |||
remote NVE/PEs), skipping the non-BM overlay tunnels. | remote NVE/PEs), skipping the non-BM overlay tunnels. | |||
skipping to change at page 12, line 19 ¶ | skipping to change at page 14, line 11 ¶ | |||
forward the BM packet to its flooding list (ACs and overlay | forward the BM packet to its flooding list (ACs and overlay | |||
tunnels) excluding the non-BM overlay tunnels. The AR-REPLICATOR | tunnels) excluding the non-BM overlay tunnels. The AR-REPLICATOR | |||
will do source squelching to ensure the traffic is not sent back | will do source squelching to ensure the traffic is not sent back | |||
to the originating AR-LEAF. | to the originating AR-LEAF. | |||
- If the destination IP matches its IR-IP, the AR-REPLICATOR will | - If the destination IP matches its IR-IP, the AR-REPLICATOR will | |||
skip all the overlay tunnels from the flooding list, i.e. it | skip all the overlay tunnels from the flooding list, i.e. it | |||
will only replicate to local ACs. This is the regular IR | will only replicate to local ACs. This is the regular IR | |||
behavior described in [RFC7432]. | behavior described in [RFC7432]. | |||
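The AR-REPLICATOR rules above can be summarized in a short sketch. It is illustrative only and not part of either draft version. The names `Tunnel`, `Replicator`, `flood_from_ac` and `flood_from_tunnel` are hypothetical. Because part of the list is elided in this comparison, the sketch assumes that a packet arriving on an overlay tunnel is classified by its tunnel destination IP, with the AR-IP case triggering replication and the IR-IP case following regular IR. Filtering of the ingress AC is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tunnel:
    remote_ip: str        # IR-IP of the remote node (assumed key for squelching)
    non_bm: bool = False  # True if the remote node signaled BM=1 ("prune-me")


@dataclass
class Replicator:
    ir_ip: str
    ar_ip: str
    acs: list
    tunnels: list

    def flood_from_ac(self):
        # BM packet received on an AC: local ACs plus all BM-capable tunnels.
        return list(self.acs) + [t.remote_ip for t in self.tunnels if not t.non_bm]

    def flood_from_tunnel(self, dst_ip, src_ip):
        # BM packet received on an overlay tunnel.
        if dst_ip == self.ar_ip:
            # Assumed AR case: replicate, skip non-BM tunnels, and squelch
            # the tunnel leading back to the originating node.
            return list(self.acs) + [
                t.remote_ip
                for t in self.tunnels
                if not t.non_bm and t.remote_ip != src_ip
            ]
        if dst_ip == self.ir_ip:
            # Regular IR behavior: local ACs only.
            return list(self.acs)
        return []  # not addressed to this node (assumption)
```

Keeping the two destination addresses as separate branches mirrors the way the text uses the tunnel destination IP as the only signal for choosing between AR and IR handling.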
4.4.1.2. Non-selective AR-LEAF BM forwarding | 5.4.1.2. Non-selective AR-LEAF BM forwarding | |||
The AR-LEAF nodes will build two flood-lists: | The AR-LEAF nodes will build two flood-lists: | |||
1) Flood-list #1 - composed of ACs and an AR-REPLICATOR-set of | 1) Flood-list #1 - composed of ACs and an AR-REPLICATOR-set of | |||
overlay tunnels. The AR-REPLICATOR-set is defined as one or more | overlay tunnels. The AR-REPLICATOR-set is defined as one or more | |||
overlay tunnels to the AR-IP Addresses of the remote AR- | overlay tunnels to the AR-IP Addresses of the remote AR- | |||
REPLICATOR(s) in the EVI. The selection of more than one AR- | REPLICATOR(s) in the EVI. The selection of more than one AR- | |||
REPLICATOR is described in section 4.2. and it is a local AR- | REPLICATOR is described in section 4.2. and it is a local AR- | |||
LEAF decision. | LEAF decision. | |||
skipping to change at page 12, line 47 ¶ | skipping to change at page 14, line 39 ¶ | |||
to flood-list #2. | to flood-list #2. | |||
o If the AR-REPLICATOR-set is NOT empty, the AR-LEAF will send the | o If the AR-REPLICATOR-set is NOT empty, the AR-LEAF will send the | |||
packet to flood-list #1, where only one of the overlay tunnels of | packet to flood-list #1, where only one of the overlay tunnels of | |||
the AR-REPLICATOR-set is used. | the AR-REPLICATOR-set is used. | |||
When an AR-LEAF receives a BM packet on an overlay tunnel, it will | When an AR-LEAF receives a BM packet on an overlay tunnel, it will | |||
forward the BM packet to its local ACs and never to an overlay | forward the BM packet to its local ACs and never to an overlay | |||
tunnel. This is the regular IR behavior described in [RFC7432]. | tunnel. This is the regular IR behavior described in [RFC7432]. | |||
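A minimal sketch of the AR-LEAF choice between the two flood-lists follows. It is not taken from the draft. The function names and the `flow_hash` parameter are hypothetical, and the definition of flood-list #2 is elided in this comparison, so the sketch assumes it holds the ACs plus overlay tunnels to the remote IR-IP addresses. How a single AR-REPLICATOR is picked is a local AR-LEAF decision; the modulo selection below is only one possible policy.

```python
def leaf_flood_from_ac(acs, replicator_ar_ips, remote_ir_ips, flow_hash=0):
    """Return the destinations for a BM packet received on an AC."""
    if replicator_ar_ips:
        # Flood-list #1: local ACs plus exactly one tunnel of the
        # AR-REPLICATOR-set (the selection policy here is an assumption).
        chosen = replicator_ar_ips[flow_hash % len(replicator_ar_ips)]
        return list(acs) + [chosen]
    # Flood-list #2 (assumed contents): fall back to regular IR.
    return list(acs) + list(remote_ir_ips)


def leaf_flood_from_tunnel(acs):
    """BM packet received on an overlay tunnel: local ACs only."""
    return list(acs)
```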
4.4.1.3. RNVE BM forwarding | 5.4.1.3. RNVE BM forwarding | |||
The RNVE is completely unaware of the AR-REPLICATORs, AR-LEAF nodes | The RNVE is completely unaware of the AR-REPLICATORs, AR-LEAF nodes | |||
and BM/U flags (that information is ignored). Its forwarding behavior | and BM/U flags (that information is ignored). Its forwarding behavior | |||
is the regular IR behavior described in [RFC7432]. Any regular non-AR | is the regular IR behavior described in [RFC7432]. Any regular non-AR | |||
node is fully compatible with the RNVE role described in this | node is fully compatible with the RNVE role described in this | |||
document. | document. | |||
4.4.2. Unknown unicast forwarding behavior | 5.4.2. Unknown unicast forwarding behavior | |||
The expected behavior is described in this section. | The expected behavior is described in this section. | |||
4.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast forwarding | 5.4.2.1. Non-selective AR-REPLICATOR/LEAF Unknown unicast forwarding | |||
While the forwarding behavior in AR-REPLICATORs and AR-LEAF nodes is | While the forwarding behavior in AR-REPLICATORs and AR-LEAF nodes is | |||
different for BM traffic, as far as Unknown unicast traffic | different for BM traffic, as far as Unknown unicast traffic | |||
forwarding is concerned, AR-LEAF nodes behave exactly in the same way | forwarding is concerned, AR-LEAF nodes behave exactly in the same way | |||
as AR-REPLICATORs do. | as AR-REPLICATORs do. | |||
The AR-REPLICATOR/LEAF nodes will build a flood-list composed of ACs | The AR-REPLICATOR/LEAF nodes will build a flood-list composed of ACs | |||
and overlay tunnels to the IR-IP Addresses of the remote nodes in the | and overlay tunnels to the IR-IP Addresses of the remote nodes in the | |||
EVI. Some of those overlay tunnels MAY be flagged as non-U (Unknown | EVI. Some of those overlay tunnels MAY be flagged as non-U (Unknown | |||
unicast) receivers based on the U flag received from the remote nodes | unicast) receivers based on the U flag received from the remote nodes | |||
skipping to change at page 13, line 33 ¶ | skipping to change at page 15, line 27 ¶ | |||
o When an AR-REPLICATOR/LEAF receives an unknown packet on an AC, it | o When an AR-REPLICATOR/LEAF receives an unknown packet on an AC, it | |||
will forward the unknown packet to its flood-list, skipping the | will forward the unknown packet to its flood-list, skipping the | |||
non-U overlay tunnels. | non-U overlay tunnels. | |||
o When an AR-REPLICATOR/LEAF receives an unknown packet on an overlay | o When an AR-REPLICATOR/LEAF receives an unknown packet on an overlay | |||
tunnel, it will forward the unknown packet to its local ACs and never | tunnel, it will forward the unknown packet to its local ACs and never | |||
to an overlay tunnel. This is the regular IR behavior described in | to an overlay tunnel. This is the regular IR behavior described in | |||
[RFC7432]. | [RFC7432]. | |||
4.4.2.2. RNVE Unknown unicast forwarding | 5.4.2.2. RNVE Unknown unicast forwarding | |||
As described for BM traffic, the RNVE is completely unaware of the | As described for BM traffic, the RNVE is completely unaware of the | |||
REPLICATORs, LEAF nodes and BM/U flags (that information is ignored). | REPLICATORs, LEAF nodes and BM/U flags (that information is ignored). | |||
Its forwarding behavior is the regular IR behavior described in | Its forwarding behavior is the regular IR behavior described in | |||
[RFC7432], also for Unknown unicast traffic. Any regular non-AR node | [RFC7432], also for Unknown unicast traffic. Any regular non-AR node | |||
is fully compatible with the RNVE role described in this document. | is fully compatible with the RNVE role described in this document. | |||
5. Selective Assisted-Replication (AR) Solution Description | 6. Selective Assisted-Replication (AR) Solution Description | |||
Figure 1 is also used to describe the selective AR solution, however | Figure 1 is also used to describe the selective AR solution, however | |||
in this section we consider NVE2 as one more AR-LEAF for EVI-1. The | in this section we consider NVE2 as one more AR-LEAF for EVI-1. The | |||
solution is called "selective" because a given AR-REPLICATOR MUST | solution is called "selective" because a given AR-REPLICATOR MUST | |||
replicate the BM traffic to only the AR-LEAF that requested the | replicate the BM traffic to only the AR-LEAF that requested the | |||
replication (as opposed to all the AR-LEAF nodes) and MAY replicate | replication (as opposed to all the AR-LEAF nodes) and MAY replicate | |||
the BM traffic to the RNVEs. The same AR roles defined in section 4 | the BM traffic to the RNVEs. The same AR roles defined in section 4 | |||
are used here, however the procedures are slightly different. | are used here, however the procedures are slightly different. | |||
The following sub-sections describe the differences in the procedures | The following sub-sections describe the differences in the procedures | |||
of AR-REPLICATOR/LEAFs compared to the non-selective AR solution. | of AR-REPLICATOR/LEAFs compared to the non-selective AR solution. | |||
There is no change on the RNVEs. | There is no change on the RNVEs. | |||
5.1. Selective AR-REPLICATOR procedures | 6.1. Selective AR-REPLICATOR procedures | |||
In our example in figure 1, PE1 and PE2 are defined as Selective AR- | In our example in figure 1, PE1 and PE2 are defined as Selective AR- | |||
REPLICATORs. The following considerations apply to the Selective AR- | REPLICATORs. The following considerations apply to the Selective AR- | |||
REPLICATOR role: | REPLICATOR role: | |||
a) The Selective AR-REPLICATOR capability SHOULD be an administrative | a) The Selective AR-REPLICATOR capability SHOULD be an administrative | |||
choice in any NVE/PE that is part of an AR-enabled EVI, as the AR | choice in any NVE/PE that is part of an AR-enabled EVI, as the AR | |||
role itself. This administrative option MAY be implemented as a | role itself. This administrative option MAY be implemented as a | |||
system level option as opposed to as a per-MAC-VRF option. | system level option as opposed to as a per-MAC-VRF option. | |||
b) Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF and | b) Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF and | |||
skipping to change at page 15, line 23 ¶ | skipping to change at page 17, line 16 ¶ | |||
to all the RNVEs. | to all the RNVEs. | |||
+ overlay tunnels to the remote Selective AR-REPLICATORs if | + overlay tunnels to the remote Selective AR-REPLICATORs if | |||
the tunnel source IP is an IR-IP of its own AR-LEAF-set (in | the tunnel source IP is an IR-IP of its own AR-LEAF-set (in | |||
any other case, the AR-REPLICATOR MUST NOT replicate the BM | any other case, the AR-REPLICATOR MUST NOT replicate the BM | |||
traffic to remote AR-REPLICATORs), where the tunnel | traffic to remote AR-REPLICATORs), where the tunnel | |||
destination IP is the AR-IP of the remote Selective AR- | destination IP is the AR-IP of the remote Selective AR- | |||
REPLICATOR. The tunnel destination IP AR-IP will be an | REPLICATOR. The tunnel destination IP AR-IP will be an | |||
indication for the remote Selective AR-REPLICATOR that the | indication for the remote Selective AR-REPLICATOR that the | |||
packet needs further replication to its AR-LEAFs. | packet needs further replication to its AR-LEAFs. | |||
5.2. Selective AR-LEAF procedures | 6.2. Selective AR-LEAF procedures | |||
A Selective AR-LEAF chooses a single Selective AR-REPLICATOR per EVI | A Selective AR-LEAF chooses a single Selective AR-REPLICATOR per EVI | |||
and: | and: | |||
o Sends all the EVI BM traffic to that AR-REPLICATOR and | o Sends all the EVI BM traffic to that AR-REPLICATOR and | |||
o Expects to receive the BM traffic for a given EVI from the same AR- | o Expects to receive the BM traffic for a given EVI from the same AR- | |||
REPLICATOR. | REPLICATOR. | |||
In the example of Figure 1, we consider NVE1/NVE2/NVE3 as Selective | In the example of Figure 1, we consider NVE1/NVE2/NVE3 as Selective | |||
AR-LEAFs. NVE1 selects PE1 as its Selective AR-REPLICATOR. If that is | AR-LEAFs. NVE1 selects PE1 as its Selective AR-REPLICATOR. If that is | |||
skipping to change at page 16, line 34 ¶ | skipping to change at page 18, line 27 ¶ | |||
timer expires, the Selective AR-LEAF will resume its AR mode | timer expires, the Selective AR-LEAF will resume its AR mode | |||
with the new Selective AR-REPLICATOR. | with the new Selective AR-REPLICATOR. | |||
All the AR-LEAFs in an EVI are expected to be configured as either | All the AR-LEAFs in an EVI are expected to be configured as either | |||
selective or non-selective. A mix of selective and non-selective AR- | selective or non-selective. A mix of selective and non-selective AR- | |||
LEAFs SHOULD NOT coexist in the same EVI. In case there is a non- | LEAFs SHOULD NOT coexist in the same EVI. In case there is a non- | |||
selective AR-LEAF, its BM traffic sent to a selective AR-REPLICATOR | selective AR-LEAF, its BM traffic sent to a selective AR-REPLICATOR | |||
will not be replicated to other AR-LEAFs that are not in its | will not be replicated to other AR-LEAFs that are not in its | |||
Selective AR-LEAF-set. | Selective AR-LEAF-set. | |||
5.3. Forwarding behavior in selective AR EVIs | 6.3. Forwarding behavior in selective AR EVIs | |||
This section describes the differences of the selective AR forwarding | This section describes the differences of the selective AR forwarding | |||
mode compared to the non-selective mode. Compared to section 4.4, | mode compared to the non-selective mode. Compared to section 4.4, | |||
there are no changes for the forwarding behavior in RNVEs or for | there are no changes for the forwarding behavior in RNVEs or for | |||
unknown unicast traffic. | unknown unicast traffic. | |||
5.3.1. Selective AR-REPLICATOR BM forwarding | 6.3.1. Selective AR-REPLICATOR BM forwarding | |||
The Selective AR-REPLICATORs will build two flood-lists: | The Selective AR-REPLICATORs will build two flood-lists: | |||
1) Flood-list #1 - composed of ACs and overlay tunnels to the | 1) Flood-list #1 - composed of ACs and overlay tunnels to the | |||
remote nodes in the EVI, always using the IR-IPs in the tunnel | remote nodes in the EVI, always using the IR-IPs in the tunnel | |||
destination IP addresses. Some of those overlay tunnels MAY be | destination IP addresses. Some of those overlay tunnels MAY be | |||
flagged as non-BM receivers based on the BM flag received from | flagged as non-BM receivers based on the BM flag received from | |||
the remote nodes in the EVI. | the remote nodes in the EVI. | |||
2) Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a | 2) Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a | |||
skipping to change at page 17, line 48 ¶ | skipping to change at page 19, line 43 ¶ | |||
flooding list, i.e. it will only replicate to local ACs. This is | flooding list, i.e. it will only replicate to local ACs. This is | |||
the regular-IR behavior described in [RFC7432]. | the regular-IR behavior described in [RFC7432]. | |||
In any case, non-BM overlay tunnels are excluded from flood-lists | In any case, non-BM overlay tunnels are excluded from flood-lists | |||
and, also, source squelching is always done in order to ensure the | and, also, source squelching is always done in order to ensure the | |||
traffic is not sent back to the originating source. If the | traffic is not sent back to the originating source. If the | |||
encapsulation is MPLSoGRE (or MPLSoUDP) and the EVI label is not the | encapsulation is MPLSoGRE (or MPLSoUDP) and the EVI label is not the | |||
bottom of the stack, the AR-REPLICATOR MUST copy the rest of the | bottom of the stack, the AR-REPLICATOR MUST copy the rest of the | |||
labels when forwarding them to the egress overlay tunnels. | labels when forwarding them to the egress overlay tunnels. | |||
5.3.2. Selective AR-LEAF BM forwarding | 6.3.2. Selective AR-LEAF BM forwarding | |||
The Selective AR-LEAF nodes will build two flood-lists: | The Selective AR-LEAF nodes will build two flood-lists: | |||
1) Flood-list #1 - composed of ACs and the overlay tunnel to the | 1) Flood-list #1 - composed of ACs and the overlay tunnel to the | |||
selected AR-REPLICATOR (using the AR-IP as the tunnel | selected AR-REPLICATOR (using the AR-IP as the tunnel | |||
destination IP). | destination IP). | |||
2) Flood-list #2 - composed of ACs and overlay tunnels to the | 2) Flood-list #2 - composed of ACs and overlay tunnels to the | |||
remote IR-IP Addresses. | remote IR-IP Addresses. | |||
When an AR-LEAF receives a BM packet on an AC, it will check if there | When an AR-LEAF receives a BM packet on an AC, it will check if there | |||
is any selected AR-REPLICATOR. If there is, flood-list #1 will be | is any selected AR-REPLICATOR. If there is, flood-list #1 will be | |||
used. Otherwise, flood-list #2 will. | used. Otherwise, flood-list #2 will. | |||
When an AR-LEAF receives a BM packet on an overlay tunnel, it will | When an AR-LEAF receives a BM packet on an overlay tunnel, it will | |||
forward the BM packet to its local ACs and never to an overlay | forward the BM packet to its local ACs and never to an overlay | |||
tunnel. This is the regular IR behavior described in [RFC7432]. | tunnel. This is the regular IR behavior described in [RFC7432]. | |||
6. Pruned-Flood-Lists (PFL) | 7. Pruned-Flood-Lists (PFL) | |||
In addition to AR, the second optimization supported by this solution | In addition to AR, the second optimization supported by this solution | |||
is the ability for all the EVI nodes to signal Pruned-Flood-Lists | is the ability for all the EVI nodes to signal Pruned-Flood-Lists | |||
(PFL). As described in section 3, an EVPN node can signal a given | (PFL). As described in section 3, an EVPN node can signal a given | |||
value for the BM and U PFL flags in the IR Inclusive Multicast | value for the BM and U PFL flags in the IR Inclusive Multicast | |||
Routes, where: | Routes, where: | |||
+ BM= Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from | + BM= Broadcast and Multicast (BM) flag. BM=1 means "prune-me" from | |||
the BM flood-list. BM=0 means regular behavior. | the BM flood-list. BM=0 means regular behavior. | |||
skipping to change at page 18, line 43 ¶ | skipping to change at page 20, line 37 ¶ | |||
The ability to signal these PFL flags is an administrative choice. | The ability to signal these PFL flags is an administrative choice. | |||
Upon receiving a non-zero PFL flag, a node MAY decide to honor the | Upon receiving a non-zero PFL flag, a node MAY decide to honor the | |||
PFL flag and remove the sender from the corresponding flood-list. A | PFL flag and remove the sender from the corresponding flood-list. A | |||
given EVI node receiving BUM traffic on an overlay tunnel MUST | given EVI node receiving BUM traffic on an overlay tunnel MUST | |||
replicate the traffic normally, regardless of the signaled PFL | replicate the traffic normally, regardless of the signaled PFL | |||
flags. | flags. | |||
This optimization MAY be used along with the AR solution. | This optimization MAY be used along with the AR solution. | |||
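The effect of the PFL flags on a sender's flood-list can be sketched as follows. This is an illustration under stated assumptions, not text from the draft: the function name, the `honor_pfl` switch and the dictionary layout are hypothetical, and the U flag is assumed to mean "prune-me" from the unknown unicast flood-list in the same way BM=1 does for BM traffic.

```python
def prune_flood_list(remote_nodes, traffic, honor_pfl=True):
    """Return the remote nodes that should receive flooded traffic.

    remote_nodes maps a node name to its signaled (BM, U) flag pair.
    traffic is "BM" or "U".
    """
    if traffic not in ("BM", "U"):
        raise ValueError("traffic must be 'BM' or 'U'")
    index = 0 if traffic == "BM" else 1
    if not honor_pfl:
        # Honoring the flags is optional (MAY), so a node can ignore them.
        return list(remote_nodes)
    # A flag value of 1 means "prune-me" from the corresponding flood-list.
    return [name for name, flags in remote_nodes.items() if flags[index] == 0]
```

Pruning only affects what a node sends. A node that receives BUM traffic on an overlay tunnel still replicates it normally, whatever flags it has signaled.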
6.1. A PFL example | 7.1. A PFL example | |||
In order to illustrate the use of the solution described in this | In order to illustrate the use of the solution described in this | |||
document, we will assume that EVI-1 in figure 1 is optimized-IR | document, we will assume that EVI-1 in figure 1 is optimized-IR | |||
enabled and: | enabled and: | |||
o PE1 and PE2 are administratively configured as AR-REPLICATORs, due | o PE1 and PE2 are administratively configured as AR-REPLICATORs, due | |||
to their high-performance replication capabilities. PE1 and PE2 | to their high-performance replication capabilities. PE1 and PE2 | |||
will send a Replicator-AR route with BM/U flags = 00. | will send a Replicator-AR route with BM/U flags = 00. | |||
o NVE1 and NVE3 are administratively configured as AR-LEAF nodes, due | o NVE1 and NVE3 are administratively configured as AR-LEAF nodes, due | |||
skipping to change at page 19, line 39 ¶ | skipping to change at page 21, line 34 ¶ | |||
(3) Any Unknown unicast packet sent from VM31 will be forwarded by | (3) Any Unknown unicast packet sent from VM31 will be forwarded by | |||
NVE3 to NVE2, PE1 and PE2 but not NVE1. The solution avoids the | NVE3 to NVE2, PE1 and PE2 but not NVE1. The solution avoids the | |||
unnecessary replication to NVE1, since the destination of the | unnecessary replication to NVE1, since the destination of the | |||
unknown traffic cannot be at NVE1. | unknown traffic cannot be at NVE1. | |||
(4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1 | (4) Any Unknown unicast packet sent from TS1 will be forwarded by PE1 | |||
to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the | to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the | |||
target of the unknown traffic cannot be at those NVEs. | target of the unknown traffic cannot be at those NVEs. | |||
7. AR Procedures for single-IP AR-REPLICATORS | 8. AR Procedures for single-IP AR-REPLICATORS | |||
The procedures explained in sections 4 (Non-selective AR) and 5 | The procedures explained in sections 4 (Non-selective AR) and 5 | |||
(Selective AR) assume that the AR-REPLICATOR can use two local | (Selective AR) assume that the AR-REPLICATOR can use two local | |||
routable IP addresses to terminate and originate NVO tunnels, i.e. | routable IP addresses to terminate and originate NVO tunnels, i.e. | |||
IR-IP and AR-IP addresses. This is usually the case for PE-based AR- | IR-IP and AR-IP addresses. This is usually the case for PE-based AR- | |||
REPLICATOR nodes. | REPLICATOR nodes. | |||
In some cases, the AR-REPLICATOR node does not support more than one | In some cases, the AR-REPLICATOR node does not support more than one | |||
IP address to terminate and originate NVO tunnels, i.e. the IR-IP and | IP address to terminate and originate NVO tunnels, i.e. the IR-IP and | |||
AR-IP are the same IP addresses. This may be the case in some | AR-IP are the same IP addresses. This may be the case in some | |||
skipping to change at page 20, line 24 ¶ | skipping to change at page 22, line 20 ¶ | |||
o An AR-REPLICATOR will perform IR or AR forwarding mode for the | o An AR-REPLICATOR will perform IR or AR forwarding mode for the | |||
incoming Overlay packets based on an ingress VNI lookup, as opposed | incoming Overlay packets based on an ingress VNI lookup, as opposed | |||
to the tunnel IP DA lookup described in sections 4 and 5. Note | to the tunnel IP DA lookup described in sections 4 and 5. Note | |||
that, when replicating to remote AR-REPLICATOR nodes, the use of | that, when replicating to remote AR-REPLICATOR nodes, the use of | |||
the IR-VNI or AR-VNI advertised by the egress node will determine | the IR-VNI or AR-VNI advertised by the egress node will determine | |||
the IR or AR forwarding mode at the subsequent AR-REPLICATOR. | the IR or AR forwarding mode at the subsequent AR-REPLICATOR. | |||
The rest of the procedures will follow what is described in sections | The rest of the procedures will follow what is described in sections | |||
4 and 5. | 4 and 5. | |||
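For the single-IP case, the mode selection reduces to a lookup on the ingress VNI. The sketch below is hypothetical (the function name and return values are not from the draft) and assumes that the node advertises distinct IR-VNI and AR-VNI values for the EVI.

```python
def forwarding_mode(ingress_vni, ir_vni, ar_vni):
    """Pick the forwarding mode for a packet received on an overlay tunnel."""
    if ingress_vni == ar_vni:
        return "AR"  # packet needs further replication
    if ingress_vni == ir_vni:
        return "IR"  # regular IR: deliver to local ACs only
    return None      # VNI does not belong to this EVI (assumption)
```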
8. AR Procedures and EVPN All-Active Multi-homing Split-Horizon | 9. AR Procedures and EVPN All-Active Multi-homing Split-Horizon | |||
8.1. Ethernet Segments on AR-LEAF nodes | This section extends the procedures for the cases where AR-LEAF nodes | |||
or AR-REPLICATOR nodes are attached to the same Ethernet Segment | ||||
in the Broadcast Domain. The case where one (or more) AR-LEAF node(s) | ||||
and one (or more) AR-REPLICATOR node(s) are attached to the same | ||||
Ethernet Segment is out of scope. | ||||
9.1. Ethernet Segments on AR-LEAF nodes | ||||
If VXLAN or NVGRE are used, and if the Split-horizon is based on the | If VXLAN or NVGRE are used, and if the Split-horizon is based on the | |||
tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split- | tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split- | |||
horizon check will not work if there is an Ethernet-Segment shared | horizon check will not work if there is an Ethernet-Segment shared | |||
between two AR-LEAF nodes, and the AR-REPLICATOR changes the tunnel | between two AR-LEAF nodes, and the AR-REPLICATOR changes the tunnel | |||
IP SA of the packets with its own AR-IP. | IP SA of the packets with its own AR-IP. | |||
In order to be compatible with the IP SA split-horizon check, the AR- | In order to be compatible with the IP SA split-horizon check, the AR- | |||
REPLICATOR MAY keep the original received tunnel IP SA when | REPLICATOR MAY keep the original received tunnel IP SA when | |||
replicating packets to a remote AR-LEAF or RNVE. This will allow DF | replicating packets to a remote AR-LEAF or RNVE. This will allow DF | |||
skipping to change at page 21, line 7 ¶ | skipping to change at page 23, line 10 ¶ | |||
Ethernet-Segments defined on AR-LEAF nodes. "Local-Bias" is | Ethernet-Segments defined on AR-LEAF nodes. "Local-Bias" is | |||
recommended in this case, as in the case of VXLAN or NVGRE explained | recommended in this case, as in the case of VXLAN or NVGRE explained | |||
above. The "Local-Bias" and tunnel IP SA preservation mechanisms | above. The "Local-Bias" and tunnel IP SA preservation mechanisms | |||
provide the required split-horizon behavior in non-selective or | provide the required split-horizon behavior in non-selective or | |||
selective AR. | selective AR. | |||
Note that if the AR-REPLICATOR implementation keeps the received | Note that if the AR-REPLICATOR implementation keeps the received | |||
tunnel IP SA, the use of uRPF (unicast Reverse Path Forwarding) | tunnel IP SA, the use of uRPF (unicast Reverse Path Forwarding) | |||
checks in the IP fabric based on the tunnel IP SA MUST be disabled. | checks in the IP fabric based on the tunnel IP SA MUST be disabled. | |||
8.2. Ethernet Segments on AR-REPLICATOR nodes | 9.2. Ethernet Segments on AR-REPLICATOR nodes | |||
Ethernet Segments associated to one or more AR-REPLICATOR nodes | Ethernet Segments associated to one or more AR-REPLICATOR nodes | |||
SHOULD follow "Local-Bias" procedures for EVPN all-active multi- | SHOULD follow "Local-Bias" procedures for EVPN all-active multi- | |||
homing, as follows: | homing, as follows: | |||
o For BUM traffic received on a local AR-REPLICATOR's AC, "Local- | o For BUM traffic received on a local AR-REPLICATOR's AC, "Local- | |||
Bias" procedures as in [RFC8365] SHOULD be followed. | Bias" procedures as in [RFC8365] SHOULD be followed. | |||
o For BUM traffic received on an AR-REPLICATOR overlay tunnel with | o For BUM traffic received on an AR-REPLICATOR overlay tunnel with | |||
AR-IP as the IP DA, "Local-Bias" SHOULD also be followed. That is, | AR-IP as the IP DA, "Local-Bias" SHOULD also be followed. That is, | |||
traffic received with AR-IP as IP DA will be treated as though it | traffic received with AR-IP as IP DA will be treated as though it | |||
had been received on a local AC that is part of the ES and will be | had been received on a local AC that is part of the ES and will be | |||
forwarded to all local ES, irrespective of their DF or NDF state. | forwarded to all local ES, irrespective of their DF or NDF state. | |||
o BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP | o BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP | |||
as the IP DA, will follow regular [RFC8365] "Local-Bias" rules and | as the IP DA, will follow regular [RFC8365] "Local-Bias" rules and | |||
will not be forwarded to local ESes that are shared with the AR-LEAF | will not be forwarded to local ESes that are shared with the AR-LEAF | |||
or AR-REPLICATOR originating the traffic. | or AR-REPLICATOR originating the traffic. | |||
9. Benefits of the optimized-IR solution | 10. Benefits of the optimized-IR solution | |||
A solution for the optimization of Ingress Replication in EVPN is | A solution for the optimization of Ingress Replication in EVPN is | |||
described in this document (optimized-IR). The solution brings the | described in this document (optimized-IR). The solution brings the | |||
following benefits: | following benefits: | |||
o Optimizes the multicast forwarding in low-performance NVEs, by | o Optimizes the multicast forwarding in low-performance NVEs, by | |||
relaying the replication to high-performance NVEs (AR-REPLICATORs) | relaying the replication to high-performance NVEs (AR-REPLICATORs) | |||
while preserving the packet ordering for unicast applications. | while preserving the packet ordering for unicast applications. | |||
o Reduces the flooded traffic in NVO networks where some NVEs do not | o Reduces the flooded traffic in NVO networks where some NVEs do not | |||
need broadcast/multicast and/or unknown unicast traffic. | need broadcast/multicast and/or unknown unicast traffic. | |||
o It is fully compatible with existing EVPN implementations and EVPN | o It is fully compatible with existing EVPN implementations and EVPN | |||
functions for NVO overlay tunnels. Optimized-IR NVEs and regular | functions for NVO overlay tunnels. Optimized-IR NVEs and regular | |||
NVEs can even be part of the same EVI. | NVEs can even be part of the same EVI. | |||
o It does not require any PIM-based tree in the NVO core of the | o It does not require any PIM-based tree in the NVO core of the | |||
network. | network. | |||
10. Conventions used in this document | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
11. Security Considerations | 11. Security Considerations | |||
This section will be added in future versions. | This section will be added in future versions. | |||
12. IANA Considerations | 12. IANA Considerations | |||
IANA has allocated the following Border Gateway Protocol (BGP) | IANA has allocated the following Border Gateway Protocol (BGP) | |||
Parameters: | Parameters: | |||
1) Allocation in the P-Multicast Service Interface Tunnel (PMSI | 1) Allocation in the P-Multicast Service Interface Tunnel (PMSI | |||
Tunnel) Tunnel Types registry: | Tunnel) Tunnel Types registry: | |||
Value Meaning Reference | Value Meaning Reference | |||
0x0A Assisted-Replication Tunnel [This document] | 0x0A Assisted-Replication Tunnel [This document] | |||
2) Allocations in the P-Multicast Service Interface (PMSI) Tunnel | 2) Allocations in the P-Multicast Service Interface (PMSI) Tunnel | |||
Attribute Flags registry: | Attribute Flags registry: | |||
Value Name Reference | Value Name Reference | |||
3-4 Assisted-Replication Type (T) [This document] | 3-4 Assisted-Replication Type (T) [This document] | |||
5 Broadcast and Multicast (BM) [This document] | 5 Broadcast and Multicast (BM) [This document] | |||
6 Unknown (U) [This document] | 6 Unknown (U) [This document] | |||
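As an illustration of the flag allocations above, the following is a minimal sketch of how a receiver might extract the T, BM and U fields from the one-octet PMSI Tunnel Attribute Flags field. It assumes the usual IETF convention that bit 0 is the most significant bit of the octet; the names ArFlags and decode_ar_flags are hypothetical and are not defined by this document. The meaning of each T value is not restated here, so T is returned as a plain 2-bit integer.

```python
from typing import NamedTuple


class ArFlags(NamedTuple):
    """Hypothetical container for the fields allocated in the Flags registry."""
    t: int     # Assisted-Replication Type, bits 3-4 (2-bit value)
    bm: bool   # Broadcast and Multicast flag, bit 5
    u: bool    # Unknown flag, bit 6


def decode_ar_flags(flags: int) -> ArFlags:
    """Decode the T, BM and U fields from a one-octet PTA Flags value.

    Assumption: bits are numbered with bit 0 as the most significant bit,
    so bit 3 is mask 0x10, bit 4 is 0x08, bit 5 is 0x04 and bit 6 is 0x02.
    """
    if not 0 <= flags <= 0xFF:
        raise ValueError("PTA Flags is a single octet (0..255)")
    t = (flags >> 3) & 0b11      # bits 3-4
    bm = bool(flags & 0x04)      # bit 5
    u = bool(flags & 0x02)       # bit 6
    return ArFlags(t=t, bm=bm, u=u)
```

For example, under the bit-numbering assumption above, decode_ar_flags(0x10) yields T=2 with BM and U clear, and decode_ar_flags(0x0E) yields T=1 with both BM and U set.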
13. Terminology | 13. References | |||
AC: Attachment Circuit | ||||
Regular-IR: Refers to Regular Ingress Replication, where the source | ||||
NVE/PE sends a copy to each remote NVE/PE part of the EVI. | ||||
AR-IP: IP address owned by the AR-REPLICATOR and used to | ||||
differentiate the ingress traffic that must follow the AR | ||||
procedures. | ||||
IR-IP: IP address used for Ingress Replication as in [RFC7432]. | ||||
AR-VNI: VNI advertised by the AR-REPLICATOR along with the | ||||
Replicator-AR route. It is used to identify the ingress | ||||
packets that must follow AR procedures ONLY in the Single-IP | ||||
AR-REPLICATOR case. | ||||
IR-VNI: VNI advertised along with the RT-3 for IR. | ||||
AR forwarding mode: for an AR-LEAF, it means sending an AC BM packet | ||||
to a single AR-REPLICATOR with tunnel destination IP AR-IP. | ||||
For an AR-REPLICATOR, it means sending a BM packet to a | ||||
selective number or all the overlay tunnels when the packet | ||||
was previously received from an overlay tunnel. | ||||
IR forwarding mode: it refers to the Ingress Replication behavior | ||||
explained in [RFC7432]. It means sending an AC BM packet copy | ||||
to each remote PE/NVE in the EVI and sending an overlay BM | ||||
packet only to the ACs and not other overlay tunnels. | ||||
PTA: PMSI Tunnel Attribute | ||||
RT-3: EVPN Route Type 3, Inclusive Multicast Ethernet Tag route | ||||
RT-11: EVPN Route Type 11, Leaf Auto-Discovery (AD) route | ||||
14. References | ||||
14.1 Normative References | 13.1 Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March | Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March | |||
1997, <https://www.rfc-editor.org/info/rfc2119>. | 1997, <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, | |||
<https://www.rfc-editor.org/info/rfc8174>. | <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP | [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP | |||
skipping to change at page 24, line 5 ¶ | skipping to change at page 25, line 12 ¶ | |||
[RFC7902] Rosen, E. and T. Morin, "Registry and Extensions for P- | [RFC7902] Rosen, E. and T. Morin, "Registry and Extensions for P- | |||
Multicast Service Interface Tunnel Attribute Flags", RFC 7902, DOI | Multicast Service Interface Tunnel Attribute Flags", RFC 7902, DOI | |||
10.17487/RFC7902, June 2016, <https://www.rfc- | 10.17487/RFC7902, June 2016, <https://www.rfc- | |||
editor.org/info/rfc7902>. | editor.org/info/rfc7902>. | |||
[EVPN-BUM] Zhang et al., "Updates on EVPN BUM Procedures", draft- | [EVPN-BUM] Zhang et al., "Updates on EVPN BUM Procedures", draft- | |||
ietf-bess-evpn-bum-procedure-updates-04.txt, work in progress, June | ietf-bess-evpn-bum-procedure-updates-04.txt, work in progress, June | |||
2018. | 2018. | |||
14.2 Informative References | 13.2 Informative References | |||
[RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution | [RFC8365] Sajassi et al., "A Network Virtualization Overlay Solution | |||
Using Ethernet VPN (EVPN)", RFC 8365, March, 2018. | Using Ethernet VPN (EVPN)", RFC 8365, March, 2018. | |||
15.0 Contributors | 14. Contributors | |||
In addition to the names in the front page, the following co-authors | In addition to the names in the front page, the following co-authors | |||
also contributed to this document: | also contributed to this document: | |||
Wim Henderickx | Wim Henderickx | |||
Nokia | Nokia | |||
Kiran Nagaraj | Kiran Nagaraj | |||
Nokia | Nokia | |||
skipping to change at page 24, line 33 ¶ | skipping to change at page 25, line 40 ¶ | |||
Nischal Sheth | Nischal Sheth | |||
Juniper Networks | Juniper Networks | |||
Aldrin Isaac | Aldrin Isaac | |||
Juniper | Juniper | |||
Mudassir Tufail | Mudassir Tufail | |||
Citibank | Citibank | |||
16. Acknowledgments | 15. Acknowledgments | |||
The authors would like to thank Neil Hart, David Motz, Dai Truong, | The authors would like to thank Neil Hart, David Motz, Dai Truong, | |||
Thomas Morin, Jeffrey Zhang and Shankar Murthy for their valuable | Thomas Morin, Jeffrey Zhang and Shankar Murthy for their valuable | |||
feedback and contributions. | feedback and contributions. | |||
17. Authors' Addresses | 16. Authors' Addresses | |||
Jorge Rabadan (Editor) | Jorge Rabadan (Editor) | |||
Nokia | Nokia | |||
777 E. Middlefield Road | 777 E. Middlefield Road | |||
Mountain View, CA 94043 USA | Mountain View, CA 94043 USA | |||
Email: jorge.rabadan@nokia.com | Email: jorge.rabadan@nokia.com | |||
Senthil Sathappan | Senthil Sathappan | |||
Nokia | Nokia | |||
Email: senthil.sathappan@nokia.com | Email: senthil.sathappan@nokia.com | |||
End of changes. 55 change blocks. | ||||
171 lines changed or deleted | 197 lines changed or added | |||