--- 1/draft-ietf-bess-evpn-virtual-eth-segment-03.txt 2019-01-18 15:13:09.045393138 -0800 +++ 2/draft-ietf-bess-evpn-virtual-eth-segment-04.txt 2019-01-18 15:13:09.093394311 -0800 @@ -2,33 +2,33 @@ Internet Working Group A. Sajassi Internet Draft P. Brissette Category: Standards Track Cisco R. Schell Verizon J. Drake Juniper J. Rabadan Nokia -Expires: July 09, 2019 January 09, 2019 +Expires: July 18, 2019 January 18, 2019 EVPN Virtual Ethernet Segment - draft-ietf-bess-evpn-virtual-eth-segment-03 + draft-ietf-bess-evpn-virtual-eth-segment-04 Abstract EVPN and PBB-EVPN introduce a family of solutions for multipoint Ethernet services over MPLS/IP network with many advanced features among which their multi-homing capabilities. These solutions define two types of multi-homing for an Ethernet Segment (ES): 1) Single- Active and 2) All-Active, where an Ethernet Segment is defined as a - set of links between the multi-homed device/network and the set of PE + set of links between the multi-homed device/network and a set of PE devices that they are connected to. Some Service Providers want to extend the concept of the physical links in an ES to Ethernet Virtual Circuits (EVCs) where many of such EVCs can be aggregated on a single physical External Network-to- Network Interface (ENNI). An ES that consists of a set of EVCs instead of physical links is referred to as a virtual ES (vES). This draft describes the requirements and the extensions needed to support vES in EVPN and PBB-EVPN. @@ -109,35 +109,35 @@ 12. Informative References . . . . . . . . . . . . . . . . . . . . 22 13. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 22 1. Introduction [RFC7432] and [RFC7623] introduce a family of solutions for multipoint Ethernet services over MPLS/IP network with many advanced features among which their multi-homing capabilities. 
These solutions define two types of multi-homing for an Ethernet Segment (ES): 1) Single-Active and 2) All-Active, where an Ethernet Segment is defined - as a set of links between the multi-homed device/network and the set - of PE devices that they are connected to. + as a set of links between the multi-homed device/network and a set of + PE devices that they are connected to. This document extends the Ethernet Segment concept so that an ES can be associated to a set of EVCs (e.g., VLANs) or other objects such as MPLS Label Switch Paths (LSPs) or Pseudowires (PWs). 1.1 Virtual Ethernet Segments in Access Ethernet Networks Some Service Providers (SPs) want to extend the concept of the physical links in an ES to Ethernet Virtual Circuits (EVCs) where many of such EVCs (e.g., VLANs) can be aggregated on a single physical External Network-to-Network Interface (ENNI). An ES that consists of a set of EVCs instead of physical links is referred to as - a virtual ES (vES). Figure below depicts two PE devices (PE1 and PE2) + a virtual ES (vES). Figure 1 depicts two PE devices (PE1 and PE2) each with an ENNI where a number of vES's are aggregated on - each of which through its associated EVC. Carrier Ethernet +-----+ Network | CE11|EVC1 +---------+ <---- EVPN Network -----> +-----+ \ | | +---+ Cust. A \-0=========0--ENNI1| | +-----+ | | ENNI1| | +-------+ +---+ @@ -153,45 +153,46 @@ | |EVC5--0=========0--ENNI2| | +-------+ +---+ +-----+ | | +---+ Cust. C +---------+ /\ /\ || || ENNI EVCs Interface <--------802.1Q----------> <-802.1Q-> Figure 1: DHD/DHN (both SA/AA) and SH on same ENNI - E-NNIs are commonly used to reach off-network / out-of-franchise + ENNIs are commonly used to reach off-network / out-of-franchise customer sites via independent Ethernet access networks or third- - party Ethernet Access Providers (EAP) (see Figure 1). E-NNIs can + party Ethernet Access Providers (EAP) (see Figure 1). 
ENNIs can aggregate traffic from hundreds to thousands of vES's; where, each vES is represented by its associated EVC on that ENNI. As a result, ENNIs and their associated EVCs are a key element of SP off-networks that are carefully designed and closely monitored. In order to meet customer's Service Level Agreements (SLA), SPs build redundancy via multiple EVPN PEs and across multiple ENNIs (as shown in Figure 1) where a given vES can be multi-homed to two or more EVPN PE devices (on two or more ENNIs) via their associated EVCs. Just like physical ES's in [RFC7432] and [RFC7623] solutions, these vES's can be single-homed or multi-homed ES's and when multi-homed, then can operate in either Single-Active or All-Active redundancy modes. In a typical SP off-network scenario, an ENNI can be associated with several thousands of single-homed vES's, several hundreds of Single- Active vES's and it may also be associated with tens or hundreds of All-Active vES's. 1.2 Virtual Ethernet Segments in Access MPLS Networks Other Service Providers (SPs) want to extend the concept of the physical links in an ES to individual Pseudowires (PWs) or to MPLS - Label Switched Paths (LSPs) in Access MPLS networks. Figure 2 - illustrates this concept. + Label Switched Paths (LSPs) in Access MPLS networks - i.e., a vES + consisting of a set of PWs or a set of LSPs. Figure 2 illustrates + this concept. MPLS Aggregation Network +-----+ +----------------+ <---- EVPN Network -----> | CE11|EVC1 | | +-----+ \+AG1--+ PW1 +-----+ Cust. A -0----|===========| | +-----+ | ---+===========| | +-------+ +---+ | CE12|EVC2-0/ | PW2 /\ | PE1 +---+ | | | +-----+ ++---+ ==||=| | | +---+PE3+- @@ -229,32 +230,33 @@ virtual ES can be defined for LSP1 and LSP2. This vES will be shared by two separate EVIs in the EVPN network. In some cases, this aggregation of PWs into common LSPs may not be possible. For instance, if PW3 were terminated into a third PE, e.g. 
PE3, instead of PE1, the vES would need to be defined on a per individual PW on each PE, i.e. PW3 and PW5 would belong to ES-1, whereas PW4 and PW6 would be associated to ES-2. For MPLS/IP access networks where a vES represents a set of PWs or - LSPs, this document extended Single-Active multi-homing procedures of + LSPs, this document extends Single-Active multi-homing procedures of [RFC7432] and [7623] to vES. The vES extension to All-Active multi- homing is outside of the scope of this document for MPLS/IP access networks. This draft describes requirements and the extensions needed to support vES in [RFC7432] and [RFC7623]. Section 3 lists the set of - requirements for virtual ES's. Section 4 describes the solution for - [RFC7432], [RFC7623], and [RFC8214] to meet these requirements. - Section 5 describes the failure handling and recovery for Virtual - ES's in [RFC7432] and [RFC7623]. Section 6 covers scalability and - fast convergence required for Virtual ES's in [RFC7432] and + requirements for vES's. Section 4 describes extensions for vES that + are applicable to EVPN solutions including [RFC7432], [RFC7623], and + [RFC8214]. Furthermore, these extensions meet the requirements + described in section 3. Section 5 describes the failure handling and + recovery for vES's in [RFC7432] and [RFC7623]. Section 6 covers + scalability and fast convergence required for vES's in [RFC7432] and [RFC7623]. 2. Terminology AC: Attachment Circuit BEB: Backbone Edge Bridge B-MAC: Backbone MAC Address CE: Customer Edge CFM: Connectivity Fault Management C-MAC: Customer/Client MAC Address @@ -355,28 +356,27 @@ 3.4. EVC Service Types A physical port (e.g., ENNI) of a PE can aggregate many EVCs each of which is associated with a vES. Furthermore, an EVC may carry one or more VLANs. Typically, an EVC carries a single VLAN and thus it is associated with a single broadcast domain. However, there is no restriction on an EVC to carry more than one VLAN. 
(R4a) An EVC can be associated with a single broadcast domain - e.g., VLAN-based service or VLAN bundle service - (R4b) An EVC MAY be associated with several broadcast domains - e.g., VLAN-aware bundle service In the same way, a PE can aggregate many LSPs and PWs. In the case of individual PWs per vES, typically a PW is associated with a single broadcast domain, but there is no restriction on the PW to carry more - than one VLAN if the PW is defined as vc-type VLAN. + than one VLAN if the PW is of type Raw mode. (R4c) A PW can be associated with a single broadcast domain - e.g., VLAN-based service or VLAN bundle service. (R4d) An PW MAY be associated with several broadcast domains - e.g., VLAN-aware bundle service." 3.5. Designated Forwarder (DF) Election Section 8.5 of [RFC7432] describes the default procedure for DF @@ -452,53 +451,53 @@ (R8a) There SHOULD be a mechanism equivalent to EVPN mass-withdraw such that upon an ENNI failure, only a single BGP message is needed to indicate to the remote PEs to trigger DF election for all impacted vES associated with that ENNI. 4. Solution Overview The solutions described in [RFC7432] and [RFC7623] are leveraged as is with one simple modification and that is the ESI assignment is performed for a group of EVCs or LSPs/PWs instead of a group of - links. In other words, the ESI is associated with a virtual ES (vES) - and that's why it will be referred to as vESI. + physical links. In other words, the ESI is associated with a virtual + ES (vES) and that's why it will be referred to as vESI. For the EVPN solution, everything basically remains the same except for the handling of physical port failure where many vES's can be impacted. Section 5.1 and 5.3 below describe the handling of physical port/link failure for EVPN. In a typical multi-homed operation, MAC addresses are learned behind a vES are advertised with the ESI corresponding to the vES (i.e., vESI). 
EVPN aliasing and mass- withdraw operations are performed with respect to vES. In other words, the Ethernet A-D routes for these operations are advertised with vESI instead of ESI. For PBB-EVPN solution, the main change is with respect to the BMAC address assignment which is performed similar to what is described in section 7.2.1.1 of [RFC7623] with the following refinements: - - One shared BMAC address is used per PE for the single-homed vES's. - In other words, a single BMAC is shared for all single-homed vES's on - that PE. + - One shared BMAC address SHOULD used per PE for the single-homed + vES's. In other words, a single BMAC is shared for all single-homed + vES's on that PE. - - One shared BMAC address should be used per PE per physical port + - One shared BMAC address SHOULD be used per PE per physical port (e.g., ENNI) for the Single-Active vES's. In other words, a single - BMAC is shared for all Single-Active vES's that shared the same ENNI. + BMAC is shared for all Single-Active vES's that share the same ENNI. - - One shared BMAC address can be used for all Single-Active vES's on + - One shared BMAC address MAY be used for all Single-Active vES's on that PE. - - One BMAC address is used per EVC per physical port per PE for each + - One BMAC address SHOULD be used per set of EVCs representing an All-Active multi-homed vES. In other words, a single BMAC address is used per vES for All-Active multi-homing scenarios. - - A single BMAC address may also be used per vES per PE for Single- + - A single BMAC address MAY also be used per vES per PE for Single- Active multi-homing scenarios. BEB +--------------+ BEB || | | || \/ | | \/ +----+ EVC1 +----+ | | +----+ +----+ | CE1|------| | | | | |---| CE2| +----+\ | PE1| | IP/MPLS | | PE3| +----+ \ +----+ | Network | +----+ \ | | @@ -551,34 +550,35 @@ unblock traffic for that EVPN instance. 
Note that the DF PE unblocks all traffic in both ingress and egress directions for Single-Active vES and unblocks multi-destination in egress direction for All-Active Multi-homed vES. All non-DF PEs block all traffic in both ingress and egress directions for Single-Active vES and block multi-destination traffic in the egress direction for All-Active multi-homed vES. In the case of an EVC failure, the affected PE withdraws its Ethernet Segment route if there are no more EVCs associated to the vES in the PE. This will re-trigger the DF Election procedure on all the PEs in - the RG. For PE node failure, or upon PE commissioning or - decommissioning, the PEs re-trigger the DF Election Procedure across - all affected vES's. In case of a Single-Active multi-homing, when a - service moves from one PE in the Redundancy Group to another PE as a - result of DF re-election, the PE, which ends up being the elected DF - for the service, SHOULD trigger a MAC address flush notification - towards the associated vES. This can be done, for e.g. using IEEE - 802.1ak MVRP 'new' declaration. + the Redundancy Group. For PE node failure, or upon PE commissioning + or decommissioning, the PEs re-trigger the DF Election Procedure + across all affected vES's. In case of a Single-Active multi-homing, + when a service moves from one PE in the Redundancy Group to another + PE as a result of DF re-election, the PE, which ends up being the + elected DF for the service, SHOULD trigger a MAC address flush + notification towards the associated vES. This can be done, for e.g. + using IEEE 802.1ak MVRP 'new' declaration. For LSP and PW based vES, the non-DF PE SHOULD signal PW-status - 'standby' signaling to the AG PE, and the new DF MAY send an LDP MAC - withdraw message as a MAC address flush notification. It should be - noted that the PW-status is signaled for the scenarios where there is - a one-to-one mapping between EVI/BD and the PW. 
+ 'standby' signaling to the Aggregation PE (e.g., AG PE in Figure 2), + and the new DF PE MAY send an LDP MAC withdraw message as a MAC + address flush notification. It should be noted that the PW-status is + signaled for the scenarios where there is a one-to-one mapping + between EVI/BD and the PW. 5. Failure Handling & Recovery There are a number of failure scenarios to consider such as: A: CE Uplink Port Failure B: Ethernet Access Network Failure C: PE Access-facing Port or link Failure D: PE Node Failure E: PE isolation from IP/MPLS network @@ -616,35 +616,36 @@ +-----+ | | | | +-------+ +-----+ +---+ /\ /\ /\ || || || A C E Figure 4: Failure Scenarios A,B,C,D and E 5.1. Failure Handling for Single-Active vES in EVPN - When a PE connected to a Single-Active multi-homed Ethernet Segment - loses connectivity to the segment, due to link or port failure, it - signals the remote PE to flush all CMAC addresses associated with - that Ethernet Segment. This is done by advertising a mass-withdraw - message using Ethernet A-D per-ES route. To be precise, there is no - MAC flush per-se if there is only one backup PE for a given ES - - i.e., only an update of the forwarding entries per backup-path - procedure in [RFC 7432]. + When a DF PE connected to a Single-Active multi-homed Ethernet + Segment loses connectivity to the segment, due to link or port + failure, it signals to the remote PEs to withdraw all MAC addresses + associated with that Ethernet Segment. This is done by advertising a + mass-withdraw message using Ethernet A-D per-ES route. It should be + noted that for dual-homing use cases where there is only a single + backup path, MAC withdraw can be avoided by the remote PEs as they + can simply update their nexthop associated with the affected MAC + entries to the backup path per procedure described in section 8.2 of + [RFC7432]. 
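The dual-homing note in section 5.1 can be illustrated with a minimal sketch of how a remote PE might react to a mass-withdraw. The class name, method name, and return strings below are hypothetical and are not defined by the draft or by [RFC7432]; the sketch only assumes the behavior stated above: with exactly one backup path the next hop is updated, otherwise the affected MAC entries are withdrawn.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RemotePeMacTable:
    """Hypothetical view of a remote PE's MAC entries for one Ethernet Segment."""

    # MAC address -> PE currently used as next hop for that MAC
    next_hop: Dict[str, str] = field(default_factory=dict)

    def on_mass_withdraw(self, failed_pe: str, remaining_pes: List[str]) -> str:
        """Handle an Ethernet A-D per-ES withdrawal received from failed_pe."""
        affected = [mac for mac, pe in self.next_hop.items() if pe == failed_pe]
        if len(remaining_pes) == 1:
            # Single backup path: repoint the affected entries instead of
            # withdrawing them (illustrative reading of the text above).
            for mac in affected:
                self.next_hop[mac] = remaining_pes[0]
            return "repointed"
        # No single unambiguous backup path: withdraw the affected MACs.
        for mac in affected:
            del self.next_hop[mac]
        return "withdrawn"
```

For example, a table holding `{"aa": "PE1"}` that receives a withdrawal from PE1 with `["PE2"]` as the only remaining PE ends up pointing "aa" at PE2 without removing the entry.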
- In case of an EVC failure that impacts a single vES, the exact same + In case of an EVC failure which impacts a single vES, the exact same EVPN procedure is used. In this case, the message using Ethernet A-D - per ES route carries the vESI representing the vES which is in turn + per ES route carries the vESI representing the vES which in turn is associated with the failed EVC. The remote PEs upon receiving this message perform the same procedures outlined in section 8.2 of - [RFC7432]. 5.2. EVC Failure Handling for Single-Active vES in PBB-EVPN When a PE connected to a Single-Active multi-homed Ethernet Segment loses connectivity to the segment, due to link or port failure, it signals the remote PE to flush all CMAC addresses associated with that Ethernet Segment. This is done by advertising a BMAC route along with MAC Mobility Extended community. @@ -669,45 +670,45 @@ CMACs corresponding to the advertised BMAC only across the advertised list of I-ISIDs. The new I-SID Extended Community provides a way to encode upto 24 I- SIDs in each Extended Community if the impacted I-SIDs are sequential (the base I-SID value plus the next 23 I-SID values). If the number of I-SIDs associated with a failed EVC is large or if the affected I- SIDs are not sequential, then multiple I-SID Extended Communities can be sent along with the flush message. However, if the number of affected I-SIDs is very large such that the corresponding I-SID - Extended Communities don't fit in a single BGP attribute, then the - EVC failure can be treated as a port failure and the procedures of - section 5.4 can be exercised (i.e., a single BGP flush message + Extended Communities cannot be fitted in a single BGP attribute, then + the EVC failure can be treated as a port failure and the procedures + of section 5.4 can be exercised (i.e., a single BGP flush message without the I-SID list can be transmitted). 
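As a rough illustration of the I-SID list handling described above, the sketch below groups impacted I-SIDs into runs of sequential values, each covering at most 24 I-SIDs (a base plus the next 23). The (base, count) tuples are an illustrative representation only, not the wire format of the I-SID Extended Community, and `max_communities` is a hypothetical stand-in for the point at which the communities no longer fit in a single BGP attribute; the draft text here does not give that number.

```python
from typing import Iterable, List, Optional, Tuple

MAX_ISIDS_PER_COMMUNITY = 24  # from the text: base I-SID plus the next 23 values


def pack_isids(isids: Iterable[int]) -> List[Tuple[int, int]]:
    """Group I-SIDs into (base, count) runs of sequential values, max 24 each."""
    groups: List[Tuple[int, int]] = []
    for isid in sorted(set(isids)):
        if groups:
            base, count = groups[-1]
            # Extend the current run only if the I-SID is the next sequential
            # value and the run still has room.
            if isid == base + count and count < MAX_ISIDS_PER_COMMUNITY:
                groups[-1] = (base, count + 1)
                continue
        groups.append((isid, 1))
    return groups


def build_flush_list(isids: Iterable[int],
                     max_communities: int) -> Optional[List[Tuple[int, int]]]:
    """Return the I-SID groups to attach, or None to flush without an I-SID list.

    max_communities is a hypothetical limit chosen by the caller.
    """
    groups = pack_isids(isids)
    if len(groups) > max_communities:
        # Too many communities: treat the EVC failure as a port failure
        # (section 5.4) and send the flush message without an I-SID list.
        return None
    return groups
```

With this sketch, 30 sequential I-SIDs starting at 100 need two groups, (100, 24) and (124, 6), while scattered I-SIDs each need a group of their own and reach the fallback sooner.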
When the BGP flush message is transmitted without the I-SID list, then it instructs the receiving PEs to flush CMACs associated with that BMAC across all I- SIDs. There can be scenarios (although unlikely) where multiple EVCs within the same physical port can fail within a short time resulting in the PE advertising multiple BGP flush messages each with their own list of I-SIDs; however, the route reflector receiving these messages will only send the last flush message. This results in PEs receiving such flush messages not to properly flush all the affected I-SIDs. In order to address such scenarios, a timer T1 is started upon an EVC1 failure on the advertising PE. If there is another EVC2 failure within T1, affected I-SIDs are aggregated for both EVC1 and EVC2 to be sent along the new flush message. Furthermore when EVC2 failure occurs, another timer T2 (with the same value as T1) is started to keep track of the affected I-SIDs for EVC2. Such I-SID aggregation may result in multiple flushing for the same I-SID(s) on the - receiving PEs. The default value for this timer T is 30 seconds. + receiving PEs. The default value for this timer T is 10 seconds. The I-SID dependent flushing mechanism described in this section is - backward compatible for the PEs only supporting [RFC7623] such that + also backward compatible for the PEs supporting [RFC7623] such that the PEs that don't understand the I-SID list (i.e., the new I-SID Extended Community) simply ignore it and default to flushing all the I-SIDs for the B-MAC - i.e., the PEs default to per-port flushing described in section 5.4. The above BMAC route that is advertised with the MAC Mobility Extended Community, can either represent the MAC address of the physical port that the failed EVC is associated with, or it can represent the MAC address of the PE. 
In the latter case, this is the dedicated per-PE MAC address used for all Single-Active vES's on that @@ -740,37 +741,42 @@ address of the port and the 3-octet local discriminator field set to 0xFFFFFF. This mass-withdraw route is advertised with a list of Route Targets corresponding to the impacted service instances. If the number of Route Targets is more than they can fit into a single attribute, then a set of Ethernet A-D per ES routes are advertised. The remote PEs upon receiving this message, realize that this is a special mass-withdraw message and they access the list of the vES's for the specified color. Next, they initiate mass-withdraw procedure for each of the vES's in the list. + In scenarios where a logical ENNI is used the above procedure equally + applies. The logical ENNI is represented by type 3 ESI and the MAC + address used in the ENNI's ESI is used as a color for vES's as + described above. + 5.4. Port Failure Handling for Single-Active vES's in PBB-EVPN When a large number of EVCs are aggregated via a single physical port on a PE; where each EVC corresponds to a vES, then the port failure impacts all the associated EVCs and their corresponding vES's. If the number of EVCs corresponding to the Single-Active vES's for that physical port is in thousands, then thousands of service instances (I-SIDs) are impacted. In such failure scenarios, the following two MAC flushing mechanisms per [RFC7623] can be performed. 1) If the MAC address of the physical port is used for PBB encapsulation as BMAC SA, then upon the port failure, the PE MUST use - the EVPN MAC route withdrawal message to signal the flush + the EVPN MAC route withdrawal message to signal the flush. 2) If the PE shared MAC address is used for PBB encapsulation as BMAC SA, then upon the port failure, the PE MUST re-advertise this MAC - route with the MAC Mobility Extended Community to signal the flush + route with the MAC Mobility Extended Community to signal the flush. 
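The choice between the two flushing mechanisms above depends only on which MAC address is used as the BMAC SA for PBB encapsulation. A minimal sketch, in which the enum and the returned action strings are hypothetical labels rather than protocol elements:

```python
from enum import Enum


class BmacSource(Enum):
    """Which MAC address the PE uses as BMAC SA (hypothetical labels)."""
    PHYSICAL_PORT = "physical-port"
    PE_SHARED = "pe-shared"


def port_failure_flush_action(source: BmacSource) -> str:
    """Return the flush signal a PE sends upon a port failure."""
    if source is BmacSource.PHYSICAL_PORT:
        # Method 1: withdraw the EVPN MAC route for the port's BMAC.
        return "withdraw-mac-route"
    # Method 2: re-advertise the shared BMAC route with the
    # MAC Mobility Extended Community.
    return "readvertise-with-mac-mobility"
```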
The first method is recommended because it reduces the scope of flushing the most. 5.5. Fast Convergence in PBB-EVPN As described above, when a large number of EVCs are aggregated via a physical port on a PE; where each EVC corresponds to a vES, then the port failure impacts all the associated EVCs and their corresponding vES's. Two actions must be taken as the result of such port failure: @@ -790,37 +797,27 @@ In order to devise such fast convergence mechanism that can be triggered via a single BGP message, all vES's associated with a given physical port (e.g., ENNI) are colored with the same color representing that physical port. The MAC address of the physical port is used for this coloring purposes and when the PE advertises an ES route for a vES associated with that physical port, it advertises it with an EVPN Router's MAC Extended Community indicating the color of that port. The receiving PEs take note of this color and for each such color, - they create a list of vES's associated with this color (with this MAC - address). Now, when a port failure occurs, the impacted PE needs to - notify the other PEs of this color so that these PEs can identify all - the impacted vES's associated with that color (from the above list) - and re-execute DF election procedures for all the impacted vES's. - - In PBB-EVPN, there are two ways to convey this color to other PEs - upon a port failure - one corresponding to each method for signaling - flush message as described in section 5.4. If for PBB encapsulation, - the MAC address of the physical port is used as BMAC SA, then upon - the port failure, the PE sends MAC withdrawal message with the MAC - address of the failed port as the color. 
However, if for PBB - encapsulation, the shared MAC address of the PE (dedicated for all - Single-Active vES's) is used as BMAC SA, then upon the port failure, - the PE re-advertises the MAC route (that carries the shared BMAC) - along with this new EVPN Router's MAC Extended Community to indicate - the color along with MAC Mobility Extended Community. + they create a list of vES's associated with this color (i.e., + associated with this MAC address). Now, when a port failure occurs, + the impacted PE needs to notify the other PEs of this color so that + these PEs can identify all the impacted vES's associated with this + color (from the above list) and re-execute DF election procedures for + all the impacted vES's. This is done by withdrawing the BMAC address + associated with the failed port. +-----+ +----+ | | +---+ | CE1|AC1--0=====0--ENNI1| | +-------+ | |AC2--0 | |PE1|--| | +----+ |\ ==0--ENNI2| | | | | \/ | +---+ | | | /\ | |IP/MPLS| +----+ |/ \ | +---+ |Network| +---+ +---+ | CE2|AC4--0 =0--ENNI3| | | |---|PE4|--|CE4| @@ -840,22 +837,22 @@ convergence using this color in more details: 1- When a vES is configured, the PE colors the vES with the MAC address of the corresponding physical port and advertises the Ethernet Segment route for this vES with this color. 2- All other PEs (in the redundancy group) take note of this color and add the vES to the list for this color. 3- Upon the occurrence of a port failure (e.g., an ENNI failure), the - PE sends the flush message in one of the two ways described above - indicating this color. The PE should prioritize sending this flush + PE sends the flush message by withdrawing the BMAC address associated + with the failed port. The PE should prioritize sending this flush message over ES route withdrawal messages of impacted vES's. 4- On reception of the flush message, other PEs use this info to flush their impacted CMACs and to initiate DF election procedures across all their affected vES's. 
5- The PE with the physical port failure (ENNI failure) also sends an ES route withdrawal for every impacted vES. The other PEs, upon receiving these messages, clear up their BGP tables. It should be noted that the ES route withdrawal messages are not used for executing DF