--- 1/draft-ietf-bess-evpn-igmp-mld-proxy-02.txt 2019-06-09 21:13:11.049851432 -0700 +++ 2/draft-ietf-bess-evpn-igmp-mld-proxy-03.txt 2019-06-09 21:13:11.113853060 -0700 @@ -2,24 +2,24 @@ BESS Working Group Ali Sajassi Internet-Draft Samir Thoria Intended Status: Standards Track Cisco Keyur Patel Derek Yeung Arrcus John Drake Wen Lin Juniper -Expires: December 24, 2018 June 24, 2018 +Expires: December 12, 2019 June 10, 2019 IGMP and MLD Proxy for EVPN - draft-ietf-bess-evpn-igmp-mld-proxy-02 + draft-ietf-bess-evpn-igmp-mld-proxy-03 Abstract Ethernet Virtual Private Network (EVPN) solution [RFC 7432] is becoming pervasive in data center (DC) applications for Network Virtualization Overlay (NVO) and DC interconnect (DCI) services, and in service provider (SP) applications for next generation virtual private LAN services. This draft describes how to support efficiently endpoints running @@ -60,82 +60,88 @@ include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 5 2 IGMP Proxy . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.1 Proxy Reporting . . . . . . . . . . . . . . . . . . . . . . 6 - 2.1.1 IGMP Membership Report Advertisement in BGP . . . . . . 6 + 2.1.1 IGMP Membership Report Advertisement in BGP . . . . . . 7 2.1.1 IGMP Leave Group Advertisement in BGP . . . . . . . . . 8 2.2 Proxy Querier . . . . . . . . . . . . . . . . . . . . . . . 9 - 3 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 + 3 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.1 PE with only attached hosts/VMs for a given subnet . . . . . 10 3.2 PE with mixed of attached hosts/VMs and multicast source . . 11 3.3 PE with mixed of attached hosts/VMs, multicast source and router . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 11 4 All-Active Multi-Homing . . . . . . . . . . . . . . . . . . . . 11 - 4.1 Local IGMP Join Synchronization . . . . . . . . . . . . . . 12 - 4.2 Local IGMP Leave Group Synchronization . . . . . . . . . . . 13 + 4.1 Local IGMP Join Synchronization . . . . . . . . . . . . . . 11 + 4.2 Local IGMP Leave Group Synchronization . . . . . . . . . . . 12 4.2.1 Remote Leave Group Synchronization . . . . . . . . . . . 13 - 4.2.2 Common Leave Group Synchronization . . . . . . . . . . . 14 + 4.2.2 Common Leave Group Synchronization . . . . . . . . . . . 13 + 4.3 Mass Withdraw of Multicast join Sync route in case of + failure . . . . . . . . . . . . . . . . . . . . . . . . . . 14 5 Single-Active Multi-Homing . . . . . . . . . . . . . . . . . . . 14 6 Selective Multicast Procedures for IR tunnels . . . . . . . . . 14 7 BGP Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 15 7.1 Selective Multicast Ethernet Tag Route . . . . . . . . . . . 15 7.1.1 Constructing the Selective Multicast Ethernet Tag route . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 7.2 IGMP Join Synch Route . . . . . . . . . . . . . . . . . . . 18 - 7.2.1 Constructing the IGMP Join Synch Route . . . . . . . . 19 - 7.3 IGMP Leave Synch Route . . . . . . . . . . . . . . . . . . . 20 - 7.3.1 Constructing the IGMP Leave Synch Route . . . . . . . . 22 - 7.4 Multicast Flags Extended Community . . . . . . . . . . . . . 23 - 7.5 EVI-RT Extended Community . . . . . . . . . . . . . . . . . 24 - 7.6 Rewriting of RT ECs and EVI-RT ECs by ASBRs . . . . . . . . 26 - 8 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . . 26 - 9 Security Considerations . . . . . . . . . . . . . . . . . . . . 26 - 10 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 - 11 References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 - 11.1 Normative References . . . . . . . . . . . . . . . . . . . 27 - 11.2 Informative References . . . . . . . . . . . . . . . . 
. . 27 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 + 7.1.2 Default Selective Multicast Route . . . . . . . . . . . 18 + 7.2 Multicast Join Synch Route . . . . . . . . . . . . . . . . 19 + 7.2.1 Constructing the Multicast Join Synch Route . . . . . . 21 + 7.3 Multicast Leave Synch Route . . . . . . . . . . . . . . . . 22 + 7.3.1 Constructing the Multicas Leave Synch Route . . . . . . 24 + 7.4 Multicast Flags Extended Community . . . . . . . . . . . . . 25 + 7.5 EVI-RT Extended Community . . . . . . . . . . . . . . . . . 26 + 7.6 Rewriting of RT ECs and EVI-RT ECs by ASBRs . . . . . . . . 28 + 8 IGMP/MLD Immediate leave . . . . . . . . . . . . . . . . . . . 28 + 9 IGMP Version 1 membership request . . . . . . . . . . . . . . . 29 + 10 Security Considerations . . . . . . . . . . . . . . . . . . . 29 + 11 IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 + 12 References . . . . . . . . . . . . . . . . . . . . . . . . . . 30 + 12.1 Normative References . . . . . . . . . . . . . . . . . . . 30 + 12.2 Informative References . . . . . . . . . . . . . . . . . . 30 + 13 Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 30 + 14 Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 30 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30 1 Introduction Ethernet Virtual Private Network (EVPN) solution [RFC 7432] is becoming pervasive in data center (DC) applications for Network Virtualization Overlay (NVO) and DC interconnect (DCI) services, and in service provider (SP) applications for next generation virtual private LAN services. - In DC applications, a POD can consist of a collection of servers - supported by several TOR and Spine switches. This collection of - servers and switches are self contained and may have their own - control protocol for intra-POD communication and orchestration. - However, EVPN is used as way of standard inter-POD communication for - both intra-DC and inter-DC. 
A subnet can span across multiple PODs - and DCs. EVPN provides robust multi-tenant solution with extensive - multi-homing capabilities to stretch a subnet (e.g., VLAN) across - multiple PODs and DCs. There can be many hosts/VMs (e.g., several - hundreds) attached to a subnet that is stretched across several PODs - and DCs. + In DC applications, a point of delivery (POD) can consist of a + collection of servers supported by several top of rack (TOR) and + Spine switches. This collection of servers and switches are self + contained and may have their own control protocol for intra-POD + communication and orchestration. However, EVPN is used as way of + standard inter-POD communication for both intra-DC and inter-DC. A + subnet can span across multiple PODs and DCs. EVPN provides robust + multi-tenant solution with extensive multi-homing capabilities to + stretch a subnet (e.g., VLAN) across multiple PODs and DCs. There can + be many hosts/VMs (e.g., several hundreds) attached to a subnet that + is stretched across several PODs and DCs. These hosts/VMs express their interests in multicast groups on a given subnet/VLAN by sending IGMP membership reports (Joins) for their interested multicast group(s). Furthermore, an IGMP router - (e.g., IGMPv1) periodically sends membership queries to find out if - there are hosts on that subnet still interested in receiving - multicast traffic for that group. The IGMP/MLD Proxy solution - described in this draft has three objectives to accomplish: + periodically sends membership queries to find out if there are hosts + on that subnet still interested in receiving multicast traffic for + that group. The IGMP/MLD Proxy solution described in this draft has + three objectives to accomplish: 1) Reduce flooding of IGMP messages: just like ARP/ND suppression mechanism in EVPN to reduce the flooding of ARP messages over EVPN, it is also desired to have a mechanism to reduce the flood of IGMP messages (both Queries and Reports) in EVPN. 
2) Distributed anycast multicast proxy: it is desired for the EVPN network to act as a distributed anycast multicast router with respect to IGMP/MLD proxy function for all the hosts attached to that subnet. @@ -208,63 +214,69 @@ Single-Active Redundancy Mode: When only a single PE, among all the PEs attached to an Ethernet segment, is allowed to forward traffic to/from that Ethernet segment for a given VLAN, then the Ethernet segment is defined to be operating in Single-Active redundancy mode. All-Active Redundancy Mode: When all PEs attached to an Ethernet segment are allowed to forward known unicast traffic to/from that Ethernet segment for a given VLAN, then the Ethernet segment is defined to be operating in All-Active redundancy mode. + This document also assumes familiarity with the terminology of + [RFC7432]. Though most of the place this document uses term IGMP + membership request (Joins), it MUST be considered true for MLD + membership request too. IGMPv2 corresponds to MLDv1 & IGMPv3 + corresponds to MLDv2. + 2 IGMP Proxy IGMP Proxy mechanism is used to reduce the flooding of IGMP messages over EVPN network similar to ARP proxy used in reducing the flooding of ARP messages over EVPN. It also provides triggering mechanism for the PEs to setup their underlay multicast tunnels. IGMP Proxy mechanism consist of two components: a) Proxy for IGMP Reports and b) Proxy for IGMP Queries. 2.1 Proxy Reporting When IGMP protocol is used between host/VMs and its first hop EVPN router (EVPN PE), Proxy-reporting is used by the EVPN PE to summarize (when possible) reports received from downstream hosts and propagate - it in BGP to other PEs that are interested in the info. This is done - by terminating IGMP Reports in the first hop PE, translating and - exchanging the relevant information among EVPN BGP speakers. The + it in BGP to other PEs that are interested in the information. 
This + is done by terminating IGMP Reports in the first hop PE, translating + and exchanging the relevant information among EVPN BGP speakers. The information is again translated back to IGMP message at the recipient EVPN speaker. Thus it helps create an IGMP overlay subnet using BGP. In order to facilitate such an overlay, this document also defines a new EVPN route type NLRI, EVPN Selective Multicast Ethernet Tag route, along with its procedures to help exchange and register IGMP multicast groups [section 5]. 2.1.1 IGMP Membership Report Advertisement in BGP When a PE wants to advertise an IGMP membership report (Join) using the BGP EVPN route, it follows the following rules: 1) When the first hop PE receives several IGMP membership reports (Joins) , belonging to the same IGMP version, from different attached hosts/VMs for the same (*,G) or (S,G), it only sends a single BGP message corresponding to the very first IGMP Join. This is because BGP is a statefull protocol and no further transmission of the same report is needed. If the IGMP Join is for (*,G), then multicast group - address along with the corresponding version flag (v1, v2, or v3) are - set. In case of IGMPv3, exclude flag also needs to be set to indicate - that no source IP address to be excluded (e.g., include all sources - "*"). If the IGMP Join is for (S,G), then besides setting multicast - group address along with the version flag v3, the source IP address - and the include/exclude flag must be set. It should be noted that - when advertising the EVPN route for (S,G), the only valid version - flag is v3 (i.e., v1 and v2 flags must be set to zero). + address along with the corresponding version flag (v2 or v3) are set. + In case of IGMPv3, exclude flag also needs to be set to indicate that + no source IP address to be excluded (e.g., include all sources "*"). 
+ If the IGMP Join is for (S,G), then besides setting multicast group + address along with the version flag v3, the source IP address and the + include/exclude flag must be set. It should be noted that when + advertising the EVPN route for (S,G), the only valid version flag is + v3 (i.e., v1 and v2 flags must be set to zero). 2) When the first hop PE receives an IGMPv3 Join for (S,G) on a given BD, it advertises the corresponding EVPN Selective Multicast Ethernet Tag (SMET) route regardless of whether the source (S) is attached to itself or not in order to facilitate the source move in the future. 3) When the first hop PE receives an IGMP version-X Join first for (*,G) and then later it receives an IGMP version-Y Join for the same (*,G), then it will re-advertise the same EVPN SMET route with flag for version-Y set in addition to any previously-set version flag(s). @@ -276,121 +288,101 @@ (*,G) and then later it receives an IGMPv3 Join for the same multicast group address but for a specific source address S, then the PE will advertise a new EVPN SMET route with v3 flag set (and v1 and v2 reset). Include/exclude flag also need to be set accordingly. Since source IP address is used as part of BGP route key processing, it is considered as a new BGP route advertisement. 5) When a PE receives an EVPN SMET route with more than one version flag set, it will generate the corresponding IGMP report for (*,G) for each version specified in the flag field. With multiple version - flags set, there should be no source IP address in the receive EVPN - route. If there is, then an error should be logged. If v3 flag is set - (in addition to v1 or v2), then the include/exclude flag needs to - indicate "exclude". If not, then an error should be logged. The PE - MUST generate an IGMP membership report (Join) for that (*,G) and - each IGMP version in the version flag. + flags set, there MUST not be source IP address in the receive EVPN + route. 
If there is, then an error SHOULD be logged. If v3 flag is set + (in addition to v2), then the include/exclude flag MUST indicate + "exclude". If not, then an error SHOULD be logged. The PE MUST + generate an IGMP membership report (Join) for that (*,G) and each + IGMP version in the version flag. 6) When a PE receives a list of EVPN SMET NLRIs in its BGP update - message, each with a different source IP address and the multicast - group address, and the version flag is set to v3, then the PE - generates an IGMPv3 membership report with a record corresponding to - the list of source IP addresses and the group address along with the - proper indication of inclusion/exclusion. + message, each with a different source IP address and the same + multicast group address, and the version flag is set to v3, then the + PE generates an IGMPv3 membership report with a record corresponding + to the list of source IP addresses and the group address along with + the proper indication of inclusion/exclusion. 7) Upon receiving EVPN SMET route(s) and before generating the corresponding IGMP Join(s), the PE checks to see whether it has any CE multicast router for that BD on any of its ES's . The PE provides such check by listening for PIM hellos on that AC (i.e, ). If it has router's ACs, then the generated IGMP Join(s) are sent to those ACs. If it doesn't have any router's AC, then no IGMP Join(s) needs to be generated because sending IGMP Joins to other hosts can result in unintentionally preventing a host from joining a specific - multicast group for IGMPv1 and IGMPv2 - i.e., if the PE does not - receive a join from the host it will not forward multicast data to - it. Per [RFC4541], when an IGMPv1 or IGMPv2 host receives a - membership report for a group address that it intends to join, the - host will suppress its own membership report for the same group. In - other words, an IGMPv1 or IGMPv2 Join MUST NOT be sent on an AC that - does not lead to a CE multicast router. 
This message suppression is a - requirement for IGMPv1 and IGMPv2 hosts. This is not a problem for - hosts running IGMPv3 because there is no suppression of IGMP - Membership reports. + multicast group for IGMPv2 - i.e., if the PE does not receive a join + from the host it will not forward multicast data to it. Per + [RFC4541], when an IGMPv2 host receives a membership report for a + group address that it intends to join, the host will suppress its own + membership report for the same group. In other words, an IGMPv2 Join + MUST NOT be sent on an AC that does not lead to a CE multicast + router. This message suppression is a requirement for IGMPv2 hosts. + This is not a problem for hosts running IGMPv3 because there is no + suppression of IGMP Membership reports. 2.1.1 IGMP Leave Group Advertisement in BGP When a PE wants to withdraw an EVPN SMET route corresponding to an IGMPv2 Leave Group (Leave) or IGMPv3 "Leave" equivalent message, it follows the following rules: - 1) For IGMPv1, there is no explicit membership leave; therefore, the - PE needs to periodically send out an IGMP membership query to - determine whether there is any host left who is interested in - receiving traffic directed to this multicast group (this proxy query - function will be described in more details in section 2.2). If there - is no host left, then the PE re-advertises EVPN SMET route with the - v1 version flag reset. If this is the last version flag to be reset, - then instead of re-advertising the EVPN route with all version flags - reset, the PE withdraws the EVPN route for that (*,G). - - 2) When a PE receives an IGMPv2 Leave Group or its "Leave" equivalent + 1) When a PE receives an IGMPv2 Leave Group or its "Leave" equivalent message for IGMPv3 from its attached host, it checks to see if this host is the last host who is interested in this multicast group by sending a query for the multicast group. 
If the host was indeed the last one, then the PE re-advertises EVPN SMET Multicast route with the corresponding version flag reset. If this is the last version flag to be reset, then instead of re-advertising the EVPN route with all version flags reset, the PE withdraws the EVPN route for that (*,G). - 3) When a PE receives an EVPN SMET route for a given (*,G), it + 2) When a PE receives an EVPN SMET route for a given (*,G), it compares the received version flags from the route with its per-PE stored version flags. If the PE finds that a version flag associated with the (*,G) for the remote PE is reset, then the PE generates IGMP Leave for that (*,G) toward its local interface (if any) attached to the multicast router for that multicast group. It should be noted that the received EVPN route should at least have one version flag set. If all version flags are reset, it is an error because the PE should have received an EVPN route withdraw for the last version flag. If the PE receives an EVPN SMET route withdraw, then it must remove the remote PE from the OIF list associated with that multicast group. - 4) When a PE receives an EVPN SMET route withdraw, it removes the + 3) When a PE receives an EVPN SMET route withdraw, it removes the remote PE from its OIF list for that multicast group and if there are no more OIF entries for that multicast group (either locally or remotely), then the PE MUST stop responding to queries from the locally attached router (if any). If there is a source for that multicast group, the PE stops sending multicast traffic for that source. 2.2 Proxy Querier As mentioned in the previous sections, each PE need to have proxy querier functionality for the following reasons: 1) To enable the collection of EVPN PEs providing L2VPN service to act as distributed multicast router with Anycast IP address for all attached hosts/VMs in that subnet. 2) To enable suppression of IGMP membership reports and queries over MPLS/IP core. 
- 3) To enable generation of query messages locally to their attached - host. In case of IGMPv1, the PE needs to send out an IGMP membership - query to verify that at least one host on the subnet is still - interested in receiving traffic directed to that group. When there is - no reply to three consecutive IGMP membership queries, the PE times - out the group, stops forwarding multicast traffic to the attached - hosts for that (*,G), and sends a EVPN SMET route associated with - that (*,G) with the version-1 flag reset or withdraws that route. - 3 Operation Consider the EVPN network of figure-1, where there is an EVPN instance configured across the PEs shown in this figure (namely PE1, PE2, and PE3). Lets consider that this EVPN instance consist of a single bridge domain (single subnet) with all the hosts, sources and the multicast router shown in this figure connected to this subnet. PE1 only has hosts connected to it. PE2 has a mix of hosts and multicast source. PE3 has a mix of hosts, multicast source, and multicast router. 
Further more, lets consider that for (S1,G1), R1 is
@@ -399,71 +391,71 @@
    locally attached devices for that subnet are:

      - only hosts/VMs
      - mix of hosts/VMs and multicast source
      - mix of hosts/VMs, multicast source, and multicast router

                          +--------------+
                          |              |
                          |              |
                   +----+ |              | +----+
-   H1:(*,G1)v1 ---|    | |              | |    |---- H6(*,G1)v2
-   H2:(*,G1)v1 ---| PE1| |   IP/MPLS    | | PE2|---- H7(S2,G2)v3
-   H3:(*,G1)v2 ---|    | |   Network    | |    |---- S2
+   H1:(*,G1)v2 ---|    | |              | |    |---- H6(*,G1)v2
+   H2:(*,G1)v2 ---| PE1| |   IP/MPLS    | | PE2|---- H7(S2,G2)v3
+   H3:(*,G1)v3 ---|    | |   Network    | |    |---- S2
    H4:(S2,G2)v3 --|    | |              | |    |
                   +----+ |              | +----+
                          |              |
                   +----+ |              |
    H5:(S1,G1)v3 --|    | |              |
             S1 ---| PE3| |              |
             R1 ---|    | |              |
                   +----+ |              |
                          |              |
                          +--------------+

                        Figure 1:

 3.1 PE with only attached hosts/VMs for a given subnet

-   When PE1 receives an IGMPv1 Join Report from H1, it does not forward
+   When PE1 receives an IGMPv2 Join Report from H1, it does not forward
    this join to any of its other ports (for this subnet) because all
    these local ports are associated with the hosts/VMs. PE1 sends an
    EVPN Multicast Group route corresponding to this join for (*,G1) and
-   setting v1 flag. This EVPN route is received by PE2 and PE3 that are
+   setting v2 flag. This EVPN route is received by PE2 and PE3 that are
    the member of the same BD (i.e., same EVI in case of VLAN-based
    service or <EVI,VLAN> in case of VLAN-aware bundle service). PE3
-   reconstructs IGMPv1 Join Report from this EVPN BGP route and only
+   reconstructs IGMPv2 Join Report from this EVPN BGP route and only
    sends it to the port(s) with multicast routers attached to it (for
-   that subnet). In this example, PE3 sends the reconstructed IGMPv1
+   that subnet). In this example, PE3 sends the reconstructed IGMPv2
    Join Report for (*,G1) to only R1. Furthermore, PE2 although receives
    the EVPN BGP route, it does not send it to any of its port for that
    subnet - namely ports associated with H6 and H7.

- When PE1 receives the second IGMPv1 Join from H2 for the same + When PE1 receives the second IGMPv2 Join from H2 for the same multicast group (*,G1), it only adds that port to its OIF list but it doesn't send any EVPN BGP route because there is no change in - information. However, when it receives the IGMPv2 Join from H3 for + information. However, when it receives the IGMPv3 Join from H3 for the same (*,G1), besides adding the corresponding port to its OIF list, it re-advertises the previously sent EVPN SMET route with the - version-2 flag set. + version-3 & exclude flag set. Finally when PE1 receives the IMGMPv3 Join from H4 for (S2,G2), it advertises a new EVPN SMET route corresponding to it. 3.2 PE with mixed of attached hosts/VMs and multicast source The main difference in here is that when PE2 receives IGMPv3 Join - from H7 for (S2,G2), it does not advertises it in BGP because PE2 - knows that S2 is attached to its local AC. PE2 adds the port - associated with H7 to its OIF list for (S2,G2). The processing for - IGMPv2 received from H6 is the same as the v2 Join described in - previous section. + from H7 for (S2,G2), it does advertises it in BGP to support source + move even though PE2 knows that S2 is attached to its local AC. PE2 + adds the port associated with H7 to its OIF list for (S2,G2). The + processing for IGMPv2 received from H6 is the same as the v2 Join + described in previous section. 3.3 PE with mixed of attached hosts/VMs, multicast source and router The main difference in here relative to the previous two sections is that Join messages received locally needs to be sent to the port associated with router R1. Furthermore, the Joins received via BGP need to be passed to the R1 port but filtered for all other ports. 4 All-Active Multi-Homing @@ -593,27 +585,38 @@ route for that (x, G) group in that BD, it does so now. When the Maximum Response Timer expires a PE that has advertised an IGMP Leave Synch route, withdraws it. 
Any PE attached to the multihomed ES, that started the Maximum Response Time and has no local IGMP Join (x, G) state and no installed IGMP Join Synch routes, it removes IGMP Join (x, G) state for that [ES, BD]. If the DF no longer has IGMP Join (x, G) state for that BD on any ES for which it is DF, it withdraws its SMET route for that (x, G) group in that BD. +4.3 Mass Withdraw of Multicast join Sync route in case of failure + + A PE which has received IGMP join, would have synced IGMP join by + procedure section(4.1). If PE with local join state goes down or PE + to CE link goes down, it would lead to mass withdraw of multicast + routes. Remote PE (PE where these routes were remote IGMP join) + SHOULD not remove the state immediately where as General Query SHOULD + be generated to refresh the states. Some of the way (But not limited + to) to detect failure at peer could be IGP next hop tracking or ES + route withdraw. + 5 Single-Active Multi-Homing Note that to facilitate state synchronization after failover, the PEs - attached to a multihomed ES operating in Single-Active redundancy - mode should also coordinate IGMP Join (x, G) state. In this case all - IGMP Join messages are received by the DF and distributed to the non- - DF PEs using the procedures described above. + attached to a mutihomed ES operating in Single-Active redundancy mode + should also coordinate IGMP Join (x, G) state. In this case all IGMP + Join messages are received by the DF and distributed to the non-DF + PEs using the procedures described above. 6 Selective Multicast Procedures for IR tunnels If an ingress PE uses ingress replication, then for a given (x, G) group in a given BD: 1) It sends (x, G) traffic to the set of PEs not supporting IGMP Proxy. This set consists of any PE that has advertised an Inclusive Multicast Tag route for the BD without the "IGMP Proxy Support" flag. 
@@ -630,22 +633,22 @@ will include those PEs that have advertised an SMET route for that (x, G) group on that BD (for Selective P-tunnel) but it may include other PEs as well (for Aggregate Selective P-tunnel). 7 BGP Encoding This document defines three new BGP EVPN routes to carry IGMP membership reports. This route type is known as: + 6 - Selective Multicast Ethernet Tag Route - + 7 - IGMP Join Synch Route - + 8 - IGMP Leave Synch Route + + 7 - Multicast Join Synch Route + + 8 - Multicast Leave Synch Route The detailed encoding and procedures for this route type is described in subsequent section. 7.1 Selective Multicast Ethernet Tag Route An Selective Multicast Ethernet Tag route type specific EVPN NLRI consists of the following: +---------------------------------------+ @@ -694,47 +697,60 @@ This EVPN route type is used to carry tenant IGMP multicast group information. The flag field assists in distributing IGMP membership interest of a given host/VM for a given multicast route. The version bits help associate IGMP version of receivers participating within the EVPN domain. The include/exclude bit helps in creating filters for a given multicast route. + If route is being prepared for IPv6 (MLD) then bit 7 indicates + support for MLD version 1. The second least significant bit, bit 6 + indicates support for MLD version 2. Since there is no MLD version 3, + in case of IPv6 route third least significant bit MUST be 0. In case + of IPv6 route, the fourth least significant bit MUST be ignored if + bit 6 is not set. + 7.1.1 Constructing the Selective Multicast Ethernet Tag route This section describes the procedures used to construct the Selective Multicast Ethernet Tag (SMET) route. Support for this route type is optional. The Route Distinguisher (RD) SHOULD be a Type 1 RD [RFC4364]. The value field comprises an IP address of the PE (typically, the loopback address) followed by a number unique to the PE. 
The Ethernet Tag ID MUST be set as follows: EVI is VLAN-Based or VLAN Bundle service - set to 0 EVI is VLAN-Aware Bundle service without translation - set to the customer VID for that BD EVI is VLAN-Aware Bundle service with translation - set to the normalized Ethernet Tag ID - e.g., normalized VID - The Multicast Source length MUST be set to length of multicast source - address in bits. In case of a (*, G) Join, the Multicast Source - Length is set to 0. + The Multicast Source length MUST be set to length of Multicast Source + address in bits. If the Multicast Source field contains an IPv4 + address, then the value of the Multicast Source Length field is 32. + If the Multicast Source field contains an IPv6 address, then the + value of the Multicast Source Length field is 128. In case of a (*, + G) Join, the Multicast Source Length is set to 0. The Multicast Source is the Source IP address of the IGMP membership report. In case of a (*, G) Join, this field does not exist. The Multicast Group length MUST be set to length of multicast group - address in bits. + address in bits. If the Multicast Group field contains an IPv4 + address, then the value of the Multicast Group Length field is 32. + If the Multicast Group field contains an IPv6 address, then the value + of the Multicast Group Length field is 128. The Multicast Group is the Group address of the IGMP membership report. The Originator Router Length is the length of the Originator Router address in bits. The Originator Router Address is the IP address of Router Originating the prefix. It should be noted that using the "Originating Router's IP address" field is needed for local-bias procedures and may be @@ -746,21 +762,56 @@ multicast group had INCLUDE or EXCLUDE bit set. IGMP protocol is used to receive group membership information from hosts/VMs by TORs. 
    Upon receiving the hosts/VMs expression of interest of a particular
    group membership, this information is then forwarded using Ethernet
    Multicast Source Group Route NLRI. The NLRI also keeps track of
    receiver's IGMP protocol version and any "source filtering" for a
    given group membership. All EVPN SMET routes are announced with
    per-EVI Route Target extended communities.

-7.2 IGMP Join Synch Route
+7.1.2 Default Selective Multicast Route
+
+   If there is multicast router connected behind EVPN domain, PE MAY
+   originate default SMET (*,*) to get all multicast traffic in domain.
+   For example,
+                          +--------------+
+                          |              |
+                          |              |
+                          |              | +----+
+                          |              | |    |---- H1(*,G1)v2
+                          |   IP/MPLS    | | PE1|---- H2(S2,G2)v3
+                          |   Network    | |    |---- S2
+                          |              | |    |
+                          |              | +----+
+                          |              |
+                   +----+ |              |
+   +----+          |    | |              |
+   |    |    S1 ---| PE2| |              |
+   |PIM |----R1 ---|    | |              |
+   |ASM |          +----+ |              |
+   |    |                 |              |
+   +----+                 +--------------+
+
+   Figure 2:
+
+   Consider the EVPN network of figure-2, where there is an EVPN
+   instance configured across the PEs shows in this figure. Lets
+   consider PE2 is connected to multicast router R1 and there is PIM ASM
+   network behind R1. If there are receivers behind PIM ASM network, PIM
+   join would be forwarded to PIM RP (Rendezvous Point). If receivers
+   behind PIM ASM network are interested in multicast flow originated by
+   multicast source S2 (Behind PE1), it is necessary for PE2 to receive
+   multicast traffic. In this case PE2 MUST originate (*,*) SMET route
+   to receive all of the multicast traffic in EVPN domain.
+
+ +7.2 Multicast Join Synch Route This EVPN route type is used to coordinate IGMP Join (x,G) state for a given BD between the PEs attached to a given ES operating in All- Active (or Single-Active) redundancy mode and it consists of following: +--------------------------------------------------+ | RD (8 octets) | +--------------------------------------------------+ | Ethernet Segment Identifier (10 octets) | @@ -799,21 +850,28 @@ type is of Include Group type (bit value 0) or an Exclude Group type (bit value 1). The Exclude Group type bit MUST be ignored if bit 5 is not set. The Flags field assists in distributing IGMP membership interest of a given host/VM for a given multicast route. The version bits help associate IGMP version of receivers participating within the EVPN domain. The include/exclude bit helps in creating filters for a given multicast route. -7.2.1 Constructing the IGMP Join Synch Route + If route is being prepared for IPv6 (MLD) then bit 7 indicates + support for MLD version 1. The second least significant bit, bit 6 + indicates support for MLD version 2. Since there is no MLD version 3, + in case of IPv6 route third least significant bit MUST be 0. In case + of IPv6 route, the fourth least significant bit MUST be ignored if + bit 6 is not set. + +7.2.1 Constructing the Multicast Join Synch Route This section describes the procedures used to construct the IGMP Join Synch route. Support for this route type is optional. If a PE does not support this route, then it MUST not indicate that it supports 'IGMP proxy' in Multicast Flag extended community for the EVIs corresponding to its multi-homed Ethernet Segments. An IGMP Join Synch route MUST carry exactly one ES-Import Route Target extended community, the one that corresponds to the ES on which the IGMP Join was received. It MUST also carry exactly one @@ -829,47 +887,55 @@ value defined for the ES. 
The Ethernet Tag ID MUST be set as follows: EVI is VLAN-Based or VLAN Bundle service - set to 0 EVI is VLAN-Aware Bundle service without translation - set to the customer VID for the BD EVI is VLAN-Aware Bundle service with translation - set to the normalized Ethernet Tag ID - e.g., normalized VID - The Multicast Source length MUST be set to length of multicast source - address in bits. In case of a (*, G) Join, the Multicast Source - Length is set to 0. + The Multicast Source length MUST be set to length of Multicast Source + address in bits. If the Multicast Source field contains an IPv4 + address, then the value of the Multicast Source Length field is 32. + If the Multicast Source field contains an IPv6 address, then the + value of the Multicast Source Length field is 128. In case of a (*, + G) Join, the Multicast Source Length is set to 0. The Multicast Source is the Source IP address of the IGMP membership report. In case of a (*, G) Join, this field does not exist. The Multicast Group length MUST be set to length of multicast group - address in bits. + address in bits. If the Multicast Group field contains an IPv4 + address, then the value of the Multicast Group Length field is 32. + If the Multicast Group field contains an IPv6 address, then the value + of the Multicast Group Length field is 128. The Multicast Group is the Group address of the IGMP membership report. The Originator Router Length is the length of the Originator Router address in bits. The Originator Router Address is the IP address of Router Originating the prefix. The Flags field indicates the version of IGMP protocol from which the membership report was received. It also indicates whether the multicast group had INCLUDE or EXCLUDE bit set. 
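The Flags field rules for an IPv6 (MLD) route can be illustrated with a small decoder. This is a hedged sketch in Python, not text from the draft. It assumes the bit numbering implied by the text, where bit 7 is the least significant bit of the one-octet Flags field, and it assumes the fourth least significant bit is the Include/Exclude bit. The names decode_mld_flags and the mask constants are hypothetical, and rejecting a route whose third bit is set is one possible receiver policy that the draft does not mandate.

```python
# Masks follow the draft's numbering, where bit 7 is the least significant bit.
MLD_V1 = 0x01    # bit 7: MLD version 1
MLD_V2 = 0x02    # bit 6: MLD version 2
NO_MLD_V3 = 0x04  # bit 5: MUST be 0 for an IPv6 route (there is no MLD version 3)
EXCLUDE = 0x08   # bit 4: assumed Include (0) / Exclude (1) group type


def decode_mld_flags(flags: int) -> dict:
    """Decode the Flags octet of a route prepared for IPv6 (MLD)."""
    if flags & NO_MLD_V3:
        # Assumption: treat a set third bit as malformed input.
        raise ValueError("third least significant bit MUST be 0 for IPv6")
    mld_v2 = bool(flags & MLD_V2)
    return {
        "mld_v1": bool(flags & MLD_V1),
        "mld_v2": mld_v2,
        # The fourth least significant bit is ignored unless bit 6 is set.
        "exclude": mld_v2 and bool(flags & EXCLUDE),
    }


print(decode_mld_flags(0x0A))  # MLDv2 with Exclude group type
print(decode_mld_flags(0x09))  # MLDv1 only; the Exclude bit is ignored
```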
-7.3 IGMP Leave Synch Route This EVPN route type is used to coordinate - IGMP Leave Group (x,G) state for a given BD between the PEs attached - to a given ES operating in All-Active (or Single-Active) redundancy - mode and it consists of following: +7.3 Multicast Leave Synch Route + + This EVPN route type is used to coordinate IGMP Leave Group (x,G) + state for a given BD between the PEs attached to a given ES operating + in All-Active (or Single-Active) redundancy mode and it consists of + the following: +--------------------------------------------------+ | RD (8 octets) | +--------------------------------------------------+ | Ethernet Segment Identifier (10 octets) | +--------------------------------------------------+ | Ethernet Tag ID (4 octets) | +--------------------------------------------------+ | Multicast Source Length (1 octet) | +--------------------------------------------------+ @@ -908,21 +974,28 @@ type is of Include Group type (bit value 0) or an Exclude Group type (bit value 1). The Exclude Group type bit MUST be ignored if bit 5 is not set. The Flags field assists in distributing IGMP membership interest of a given host/VM for a given multicast route. The version bits help associate IGMP version of receivers participating within the EVPN domain. The include/exclude bit helps in creating filters for a given multicast route. -7.3.1 Constructing the IGMP Leave Synch Route + If the route is being prepared for IPv6 (MLD), then bit 7 indicates + support for MLD version 1. The second least significant bit, bit 6, + indicates support for MLD version 2. Since there is no MLD version 3, + in an IPv6 route the third least significant bit MUST be 0. In + an IPv6 route, the fourth least significant bit MUST be ignored if + bit 6 is not set. + +7.3.1 Constructing the Multicast Leave Synch Route This section describes the procedures used to construct the IGMP Leave Synch route. Support for this route type is optional. 
If a PE does not support this route, then it MUST NOT indicate that it supports 'IGMP proxy' in the Multicast Flags extended community for the EVIs corresponding to its multi-homed Ethernet Segments. An IGMP Leave Synch route MUST carry exactly one ES-Import Route Target extended community, the one that corresponds to the ES on which the IGMP Leave was received. It MUST also carry exactly one @@ -939,28 +1012,34 @@ The Ethernet Tag ID MUST be set as follows: EVI is VLAN-Based or VLAN Bundle service - set to 0 EVI is VLAN-Aware Bundle service without translation - set to the customer VID for the BD EVI is VLAN-Aware Bundle service with translation - set to the normalized Ethernet Tag ID - e.g., normalized VID The Multicast Source length MUST be set to the length of the multicast source - address in bits. In case of a (*, G) Join, the Multicast Source - Length is set to 0. + address in bits. If the Multicast Source field contains an IPv4 + address, then the value of the Multicast Source Length field is 32. + If the Multicast Source field contains an IPv6 address, then the + value of the Multicast Source Length field is 128. In case of a (*, + G) Join, the Multicast Source Length is set to 0. The Multicast Source is the Source IP address of the IGMP membership report. In case of a (*, G) Join, this field does not exist. The Multicast Group length MUST be set to the length of the multicast group - address in bits. + address in bits. If the Multicast Group field contains an IPv4 + address, then the value of the Multicast Group Length field is 32. + If the Multicast Group field contains an IPv6 address, then the value + of the Multicast Group Length field is 128. The Multicast Group is the Group address of the IGMP membership report. The Originator Router Length is the length of the Originator Router address in bits. The Originator Router Address is the IP address of the router originating the prefix. 
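The length and Ethernet Tag ID rules above can be summarized in a short sketch. This is an illustration in Python under stated assumptions, not a definitive implementation: the function names and the service-type strings are hypothetical, and only the numeric rules (32 for IPv4, 128 for IPv6, 0 when no source is carried, and the three Ethernet Tag ID cases) come from the text.

```python
import ipaddress
from typing import Optional


def address_length_bits(address: Optional[str]) -> int:
    """Value for a Multicast Source/Group Length field, in bits."""
    if address is None:
        # (*, G) case: no source address is carried, so the length is 0.
        return 0
    # max_prefixlen is 32 for an IPv4 address and 128 for an IPv6 address.
    return ipaddress.ip_address(address).max_prefixlen


def ethernet_tag_id(service: str, customer_vid: int = 0,
                    normalized_tag: int = 0) -> int:
    """Pick the Ethernet Tag ID for the (hypothetical) service-type names."""
    if service in ("vlan-based", "vlan-bundle"):
        return 0
    if service == "vlan-aware-bundle":
        return customer_vid        # without translation: customer VID for the BD
    if service == "vlan-aware-bundle-translated":
        return normalized_tag      # with translation: normalized Ethernet Tag ID
    raise ValueError(f"unknown service type: {service}")


print(address_length_bits(None))           # 0
print(address_length_bits("192.0.2.1"))    # 32
print(address_length_bits("2001:db8::1"))  # 128
print(ethernet_tag_id("vlan-aware-bundle", customer_vid=100))  # 100
```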
@@ -1002,21 +1081,21 @@ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type=0x06 | Sub-Type=TBD | Flags (2 Octets) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Reserved=0 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ The low-order bit of the Flags is defined as the "IGMP Proxy Support" bit. A value of 1 means that the PE supports IGMP Proxy as defined in this document, and a value of 0 means that the PE does not support IGMP proxy. The absence of this extended community also means that - the PE doesn not support IGMP proxy. + the PE does not support IGMP proxy. 7.5 EVI-RT Extended Community In EVPN, every EVI is associated with one or more Route Targets (RTs). These Route Targets serve two functions: - Distribution control: RTs control the distribution of the routes. If a route carries the RT associated with a particular EVI, it will be distributed to all the PEs on which that EVI exists. @@ -1108,27 +1187,44 @@ particular EVI may be different in each AS. If a route is propagated from AS1 to AS2, an ASBR at the AS1/AS2 border may be provisioned with a policy that removes the RTs that are meaningful in AS1 and replaces them with the corresponding (i.e., RTs corresponding to the same EVIs) RTs that are meaningful in AS2. This is known as RT-rewriting. Note that if a given route's RTs are rewritten, and the route carries an EVI-RT EC, the EVI-RT EC needs to be rewritten as well. -8 Acknowledgement +8 IGMP/MLD Immediate leave -9 Security Considerations + IGMP MAY be configured with the immediate leave option. This allows the + device to remove the group entry from the multicast routing table + immediately upon receiving an IGMP leave message for (x,G). In the case of + All-Active multi-homing, while synchronizing IGMP leave state to + redundancy peers, the Maximum Response Time MAY be set to zero. + Implementations SHOULD make sure to have identical configuration + across multi-homed peers. 
If an IGMP Leave Synch route is received + with a Maximum Response Time of zero, then irrespective of the local IGMP + configuration it MAY be processed as an immediate leave. + +9 IGMP Version 1 membership request + + This document does not provide any detail about IGMPv1 processing. + Multicast working groups are in the process of obsoleting the use of IGMPv1, + so implementations are RECOMMENDED to use IGMPv2 / MLDv1 and above + only. + +10 Security Considerations Same security considerations as [RFC7432]. -10 IANA Considerations +11 IANA Considerations IANA has allocated the following codepoints from the EVPN Extended Community sub-types registry. 0x09 Multicast Flags Extended Community [this document] 0x0A EVI-RT Type 0 [this document] 0x0B EVI-RT Type 1 [this document] 0x0C EVI-RT Type 2 [this document] IANA is requested to allocate a new codepoint from the EVPN Extended @@ -1149,44 +1245,60 @@ The Multicast Flags Extended Community contains a 16-bit Flags field. The bits are numbered 0-15, from low-order to high-order. The registry should be initialized as follows: 0 : IGMP Proxy Support [this document] 1-15 : unassigned The registration policy should be "Standards Action". -11 References +12 References -11.1 Normative References +12.1 Normative References [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, March 1997. + Requirement Levels", BCP 14, RFC 2119, DOI + 10.17487/RFC2119, March 1997, . [RFC4360] S. Sangli et al., "BGP Extended Communities Attribute", February, 2006. [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", February, 2015. -11.2 Informative References +12.2 Informative References - [ETREE-FMWK] Key et al., "A Framework for E-Tree Service over MPLS - Network", draft-ietf-l2vpn-etree-frwk-03, work in progress, September - 2013. + [RFC7387] Key, et al., "A Framework for Ethernet Tree (E-Tree) + Service over a Multiprotocol Label Switching (MPLS) Network", October + 2014. 
- [PBB-EVPN] Sajassi et al., "PBB-EVPN", draft-ietf-l2vpn-pbb-evpn- - 05.txt, work in progress, October, 2013. + [RFC7623] Sajassi, et al., "Provider Backbone Bridging Combined with + Ethernet VPN (PBB-EVPN)", September 2015. - [RFC4541] Christensen, M., Kimball, K., and F. Solensky, - "Considerations for IGMP and MLD snooping PEs", RFC 4541, 2006. + [RFC4541] Christensen, M., Kimball, K., and F. Solensky, + "Considerations for IGMP and MLD snooping PEs", 2006. + + [RFC3376] Cain, et al., "Internet Group Management Protocol, Version + 3", October 2002. + + [RFC3810] Vida & Costa, "Multicast Listener Discovery Version 2 + (MLDv2) for IPv6", June 2004. + +13 Acknowledgement + +14 Contributors + + Mankamana Mishra + Cisco Systems + Email: mankamis@cisco.com Authors' Addresses Ali Sajassi Cisco Email: sajassi@cisco.com Samir Thoria Cisco Email: sthoria@cisco.com