--- 1/draft-ietf-mboned-dc-deploy-08.txt   2020-02-04 12:27:56.918951497 -0800
+++ 2/draft-ietf-mboned-dc-deploy-09.txt   2020-02-04 12:27:56.962952611 -0800
@@ -1,27 +1,27 @@
 MBONED                                                       M. McBride
 Internet-Draft                                                 Futurewei
 Intended status: Informational                              O. Komolafe
 Expires: August 7, 2020                                  Arista Networks
                                                         February 4, 2020

                   Multicast in the Data Center Overview
-                     draft-ietf-mboned-dc-deploy-08
+                     draft-ietf-mboned-dc-deploy-09

 Abstract

    The volume and importance of one-to-many traffic patterns in data
    centers is likely to increase significantly in the future.  Reasons
    for this increase are discussed and then attention is paid to the
-   manner in which this traffic pattern may be judiously handled in data
-   centers.  The intuitive solution of deploying conventional IP
+   manner in which this traffic pattern may be judiciously handled in
+   data centers.  The intuitive solution of deploying conventional IP
    multicast within data centers is explored and evaluated.  Thereafter,
    a number of emerging innovative approaches are described before a
    number of recommendations are made.

 Status of This Memo

    This Internet-Draft is submitted in full conformance with the
    provisions of BCP 78 and BCP 79.

    Internet-Drafts are working documents of the Internet Engineering
@@ -76,35 +76,34 @@
    7.  Security Considerations . . . . . . . . . . . . . . . . . . . 15
    8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . 15
    9.  References  . . . . . . . . . . . . . . . . . . . . . . . . . 15
      9.1.  Normative References  . . . . . . . . . . . . . . . . . . 15
      9.2.  Informative References  . . . . . . . . . . . . . . . . . 16
    Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . . 18

 1.  Introduction

    The volume and importance of one-to-many traffic patterns in data
-   centers is likely to increase significantly in the future.  Reasons
-   for this increase include the nature of the traffic generated by
-   applications hosted in the data center, the need to handle broadcast,
-   unknown unicast and multicast (BUM) traffic within the overlay
-   technologies used to support multi-tenancy at scale, and the use of
-   certain protocols that traditionally require one-to-many control
-   message exchanges.
+   centers will likely continue to increase.  Reasons for this increase
+   include the nature of the traffic generated by applications hosted in
+   the data center, the need to handle broadcast, unknown unicast and
+   multicast (BUM) traffic within the overlay technologies used to
+   support multi-tenancy at scale, and the use of certain protocols that
+   traditionally require one-to-many control message exchanges.

-   These trends, allied with the expectation that future highly
-   virtualized large-scale data centers must support communication
-   between potentially thousands of participants, may lead to the
-   natural assumption that IP multicast will be widely used in data
-   centers, specifically given the bandwidth savings it potentially
-   offers.  However, such an assumption would be wrong.  In fact, there
-   is widespread reluctance to enable conventional IP multicast in data
+   These trends, allied with the expectation that highly virtualized
+   large-scale data centers must support communication between
+   potentially thousands of participants, may lead to the natural
+   assumption that IP multicast will be widely used in data centers,
+   specifically given the bandwidth savings it potentially offers.
+   However, such an assumption would be wrong.  In fact, there is
+   widespread reluctance to enable conventional IP multicast in data
    centers for a number of reasons, mostly pertaining to concerns about
    its scalability and reliability.

    This draft discusses some of the main drivers for the increasing
    volume and importance of one-to-many traffic patterns in data
    centers.  Thereafter, the manner in which conventional IP multicast
    may be used to handle this traffic pattern is discussed and some of
    the associated challenges highlighted.  Following this discussion, a
    number of alternative emerging approaches are introduced, before
    concluding by discussing key trends and making a number of
@@ -217,23 +216,23 @@
    Another key contributor to the rise in one-to-many traffic patterns
    is the proposed architecture for supporting large-scale multi-tenancy
    in highly virtualized data centers [RFC8014].  In this architecture,
    a tenant's VMs are distributed across the data center and are
    connected by a virtual network known as the overlay network.  A
    number of different technologies have been proposed for realizing the
    overlay network, including VXLAN [RFC7348], VXLAN-GPE [I-D.ietf-nvo3-
    vxlan-gpe], NVGRE [RFC7637] and GENEVE [I-D.ietf-nvo3-geneve].  The
    often fervent and arguably partisan debate about the relative merits
    of these overlay technologies belies the fact that, conceptually, it
-   may be said that these overlays mainly simply provide a means to
-   encapsulate and tunnel Ethernet frames from the VMs over the data
-   center IP fabric, thus emulating a Layer 2 segment between the VMs.
+   may be said that these overlays simply provide a means to encapsulate
+   and tunnel Ethernet frames from the VMs over the data center IP
+   fabric, thus emulating a Layer 2 segment between the VMs.
    Consequently, the VMs believe and behave as if they are connected to
    the tenant's other VMs by a conventional Layer 2 segment, regardless
    of their physical location within the data center.

    Naturally, in a Layer 2 segment, point to multi-point traffic can
    result from handling BUM (broadcast, unknown unicast and multicast)
    traffic.  And, compounding this issue within data centers, since the
    tenant's VMs attached to the emulated segment may be dispersed
    throughout the data center, the BUM traffic may need to traverse the
    data center fabric.
@@ -278,27 +277,27 @@
    Section 2.1, Section 2.2 and Section 2.3 have discussed how the
    trends in the types of applications, the overlay technologies used
    and some of the essential networking protocols results in an increase
    in the volume of one-to-many traffic patterns in modern highly-
    virtualized data centers.  Section 3 explores how such traffic flows
    may be handled using conventional IP multicast.

 3.  Handling one-to-many traffic using conventional multicast

-   Faced with ever increasing volumes of one-to-many traffic flows for
-   the reasons presented in Section 2, arguably the intuitive initial
-   course of action for a data center operator is to explore if and how
-   conventional IP multicast could be deployed within the data center.
-   This section introduces the key protocols, discusses some example use
-   cases where they are deployed in data centers and discusses some of
-   the advantages and disadvantages of such deployments.
+   Faced with ever increasing volumes of one-to-many traffic flows, for
+   the reasons presented in Section 2, it makes sense for a data center
+   operator to explore if and how conventional IP multicast could be
+   deployed within the data center.  This section introduces the key
+   protocols, discusses some example use cases where they are deployed
+   in data centers and discusses some of the advantages and
+   disadvantages of such deployments.

 3.1.  Layer 3 multicast

    PIM is the most widely deployed multicast routing protocol and so,
    unsurprisingly, is the primary multicast routing protocol considered
    for use in the data center.  There are three potential popular modes
    of PIM that may be used: PIM-SM [RFC4601], PIM-SSM [RFC4607] or PIM-
    BIDIR [RFC5015].  It may be said that these different modes of PIM
    tradeoff the optimality of the multicast forwarding tree for the
    amount of multicast forwarding state that must be maintained at
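   As an illustrative aside on the conventional IP multicast machinery
   discussed in the hunk above, the following Python sketch shows the
   host side of such a deployment: a receiver joining an Any-Source
   Multicast group, which is the action that generates the IGMP
   membership report a PIM-SM last-hop router would act upon.  The
   group address, port and interface choice are invented for the
   example and are not taken from the draft.

   # Illustrative sketch only: a minimal receiver joining an ASM
   # multicast group.  The join triggers an IGMP membership report,
   # which is what ultimately drives PIM tree building in the fabric.
   # Group address and port below are invented examples.
   import socket
   import struct

   GROUP = "239.2.2.2"   # assumed ASM group address
   PORT = 5000           # assumed UDP port

   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
   sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
   sock.bind(("", PORT))

   # ip_mreq: group address + local interface (INADDR_ANY lets the
   # kernel choose the interface).
   mreq = socket.inet_aton(GROUP) + struct.pack("!I", socket.INADDR_ANY)
   sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

   data, sender = sock.recvfrom(2048)
   print(f"received {len(data)} bytes from {sender}")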
@@ -389,23 +388,23 @@
    specification [RFC7348], a data-driven flood and learn control plane
    was proposed, requiring the data center IP fabric to support
    multicast routing.  A multicast group is associated with each virtual
    network, each uniquely identified by its VXLAN network identifiers
    (VNI).  VXLAN tunnel endpoints (VTEPs), typically located in the
    hypervisor or ToR switch, with local VMs that belong to this VNI
    would join the multicast group and use it for the exchange of BUM
    traffic with the other VTEPs.  Essentially, the VTEP would
    encapsulate any BUM traffic from attached VMs in an IP multicast
    packet, whose destination address is the associated multicast group
-   address, and transmit the packet to the data center fabric.  Thus,
-   PIM must be running in the fabric to maintain a multicast
-   distribution tree per VNI.
+   address, and transmit the packet to the data center fabric.  Thus, a
+   multicast routing protocol (typically PIM) must be running in the
+   fabric to maintain a multicast distribution tree per VNI.

    Alternatively, rather than setting up a multicast distribution tree
    per VNI, a tree can be set up whenever hosts within the VNI wish to
    exchange multicast traffic.  For example, whenever a VTEP receives an
    IGMP report from a locally connected host, it would translate this
    into a PIM join message which will be propagated into the IP fabric.
    In order to ensure this join message is sent to the IP fabric rather
    than over the VXLAN interface (since the VTEP will have a route back
    to the source of the multicast packet over the VXLAN interface and so
    would naturally attempt to send the join over this interface) a more
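   To make the flood-and-learn behaviour described in the hunk above
   concrete, the following Python sketch shows a highly simplified
   VTEP-like sender that maps a VNI to an assumed multicast group and
   encapsulates a BUM Ethernet frame in a VXLAN/UDP packet addressed to
   that group.  The VNI values, group addresses and socket handling are
   invented for illustration and are not taken from the draft; a real
   VTEP would additionally learn remote VTEPs and handle much more.

   # Illustrative sketch only: a minimal view of how a VTEP could map
   # each VNI to a multicast group and send BUM traffic to that group.
   # The VNI-to-group mapping and addresses below are invented examples.
   import socket
   import struct

   VXLAN_PORT = 4789  # IANA-assigned VXLAN UDP port

   # Assumed per-VNI multicast groups (would normally come from config).
   VNI_TO_GROUP = {
       10001: "239.1.1.1",
       10002: "239.1.1.2",
   }

   def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
       """Prepend the 8-byte VXLAN header: flags (I bit) + 24-bit VNI."""
       return struct.pack("!II", 0x08000000, vni << 8) + inner_frame

   def send_bum(vni: int, inner_frame: bytes) -> None:
       """Send a BUM frame to the multicast group associated with the VNI."""
       group = VNI_TO_GROUP[vni]
       sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
       sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 8)
       sock.sendto(vxlan_encapsulate(vni, inner_frame), (group, VXLAN_PORT))
       sock.close()

   if __name__ == "__main__":
       # A dummy broadcast Ethernet frame (destination ff:ff:ff:ff:ff:ff).
       frame = bytes.fromhex("ffffffffffff") + b"\x00" * 58
       send_bum(10001, frame)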
@@ -619,30 +618,30 @@
    to the packet.  This header contains a bit string in which each bit
    maps to an egress router, known as Bit-Forwarding Egress Router
    (BFER).  If a bit is set, then the packet should be forwarded to the
    associated BFER.  The routers within the BIER domain, Bit-Forwarding
    Routers (BFRs), use the BIER header in the packet and information in
    the Bit Index Forwarding Table (BIFT) to carry out simple bit-
    wise operations to determine how the packet should be replicated
    optimally so it reaches all the appropriate BFERs.

    BIER is deemed to be attractive for facilitating one-to-many
-   communications in data centers [I-D.ietf-bier-use-cases].  The
-   deployment envisioned with overlay networks is that the the
-   encapsulation endpoints would be the BFIR.  So knowledge about the
-   actual multicast groups does not reside in the data center fabric,
-   improving the scalability compared to conventional IP multicast.
-   Additionally, a centralized controller or a BGP-EVPN control plane
-   may be used with BIER to ensure the BFIR have the required
-   information.  A challenge associated with using BIER is that it
-   requires changes to the forwarding behaviour of the routers used in
-   the data center IP fabric.
+   communications in data centers [I-D.ietf-bier-use-cases].  The BFIRs
+   are the encapsulation endpoints in the deployment envisioned with
+   overlay networks.  So knowledge about the actual multicast groups
+   does not reside in the data center fabric, improving the scalability
+   compared to conventional IP multicast.  Additionally, a centralized
+   controller or a BGP-EVPN control plane may be used with BIER to
+   ensure the BFIR have the required information.  A challenge
+   associated with using BIER is that it requires changes to the
+   forwarding behaviour of the routers used in the data center IP
+   fabric.

 4.5.  Segment Routing

    Segment Routing (SR) [RFC8402] is a manifestation of the source
    routing paradigm, so called as the path a packet takes through a
    network is determined at the source.  The source encodes this
    information in the packet header as a sequence of instructions.
    These instructions are followed by intermediate routers, ultimately
    resulting in the delivery of the packet to the desired destination.
    In SR, the instructions are known as segments and a number of