draft-ietf-mboned-anycast-rp-01.txt   draft-ietf-mboned-anycast-rp-02.txt 
MBONED Working Group Dorian Kim MBONED Working Group Dorian Kim
Internet Draft Verio Internet Draft Verio
David Meyer David Meyer
Cisco Systems Cisco Systems
Henry Kilmer Henry Kilmer
Dino Farinacci Dino Farinacci
Procket Networks
Category Informational Category Informational
November, 1999 November, 1999
Anycast RP mechanism using PIM and MSDP Anycast RP mechanism using PIM and MSDP
<draft-ietf-mboned-anycast-rp-01.txt> <draft-ietf-mboned-anycast-rp-02.txt>
1. Status of this Memo 1. Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
2026 are working documents of the Internet Engineering Task Force Internet-Drafts are working documents of the Internet Engineering
(IETF), its areas, and its working groups. Note that other groups Task Force (IETF), its areas, and its working groups. Note that
may also distribute working documents as Internet- Drafts. other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
2. Abstract 2. Abstract
This document describes a mechanism to allow for an arbitrary number This document describes a mechanism to allow for an arbitrary number
of RPs per group in a single share-tree PIM-SM domain. of RPs per group in a single shared-tree PIM-SM domain.
This memo is a product of the MBONE Deployment Working Group (MBONED) This memo is a product of the MBONE Deployment Working Group (MBONED)
in the Operations and Management Area of the Internet Engineering in the Operations and Management Area of the Internet Engineering
Task Force. Submit comments to <mboned@ns.uoregon.edu> or the Task Force. Submit comments to <mboned@ns.uoregon.edu> or the
authors. authors.
3. Copyright Notice 3. Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved. Copyright (C) The Internet Society (1999). All Rights Reserved.
4. Introduction 4. Introduction
PIM-SM as currently defined allows for only a single active RP per PIM-SM as defined in RFC 2362 allows for only a single active RP per
group, and as such the decision of optimal RP placement can become group, and as such the decision of optimal RP placement can become
problematic for a multi-regional network deploying PIM-SM. problematic for a multi-regional network deploying PIM-SM.
The single active RP, or flat RP space design of PIM-SM has several Anycast RP relaxes an important constraint in PIM-SM, namely, that
implications, including traffic concentration, lack of scalable load there can be only one group to RP mapping active at any time. The
balancing and redundancy between RPs, sub-optimal forwarding of single mapping property has several implications, including traffic
multicast packets, and distant RP dependencies. These properties of concentration, lack of scalable register decapsulation (when using
PIM-SM have been demonstrated in recent native continental or inter- the shared tree), slow convergence when an active RP fails, possible
continental scale multicast deployments. As a result, it became clear sub-optimal forwarding of multicast packets, and distant RP
that ISP backbones require a mechanism that allows definition of dependencies. These properties of PIM-SM have been demonstrated in
multiple active RPs per group in single PIM-SM domain. Further, any native continental or inter-continental scale multicast deployments.
such mechanism should also addresses the issues addressed above. As a result, it is clear that ISP backbones require a mechanism that
allows definition of multiple active RPs per group in single PIM-SM
domain. Further, any such mechanism should also address the issues
addressed above.
The mechanism described here is intended to address the need for The mechanism described here is intended to address the need for
redundancy and load sharing among RPs in a domain. It is primarily better fail-over (convergence time) and sharing of the register
intended for application within those networks which are using MBGP, decapsulation load (again, when using the shared-tree) among RPs in a
MSDP and PIM-SM protocols for native multicast deployment, although domain. It is primarily intended for applications within those
networks which are using MBGP, Multicast Source Discovery Protocol
[MSDP] and PIM-SM protocols for native multicast deployment, although
it is not limited to those protocols. In particular, Anycast RP is it is not limited to those protocols. In particular, Anycast RP is
applicable in any PIM-SM network that also supports MSDP (MSDP is applicable in any PIM-SM network that also supports MSDP (MSDP is
required so that the various RPs in the domain maintain a consistent required so that the various RPs in the domain maintain a consistent
view of the sources that are active). Note however, a domain view of the sources that are active). Note however, a domain
deploying Anycast RP is not required to run MBGP. deploying Anycast RP is not required to run MBGP.
5. Problem Definition 5. Problem Definition
The anycast RP solution provides a solution for both redundancy and The anycast RP solution provides a solution for both fast fail-over
load balancing among any number of active RPs in a domain. and shared-tree load balancing among any number of active RPs in a
domain.
5.1. Traffic Concentration and Load Balancing Between RPs 5.1. Traffic Concentration and Distributing Decapsulation Load Among RPs
While PIM-SM allows for multiple RPs to be defined for a given group, While PIM-SM allows for multiple RPs to be defined for a given group,
only one group to RP mapping can be active at a given time. A only one group to RP mapping can be active at a given time. A
traditional deployment mechanism for load balancing between multiple traditional deployment mechanism for balancing register decapsulation
RPs covering the multicast group space is to split up the 224.0.0.0/4 load between multiple RPs covering the multicast group space is to
space between multiple defined RPs. This is an acceptable solution as split up the 224.0.0.0/4 space between multiple defined RPs. This is
long as multicast traffic remains low, but has problems as multicast an acceptable solution as long as multicast traffic remains low, but
traffic increases, especially because the network operator defining has problems as multicast traffic increases, especially because the
group space split between RPs does not always have a priori knowledge network operator defining group space split between RPs does not
of traffic distribution between groups. This can be overcome via always have a priori knowledge of traffic distribution between groups.
periodic reconfigurations, but operational considerations cause this This can be overcome via periodic reconfigurations, but operational
type of solution to scale poorly. The other alternative to periodic considerations cause this type of solution to scale poorly.
reconfiguration is to split 224.0.0.0/4 space more finely between
more RPs, but this solution can have the disadvantage of creating
more complex RP configurations, along with the attendant operational
problems when RPs are configured [CLUSTERS].
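The traditional static split of the 224.0.0.0/4 space described above can be sketched as follows. This is only an illustration of the technique that Anycast RP is meant to replace, not something taken from either draft: the function names and the RP addresses (192.0.2.1, 192.0.2.2) are hypothetical, and the split is assumed to be into equal-sized blocks.

```python
import ipaddress


def split_group_space(rp_addresses, group_space="224.0.0.0/4"):
    """Statically divide the multicast group space among the given RPs.

    Assumption: the space is cut into equal blocks, and the number of
    blocks is the smallest power of two that is >= the number of RPs.
    """
    space = ipaddress.ip_network(group_space)
    extra_bits = (len(rp_addresses) - 1).bit_length()
    blocks = list(space.subnets(prefixlen_diff=extra_bits))
    # Blocks are handed out round-robin, so each block maps to exactly one RP.
    return {
        block: rp_addresses[i % len(rp_addresses)]
        for i, block in enumerate(blocks)
    }


def rp_for_group(mapping, group):
    """Return the single RP that serves the given group address."""
    addr = ipaddress.ip_address(group)
    for block, rp in mapping.items():
        if addr in block:
            return rp
    raise ValueError(f"{group} is not covered by any group-to-RP mapping")


# Hypothetical RP addresses, for illustration only.
mapping = split_group_space(["192.0.2.1", "192.0.2.2"])
for block, rp in mapping.items():
    print(block, "->", rp)
```

Whatever the traffic per group turns out to be, every group in a block is pinned to one RP until the operator reconfigures the split, which is the scaling problem this section describes.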
5.2. Sub-optimal Forwarding of Multicast Packets 5.2. Sub-optimal Forwarding of Multicast Packets
When a single RP serves a given multicast group, all joins to that When a single RP serves a given multicast group, all joins to that
group will be sent to that RP regardless of the topological distance group will be sent to that RP regardless of the topological distance
between the RP and the sources and receivers. Initial data will be between the RP and the sources and receivers. Initial data will be
sent towards the RP also until configured shortest path tree switch sent towards the RP also until configured shortest path tree switch
threshold is is reached, or the data will always be sent towards the threshold is reached, or the data will always be sent towards the RP
RP if the network is configured to always use RP rooted shared tree. if the network is configured to always use RP rooted shared tree.
This holds true even if all the sources and the receivers are in any This holds true even if all the sources and the receivers are in any
given single region, and RP is topologically distant from the sources given single region, and RP is topologically distant from the sources
and the receivers. This is an artifact of the dynamic nature of and the receivers. This is an artifact of the dynamic nature of
multicast group members, and of the fact that operators may not multicast group members, and of the fact that operators may not
always have a priori knowledge of the topological placement of the always have a priori knowledge of the topological placement of the
group members. group members.
Taken together, these effects can mean that (for example) although Taken together, these effects can mean that (for example) although
all the sources and receivers of a given group are in Europe, they all the sources and receivers of a given group are in Europe, they
are joining towards the RP in USA and the data will be traversing are joining towards the RP in USA and the data will be traversing
relatively expensive pipe(s) twice, once to get to RP, and back down relatively expensive pipe(s) twice, once to get to RP, and back down
the RP rooted tree again, creating inefficient use of expensive the RP rooted tree again, creating inefficient use of expensive
resources. resources.
5.3. Distant RP Dependencies 5.3. Distant RP Dependencies
As outlined above, single active RP per group may cause local sources As outlined above, a single active RP per group may cause local
and receivers to become dependent on a topologically distant RP. In sources and receivers to become dependent on a topologically distant
case of a scenario where there are backup RPs configured, distant RP RP. In case of a scenario where there are backup RPs configured,
dependence can be created due to the failure of the primary RP, which distant RP dependence can be created due to the failure of the
is topologically closer, and may become exacerbated by switching to primary RP, which is topologically closer, and may become exacerbated
the backup RP, which may be even more distant topologically, which by switching to the backup RP, which may be even more distant
may lead to inferior performance, if not outright loss of topologically, which may lead to inferior performance, if not
connectivity to an RP serving the group, depending on the network outright loss of connectivity to an RP serving the group, depending
condition at the given moment. on the network condition at the given moment.
6. Solution 6. Solution
Given the problem set outlined above, a good solution would allow an Given the problem set outlined above, a good solution would allow an
operator to define multiple RPs per group, and distribute those RPs operator to configure multiple RPs per group, and distribute those
in a topologically significant manner to the sources and receivers. RPs in a topologically significant manner to the sources and
receivers.
6.1. Mechanisms 6.1. Mechanisms
All the RPs serving a given group or set of groups are configured All the RPs serving a given group or set of groups are configured
with an identical unicast address, using a numbered interface on the RPs with an identical unicast address, using a numbered interface on the RPs
(frequently a logical interface such as a loopback is used). RPs then (frequently a logical interface such as a loopback is used). RPs then
advertise group to RP mappings using this interface address. This advertise group to RP mappings using this interface address. This
will cause group members (senders) to join (register) towards the will cause group members (senders) to join (register) towards the
topologically closest RP. RPs MSDP peer with each other using the topologically closest RP. RPs MSDP peer with each other using an
unique shared addresses. Note that if the router implementation address unique to each RP. Note that if the router implementation
chooses the shared address for the BGP router ID, then BGP peerings chooses the anycast address as the router ID, then peerings and/or
will not be established. As a result, care should be taken to avoid adjacencies may not be established.
the ambiguity of the BGP router ID with the RP address (for example,
if the logical address chosen is the highest IP address configured on
the router, and the router implementation that automatically chooses
a router ID based upon highest IP address assigned to interfaces).
Finally, the solution described here can be implemented without any
modification to existing protocols or their implementations.
Operationally, the following steps are required:
6.1.1. Create the set of group-to-anycast-RP-address mappings
The first step is to create the set of group-to-anycast-RP-address
mappings to be used in the domain. Each RP participating in an anycast
RP set must be configured with a consistent set of group to RP
address mappings. This mapping will be used by the non-RP routers in
the domain.
6.1.2. Configure each RP for the group range with the anycast RP address
The next step is to configure each RP for the group range with the
anycast RP address. If a dynamic mechanism such as auto-RP or the
PIMv2 bootstrap mechanism is being used to advertise group to RP
mappings, the anycast IP address should be used for the RP address.
6.1.3. Configure MSDP peerings between each of the anycast RPs in the
set
Unlike the group to RP mapping advertisements, MSDP peerings must use
an IP address that is unique to the endpoints. A general guideline is
to follow the addressing of the BGP peerings, e.g., loopbacks for
iBGP peering, physical interface addresses for eBGP peering.
6.1.4. Configure the non-RP's with the group-to-anycast-RP-address
mappings
Finally, each non-RP router must learn the set of group to RP
mappings. This could be done via static configuration, auto-RP, or by
PIMv2 bootstrap mechanism.
6.1.5. Ensure that the anycast IP address is reachable by all routers in
the domain
This is typically accomplished by injecting the /32 into the domain's
IGP.
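The operational constraints listed in the steps above can be checked mechanically. The following is a minimal sketch, assuming a simplified in-memory model of the RP configuration; the class, field and function names are hypothetical and do not correspond to any router's configuration language. It also assumes that "peerings between each of the anycast RPs" means a full mesh.

```python
from dataclasses import dataclass, field


@dataclass
class RP:
    name: str
    anycast_address: str   # shared address advertised in group-to-RP mappings
    msdp_address: str      # address unique to this RP, used for MSDP peerings
    router_id: str
    msdp_peers: set = field(default_factory=set)  # MSDP addresses of peers


def check_anycast_rp_set(rps, igp_host_routes):
    """Return a list of problems found in an anycast RP set (empty if none)."""
    problems = []
    anycast = {rp.anycast_address for rp in rps}
    if len(anycast) != 1:
        problems.append("RPs do not share one anycast RP address")
    msdp = [rp.msdp_address for rp in rps]
    if len(set(msdp)) != len(msdp):
        problems.append("MSDP peering addresses are not unique per RP")
    for rp in rps:
        if rp.msdp_address in anycast:
            problems.append(f"{rp.name}: MSDP peering uses the anycast address")
        if rp.router_id in anycast:
            problems.append(f"{rp.name}: router ID equals the anycast address")
        # Assumed full mesh: every RP peers with every other RP in the set.
        expected = {o.msdp_address for o in rps if o is not rp}
        if not expected <= rp.msdp_peers:
            problems.append(f"{rp.name}: missing MSDP peering in the set")
    for address in sorted(anycast):
        # The anycast address is assumed to be injected as a /32 into the IGP.
        if f"{address}/32" not in igp_host_routes:
            problems.append(f"{address}/32 is not reachable via the IGP")
    return problems


# Hypothetical two-RP domain, for illustration only.
rp_a = RP("rp-a", "10.0.0.1", "10.1.1.1", "10.1.1.1", {"10.1.1.2"})
rp_b = RP("rp-b", "10.0.0.1", "10.1.1.2", "10.1.1.2", {"10.1.1.1"})
print(check_anycast_rp_set([rp_a, rp_b], {"10.0.0.1/32"}))
```

The check on the router ID reflects the note in 6.1 that peerings may fail if an implementation picks the anycast address as its router ID.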
6.2. Interaction with MSDP Peer-RPF check 6.2. Interaction with MSDP Peer-RPF check
Each MSDP peer receives and forwards the message away from the RP Each MSDP peer receives and forwards the message away from the RP
address in a "peer-RPF flooding" fashion. The notion of peer-RPF address in a "peer-RPF flooding" fashion. The notion of peer-RPF
flooding is with respect to forwarding SA messages [MSDP]. The BGP or flooding is with respect to forwarding SA messages [MSDP]. The BGP
MBGP routing tables are examined to determine which peer is the next routing tables are examined to determine which peer is the next hop
hop towards the originating RP of the SA message. Such a peer is towards the originating RP of the SA message. Such a peer is called
called an "RPF peer". See [MSDP] for details of the Peer-RPF check. an "RPF peer". See [MSDP] for details of the Peer-RPF check.
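The peer-RPF idea described above can be illustrated with a short sketch. It is deliberately simplified: the full rule set is defined in [MSDP], and here a plain dictionary (a hypothetical stand-in for the BGP routing table lookup) maps each originating RP to the MSDP peer that is the next hop towards it. All names and addresses are assumptions made for the example.

```python
def rpf_peer(originating_rp, next_hop_peer):
    """Return the MSDP peer that is the next hop towards the originating RP."""
    return next_hop_peer.get(originating_rp)


def handle_sa(originating_rp, received_from, peers, next_hop_peer):
    """Accept an SA only from the RPF peer, then flood it to the other peers.

    Returns the list of peers the SA is forwarded to; an empty list means
    the SA failed the peer-RPF check (or there is nobody left to flood to).
    """
    if received_from != rpf_peer(originating_rp, next_hop_peer):
        return []  # not received from the RPF peer: drop the SA
    # Forward away from the originating RP, never back to the sender.
    return [peer for peer in peers if peer != received_from]


# Hypothetical topology: the route towards RP 10.1.1.1 points at peer "B".
routes = {"10.1.1.1": "B"}
print(handle_sa("10.1.1.1", "B", ["B", "C", "D"], routes))
print(handle_sa("10.1.1.1", "C", ["B", "C", "D"], routes))
```

Because each anycast RP originates SA messages from its unique MSDP address rather than the shared RP address, the lookup towards the originating RP stays unambiguous.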
6.3. Further Applications of Anycast RP mechanism 6.3. State Implications
It should be noted that using MSDP in this way forces the creation of
(S,G) state along the path from the receiver to the source. This
state may not be present if a single RP was used and receivers were
forced to stay on the shared tree.
6.4. Further Applications of Anycast RP mechanism
The solution described above can also be applied to external MSDP The solution described above can also be applied to external MSDP
peers that are used to join two PIM-SM domains together. This can peers that are used to join two PIM-SM domains together. This can
provide redundancy to the MSDP peering session, ease operational provide redundancy to the MSDP peering session, ease operational
complexity as well as simplify configuration management. A side complexity as well as simplify configuration management. A side
effect to be aware of with this design is that which of the effect to be aware of with this design is that which of the
configured MSDP sessions comes up will be determined via the unicast configured MSDP sessions comes up will be determined via the unicast
topology between two providers, and can be some what unpredictable. topology between two providers, and can be somewhat unpredictable. If
If any of the backup peering sessions resets, the active session will any of the backup peering sessions resets, the active session will
also reset. also reset.
7. Multicast State Scaling 7. Security considerations
Let k = m + r, where
r = registering to an RP
m = number internal sources learned through MSDP
p = number of anycast (internal) MSDP peers
For p = 1, m = 0
0 receivers ==> 1 (*,G) + 0 SAs
Greater than 1 receiver ==> k (S,G) + 0 SAs
For p > 1, m != 0
0 receivers ==> 1 (*,G) + m SAs
Greater than 1 receiver ==> k (S,G) + m SAs
Importantly, the multicast state growth is O(k), where k is not a
function of p, the number of anycast RP peers.
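The state counts tabulated above can be transcribed into a small function. This is only a sketch of the table as written: the function name is hypothetical, r is assumed to mean the number of sources registering to the RP, and the "0 receivers" and "greater than 1 receiver" rows are collapsed into a single has_receivers flag.

```python
def multicast_state(r, m, p, has_receivers):
    """Return (star_g_entries, s_g_entries, sa_entries) per the table above.

    r: sources registering to the RP (assumed meaning of "r")
    m: internal sources learned through MSDP
    p: number of anycast (internal) MSDP peers
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if p == 1 and m != 0:
        raise ValueError("with p = 1 the table assumes m = 0")
    k = m + r
    sa_entries = m if p > 1 else 0
    if not has_receivers:
        return (1, 0, sa_entries)
    return (0, k, sa_entries)


# State is the same for 2 peers and for 10 peers: k does not depend on p.
print(multicast_state(r=3, m=5, p=2, has_receivers=True))
print(multicast_state(r=3, m=5, p=10, has_receivers=True))
```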
8. Security considerations
Since the solution described here makes heavy use of anycast Since the solution described here makes heavy use of anycast
addressing, care must be taken to avoid spoofing. In particular addressing, care must be taken to avoid spoofing. In particular
unicast routing and PIM RPs must be protected. unicast routing and PIM RPs must be protected.
8.1. Unicast Routing 7.1. Unicast Routing
Both internal and external unicast routing can be weakly protected Both internal and external unicast routing can be weakly protected
with keyed MD5 [RFC1828], as implemented in an internal protocol such with keyed MD5 [RFC1828], as implemented in an internal protocol such
as OSPF [RFC2382] or in BGP [RFC2385]. More generally, IPSEC as OSPF [RFC2382] or in BGP [RFC2385]. More generally, IPSEC
[RFC1825] could be used to provide protocol integrity for the unicast [RFC1825] could be used to provide protocol integrity for the unicast
routing system. routing system.
8.2. Multicast Protocol Integrity 7.1.1. Effects of Unicast Routing Instability
While not a security issue, it is worth noting that if unicast
routing is unstable, then the actual RP that a source or receiver is
using will be subject to the same instability.
7.2. Multicast Protocol Integrity
The mechanisms described in [PIMAUTH] should be used to provide The mechanisms described in [PIMAUTH] should be used to provide
protocol message integrity protection and group-wise message origin protocol message integrity protection and group-wise message origin
authentication. authentication.
8.3. MSDP Peer Integrity 7.3. MSDP Peer Integrity
As is the case for BGP, MSDP peers can be protected using keyed As is the case for BGP, MSDP peers can be protected using keyed
MD5 [RFC1828]. MD5 [RFC1828].
9. Acknowledgments 8. Acknowledgments
John Meylor, Dave Thaler and Tom Pusateri provided insightful John Meylor, Dave Thaler and Tom Pusateri provided insightful
comments on earlier versions of this idea. comments on earlier versions of this idea.
10. References 9. References
[CLUSTERS] D. Farinacci, et. al., "Use of Anycast Clusters for
Inter-Domain Multicast Routing",
draft-ietf-farinacci-anycast-clusters-01.txt, March,
1998. ftp://ftpeng.cisco.com/ipmulticast/internet-drafts
[MSDP] D. Farinacci, et. al., "Multicast Source Discovery [MSDP] D. Farinacci, et. al., "Multicast Source Discovery
Protocol (MSDP)", draft-ietf-msdp-spec-02.txt, Protocol (MSDP)", draft-ietf-msdp-spec-02.txt,
November, 1999. November, 1999.
[PIMAUTH] L. Wei, et al., "Authenticating PIM version 2 messages", [PIMAUTH] L. Wei, et al., "Authenticating PIM version 2 messages",
draft-ietf-pim-v2-auth-00.txt, November, 1998. draft-ietf-pim-v2-auth-00.txt, November, 1998.
[RFC1825] Atkinson, R., "IP Security Architecture", August 1995. [RFC1825] Atkinson, R., "IP Security Architecture", August 1995.
skipping to change at page 7, line 23 skipping to change at page 8, line 5
2362, June, 1998. 2362, June, 1998.
[RFC2382] Moy, J., "OSPF Version 2", RFC 2328, April 1998. [RFC2382] Moy, J., "OSPF Version 2", RFC 2328, April 1998.
[RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP [RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP
MD5 Signature Option", RFC 2385, August, 1998. MD5 Signature Option", RFC 2385, August, 1998.
[RFC2403] C. Madson and R. Glenn, "The Use of HMAC-MD5-96 within [RFC2403] C. Madson and R. Glenn, "The Use of HMAC-MD5-96 within
ESP and AH", RFC 2403, November, 1998. ESP and AH", RFC 2403, November, 1998.
11. Author's Address 10. Author's Address
Dorian Kim Dorian Kim
Verio, Inc. Verio, Inc.
2361 Lancashire Dr. #2A 2361 Lancashire Dr. #2A
Ann Arbor, MI 48015 Ann Arbor, MI 48015
Email: dorian@blackrose.org Email: dorian@blackrose.org
Hank Kilmer Hank Kilmer
Email: hank@rem.com Email: hank@rem.com
Dino Farinacci Dino Farinacci
Email: dino@dinof.net Procket Networks
Email: dino@procket.com
David Meyer David Meyer
Cisco Systems, Inc. Cisco Systems, Inc.
170 Tasman Drive 170 Tasman Drive
San Jose, CA, 95134 San Jose, CA, 95134
Email: dmm@cisco.com Email: dmm@cisco.com
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/