draft-ietf-idr-bgp-optimal-route-reflection-03.txt   draft-ietf-idr-bgp-optimal-route-reflection-04.txt 
IDR Working Group R. Raszuk IDR Working Group R. Raszuk
Internet-Draft NTT MCL Internet-Draft NTT MCL
Intended status: Standards Track C. Cassar Intended status: Standards Track C. Cassar
Expires: May 10, 2013 Cisco Systems Expires: June 7, 2013 Cisco Systems
E. Aman E. Aman
TeliaSonera TeliaSonera
B. Decraene B. Decraene
France Telecom France Telecom
November 6, 2012 S. Litkowski
Orange
December 4, 2012
BGP Optimal Route Reflection (BGP-ORR) BGP Optimal Route Reflection (BGP-ORR)
draft-ietf-idr-bgp-optimal-route-reflection-03 draft-ietf-idr-bgp-optimal-route-reflection-04
Abstract Abstract
[RFC4456] asserts that, because the Interior Gateway Protocol (IGP) [RFC4456] asserts that, because the Interior Gateway Protocol (IGP)
cost to a given point in the network will vary across routers, "the cost to a given point in the network will vary across routers, "the
route reflection approach may not yield the same route selection route reflection approach may not yield the same route selection
result as that of the full IBGP mesh approach." One practical result as that of the full IBGP mesh approach." One practical
implication of this assertion is that the deployment of route implication of this assertion is that the deployment of route
reflection may thwart the ability to achieve hot potato routing. Hot reflection may thwart the ability to achieve hot potato routing. Hot
potato routing attempts to direct traffic to the closest AS egress potato routing attempts to direct traffic to the closest AS egress
skipping to change at page 2, line 17 skipping to change at page 2, line 20
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 10, 2013. This Internet-Draft will expire on June 7, 2013.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 10 skipping to change at page 3, line 10
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Proposed solutions . . . . . . . . . . . . . . . . . . . . . . 5 2. Proposed solutions . . . . . . . . . . . . . . . . . . . . . . 5
3. Best path selection for BGP hot potato routing from 3. Best path selection for BGP hot potato routing from
customized IGP network position . . . . . . . . . . . . . . . 6 customized IGP network position . . . . . . . . . . . . . . . 7
3.1. Client's perspective best path selection algorithm . . . . 8 3.1. Client's perspective best path selection algorithm . . . . 8
3.1.1. Flat IGP network . . . . . . . . . . . . . . . . . . . 8 3.1.1. Flat IGP network . . . . . . . . . . . . . . . . . . . 8
3.1.2. Hierarchical IGP network . . . . . . . . . . . . . . . 8 3.1.2. Hierarchical IGP network . . . . . . . . . . . . . . . 9
3.2. Aside: Configuration-based flexible route reflector 3.2. Aside: Configuration-based flexible route reflector
placement . . . . . . . . . . . . . . . . . . . . . . . . 9 placement . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3. Route reflector client grouping . . . . . . . . . . . . . 10 3.3. Route reflector client grouping . . . . . . . . . . . . . 10
3.3.1. Route Reflector Client Group ID . . . . . . . . . . . 10 3.3.1. Route Reflector Client Group ID . . . . . . . . . . . 11
3.4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . 12 3.4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5. Advantages . . . . . . . . . . . . . . . . . . . . . . . . 12 3.5. Advantages . . . . . . . . . . . . . . . . . . . . . . . . 13
4. Angular distance approximation for BGP warm potato routing . 13 4. Angular distance approximation for BGP warm potato routing . 13
4.1. Problem statement . . . . . . . . . . . . . . . . . . . . 13 4.1. Problem statement . . . . . . . . . . . . . . . . . . . . 14
4.2. Proposed solution . . . . . . . . . . . . . . . . . . . . 14 4.2. Proposed solution . . . . . . . . . . . . . . . . . . . . 15
4.3. Centralized vs distributed route reflectors . . . . . . . 16 4.3. Centralized vs distributed route reflectors . . . . . . . 16
5. Deployment considerations . . . . . . . . . . . . . . . . . . 16 5. Client's perspective policy based best path selection . . . . 17
6. Security considerations . . . . . . . . . . . . . . . . . . . 17 5.1. Proposal . . . . . . . . . . . . . . . . . . . . . . . . . 18
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 5.2. Example . . . . . . . . . . . . . . . . . . . . . . . . . 18
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 17 5.3. Avoiding routing loops . . . . . . . . . . . . . . . . . . 19
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 6. Deployment considerations . . . . . . . . . . . . . . . . . . 20
9.1. Normative References . . . . . . . . . . . . . . . . . . . 18 7. Security considerations . . . . . . . . . . . . . . . . . . . 21
9.2. Informative References . . . . . . . . . . . . . . . . . . 18 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
10.1. Normative References . . . . . . . . . . . . . . . . . . . 21
10.2. Informative References . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
There are three types of BGP deployments within Autonomous Systems There are three types of BGP deployments within Autonomous Systems
today: full mesh, confederations and route reflection. today: full mesh, confederations and route reflection.
BGP route reflection is the most popular way to distribute BGP routes BGP route reflection is the most popular way to distribute BGP routes
between BGP speakers belonging to the same administrative domain. between BGP speakers belonging to the same administrative domain.
Traditionally route reflectors have been deployed in the forwarding Traditionally route reflectors have been deployed in the forwarding
path and carefully placed on the POP to core boundaries. That model path and carefully placed on the POP to core boundaries. That model
skipping to change at page 5, line 31 skipping to change at page 5, line 31
number of EBGP peers over which full Internet routing information is number of EBGP peers over which full Internet routing information is
received would correlate directly to the number of paths present in received would correlate directly to the number of paths present in
each ASBR. This could easily result in tens of paths for each each ASBR. This could easily result in tens of paths for each
prefix. prefix.
Notwithstanding this drawback, there are a number of reasons for Notwithstanding this drawback, there are a number of reasons for
sending more than just the single best path to the clients. Improved sending more than just the single best path to the clients. Improved
path diversity at the edge is a requirement for fast connectivity path diversity at the edge is a requirement for fast connectivity
restoration, and a requirement for effective BGP level load restoration, and a requirement for effective BGP level load
balancing. Protocol extensions like add-paths balancing. Protocol extensions like add-paths
[I-D.ietf-idr-add-paths] or diverse-path [I-D.ietf-idr-add-paths] or [RFC6774] diverse-path allow for such
[I-D.ietf-grow-diverse-bgp-path-dist] allow for such improved path improved path diversity and can be used to address the same problems
diversity and can be used to address the same problems addressed by addressed by the mechanisms proposed in this draft.
the mechanisms proposed in this draft. In practical terms, add/
diverse path deployments are expected to result in the distribution In practical terms, add/diverse path deployments are expected to
of 2, 3 or n (where n is a small number) 'good' paths rather than all result in the distribution of 2, 3 or n (where n is a small number)
domain external paths. While the route reflector chooses one set of 'good' paths rather than all domain external paths. While the route
n paths and distributes those same n paths to all its route reflector reflector chooses one set of n paths and distributes those same n
clients, those n paths may not be the right n paths for all clients. paths to all its route reflector clients, those n paths may not be
In the context of the problem described above, those n paths will not the right n paths for all clients. In the context of the problem
necessarily include the closest egress point out of the network for described above, those n paths will not necessarily include the
each route reflector client. The mechanisms proposed in this closest egress point out of the network for each route reflector
document are likely to be complementary to mechanisms aimed at client. The mechanisms proposed in this document are likely to be
improving path diversity. complementary to mechanisms aimed at improving path diversity.
2. Proposed solutions 2. Proposed solutions
This document proposes two simple solutions to the problem described This document proposes two simple solutions to the problem described
above. Both of these solutions make it possible for route reflector above. Both of these solutions make it possible for route reflector
clients to direct traffic to their closest exit point in hot potato clients to direct traffic to their closest exit point in hot potato
routing deployments, without requiring further state to be pushed out routing deployments, without requiring further state to be pushed out
to the edge. These solutions are primarily applicable in deployments to the edge. These solutions are primarily applicable in deployments
using centralized route reflectors, which are typically implemented using centralized route reflectors, which are typically implemented
in devices without a capable forwarding plane. in devices without a capable forwarding plane.
skipping to change at page 6, line 36 skipping to change at page 6, line 36
BGP next hop comparison step. Clearly the overall path selection BGP next hop comparison step. Clearly the overall path selection
preference may be chosen based on another policy step and provisions as preference may be chosen based on another policy step and provisions as
defined in this document would not apply. defined in this document would not apply.
In the respective solutions the choice is made either factoring in In the respective solutions the choice is made either factoring in
IGP costs or the configured angular distance to the next hop. The IGP costs or the configured angular distance to the next hop. The
route reflector makes different decisions for different clients only route reflector makes different decisions for different clients only
in the case where the tie breaker for path selection would have been in the case where the tie breaker for path selection would have been
the IGP distance to the BGP nexthop (as in hot potato routing). the IGP distance to the BGP nexthop (as in hot potato routing).
A signficant advantage of this approach is that the RR clients do not A significant advantage of this approach is that the RR clients do
need to run new software or hardware. not need to run new software or hardware.
Besides these solutions to manage hot potato routing, there are
deployment scenarios where service providers want to have more
control of traffic exiting the AS by assigning per client preference
to gateways.
This document proposes a policy based route reflection solution to
address those scenarios. This solution has the same requirements
(regarding path diversity) and the same advantages as the two IGP
metric based solutions.
3. Best path selection for BGP hot potato routing from customized IGP 3. Best path selection for BGP hot potato routing from customized IGP
network position network position
This section describes a method for calculating the order of This section describes a method for calculating the order of
preference of BGP paths from the point of view of each separate route preference of BGP paths from the point of view of each separate route
reflector client. More specifically, the route relflector will reflector client. More specifically, the route reflector will
compute the IGP metric to the BGP nexthop from the position of the compute the IGP metric to the BGP nexthop from the position of the
client to which the resulting path will be distributed, if the IGP client to which the resulting path will be distributed, if the IGP
metric is the tie breaker applied to a set of possible paths. In the metric is the tie breaker applied to a set of possible paths. In the
subsequent model authors will propose virtual reflector placement at subsequent model authors will propose virtual reflector placement at
operator's selected IGP location. operator's selected IGP location.
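The per-client computation described above can be illustrated with a minimal sketch. It assumes the route reflector holds the IGP topology as a weighted graph and that the IGP metric is the step that decides among the candidate paths. The topology, the node names (RR, C1, C2, ASBR1, ASBR2), the link costs, and the function names are hypothetical and are not taken from the draft.

```python
import heapq

# Hypothetical IGP topology: {node: {neighbor: link cost}}.
TOPOLOGY = {
    "RR": {"ASBR1": 1, "ASBR2": 5},
    "ASBR1": {"RR": 1, "C1": 2},
    "ASBR2": {"RR": 5, "C2": 2},
    "C1": {"ASBR1": 2},
    "C2": {"ASBR2": 2},
}


def spf(graph, root):
    """Return shortest-path costs from root to every reachable node (Dijkstra)."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def best_nexthop_for(graph, position, nexthops):
    """Pick the BGP nexthop with the lowest IGP cost as seen from position.

    Ties are broken by nexthop name; that rule is an assumption made only
    to keep the sketch deterministic.
    """
    dist = spf(graph, position)
    reachable = [nh for nh in nexthops if nh in dist]
    if not reachable:
        return None
    return min(reachable, key=lambda nh: (dist[nh], nh))


if __name__ == "__main__":
    for position in ("RR", "C1", "C2"):
        print(position, best_nexthop_for(TOPOLOGY, position, ["ASBR1", "ASBR2"]))
```

In this hypothetical topology the route reflector's own position favors ASBR1, while running the same computation rooted at C2 selects ASBR2, which is the closest exit for that client.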
In the case of a hierarchical IGP deployment where the client is in a In the case of a hierarchical IGP deployment where the client is in a
different level in the hierarchy to the route reflector, the route different level in the hierarchy to the route reflector, the route
reflector will compute IGP distance to the BGP nexthop from the Area reflector will compute IGP distance to the BGP nexthop from the Area
Border Routers (ABR) leading to the client in lieu of the route Border Routers (ABR) leading to the client in lieu of the route
skipping to change at page 8, line 49 skipping to change at page 9, line 12
is selected from a set for a given prefix while ignoring is selected from a set for a given prefix while ignoring
differences in distance within the tolerance figure, then that differences in distance within the tolerance figure, then that
same path must always be preferred for all clients where the paths same path must always be preferred for all clients where the paths
are within the tolerance figure are within the tolerance figure
3.1.2. Hierarchical IGP network 3.1.2. Hierarchical IGP network
Hierarchy introduces two challenges: Hierarchy introduces two challenges:
The first challenge is that the RR IGP view may differ from a The first challenge is that the RR IGP view may differ from a
client IGP view by virtue of one or the other having a summarised client IGP view by virtue of one or the other having a summarized
view versus the other. Summarisation, by its nature, loses view versus the other. Summarization, by its nature, loses
information. Consider the example where a client within a PoP information. Consider the example where a client within a PoP
sees two prefixes with two metrics for two egress points within sees two prefixes with two metrics for two egress points within
the PoP, but where the RR only sees a single summary covering the PoP, but where the RR only sees a single summary covering
reachability to both nexthops as injected by the ABR. For reachability to both nexthops as injected by the ABR. For
clarification purposes in the case of ISIS by ABR we refer to clarification purposes in the case of ISIS by ABR we refer to
L1/L2 node. However it needs to be observed that inter area L1/L2 node. However it needs to be observed that inter area
networks running LDP are required to disable summarisation of all networks running LDP are required to disable summarisation of all
FEC advertised in LDP (typically all loopbacks) unless [RFC5283] FEC advertised in LDP (typically all loopbacks) unless [RFC5283]
is deployed. Such deployments are not likely to suffer is deployed. Such deployments are not likely to suffer
summarisation difficulties. summarization difficulties.
The second challenge is that in cases where the client is in a The second challenge is that in cases where the client is in a
different level of hierarchy from the RR, the RR can not build a different level of hierarchy from the RR, the RR can not build a
Shortest Path First (SPF) tree with the client node as root, Shortest Path First (SPF) tree with the client node as root,
simply because the topology derived by the IGP will not include simply because the topology derived by the IGP will not include
the client node. It will instead only include reachability to the the client node. It will instead only include reachability to the
client from one or more ABRs. In order to overcome this problem, client from one or more ABRs. In order to overcome this problem,
the RR could compute an SPF tree from the ABRs in the area. The the RR could compute an SPF tree from the ABRs in the area. The
RR would then determine the shortest distance from a client which RR would then determine the shortest distance from a client which
lives behind the ABRs, to a nexthop, by adding the advertised lives behind the ABRs, to a nexthop, by adding the advertised
skipping to change at page 10, line 22 skipping to change at page 10, line 34
effected on a per-AF or AF plus update/peer group granularity. It effected on a per-AF or AF plus update/peer group granularity. It
should be noted that this approach provides for splitting one should be noted that this approach provides for splitting one
centralized route reflector such that it is virtually positioned at centralized route reflector such that it is virtually positioned at
various network locations, with the network location depending upon various network locations, with the network location depending upon
the address family or address family plus update/peer group. the address family or address family plus update/peer group.
Virtual slicing of a centralized route reflector relaxes the need to Virtual slicing of a centralized route reflector relaxes the need to
propagate all BGP paths between RRs in an alternative conventional propagate all BGP paths between RRs in an alternative conventional
distributed RR deployment. It is expected that such RRs would be distributed RR deployment. It is expected that such RRs would be
deployed in redundant sets, and that those RRs would not need to be deployed in redundant sets, and that those RRs would not need to be
physically colocated, while still benefiting from the possibility of physically collocated, while still benefiting from the possibility of
being logically colocated, and therefore not compromising any of the being logically collocated, and therefore not compromising any of the
best path selection symmetry. best path selection symmetry.
3.3. Route reflector client grouping 3.3. Route reflector client grouping
It may be appropriate to allow the operator, or the route reflector It may be appropriate to allow the operator, or the route reflector
itself, to group clients together using IGP distance between clients itself, to group clients together using IGP distance between clients
to determine grouping. All the operation discussed above which to determine grouping. All the operation discussed above which
relied upon computing best path for each client, and measuring relied upon computing best path for each client, and measuring
distances from each client to different nexthops, would instead be distances from each client to different nexthops, would instead be
performed for each group of clients. Configurable thresholds can be performed for each group of clients. Configurable thresholds can be
used to determine which IGP metric changes should be visible to BGP, used to determine which IGP metric changes should be visible to BGP,
and trigger best paths recomputation. The latter would be beneficial and trigger best paths recomputation. The latter would be beneficial
in existing BGP RR code too. in existing BGP RR code too.
Alternatively route reflector client grouping could be accomplished Alternatively route reflector client grouping could be accomplished
statically by the operator by coloring clients belonging to a common statically by the operator by coloring clients belonging to a common
group (for example being part of the same POP). In order to group (for example being part of the same POP). In order to
accomplish such marking it is proposed that BGP OPEN message be accomplish such marking it is proposed that BGP OPEN message be
augmented with an optional paramiter indicating the Group ID given augmented with an optional parameter indicating the Group ID given
peer belongs to. peer belongs to.
3.3.1. Route Reflector Client Group ID 3.3.1. Route Reflector Client Group ID
This is an Optional Parameter in BGP OPEN message that is used by a This is an Optional Parameter in BGP OPEN message that is used by a
BGP speaker to convey to its route reflectors the Group ID value. BGP speaker to convey to its route reflectors the Group ID value.
Such value will allow automatic and predictable peer grouping on the Such value will allow automatic and predictable peer grouping on the
route reflectors as deemed necessary from operator's network route reflectors as deemed necessary from operator's network
architecture. architecture.
skipping to change at page 12, line 47 skipping to change at page 13, line 18
The solution described provides a model for integrating the client The solution described provides a model for integrating the client
perspective into the best path computation for RRs. More perspective into the best path computation for RRs. More
specifically, the choice of BGP path factors in the IGP metric specifically, the choice of BGP path factors in the IGP metric
between the client and the nexthop, rather than the distance from the between the client and the nexthop, rather than the distance from the
RR to the nexthop. The documented method does not require any BGP or RR to the nexthop. The documented method does not require any BGP or
IGP protocol changes as required changes are contained within the RR IGP protocol changes as required changes are contained within the RR
implementation. implementation.
This solution can be deployed in traditional hop-by-hop forwarding This solution can be deployed in traditional hop-by-hop forwarding
networks as well as in end-to-end tunneled environments. In the networks as well as in end-to-end tunneled environments. In the
networks where there are multiple route reflectors and unencapsulated networks where there are multiple route reflectors and hop-by-hop
hop-by-hop forwarding, such optimizations should be enabled on all forwarding without encapsulation, such optimizations should be
route reflectors. Otherwise clients may receive an inconsistent view enabled on all route reflectors. Otherwise clients may receive an
of the network and in turn lead to intra-domain forwarding loops. inconsistent view of the network and in turn lead to intra-domain
forwarding loops.
With this approach, an ISP can effect a hot potato routing policy With this approach, an ISP can effect a hot potato routing policy
even if route reflection has been moved from the forwarding plane to even if route reflection has been moved from the forwarding plane to
the core and hop-by-hop switching has been replaced by end to end the core and hop-by-hop switching has been replaced by end to end
MPLS or IP encapsulation. MPLS or IP encapsulation.
As per above, the approach reduces the amount of state which needs to As per above, the approach reduces the amount of state which needs to
be pushed to the edge in order to perform hot potato routing. The be pushed to the edge in order to perform hot potato routing. The
memory and CPU resource required at the edge to provide hot potato memory and CPU resource required at the edge to provide hot potato
routing using this approach is lower than what would be required in routing using this approach is lower than what would be required in
skipping to change at page 13, line 29 skipping to change at page 13, line 48
plane route reflection without compromising an operator's closest plane route reflection without compromising an operator's closest
exit operational principle. Hot potato routing is important to most exit operational principle. Hot potato routing is important to most
ISPs. The inability to perform hot potato routing effectively stops ISPs. The inability to perform hot potato routing effectively stops
migrations to centralized route reflection and edge-to-edge LSP/IP migrations to centralized route reflection and edge-to-edge LSP/IP
encapsulation for traffic to IPv4 and IPv6 prefixes. encapsulation for traffic to IPv4 and IPv6 prefixes.
4. Angular distance approximation for BGP warm potato routing 4. Angular distance approximation for BGP warm potato routing
This section describes an alternative solution to the use of IGP This section describes an alternative solution to the use of IGP
topology information to virtually position the RR at the client topology information to virtually position the RR at the client
location in the network. This solution involves modelling the location in the network. This solution involves modeling the network
network topology as a set of elements (regions, PoPs or routers) topology as a set of elements (regions, PoPs or routers) arranged in
arranged in a circle. Route reflector clients and inter-domain exit a circle. Route reflector clients and inter-domain exit points would
points would then be statically assigned to those elements such that then be statically assigned to those elements such that one can
one can compute the angular distance between route-reflector clients compute the angular distance between route-reflector clients and the
and the various exit points in order to infer the distance between various exit points in order to infer the distance between any two
any two elements. This measure of distance can be used as an elements. This measure of distance can be used as an effective
effective alternative to the IGP distance as a tie breaker in the alternative to the IGP distance as a tie breaker in the path
path selection algorithm if necessary. selection algorithm if necessary.
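The angular distance tie breaker can be sketched as follows. The sketch assumes that positions are whole degrees on a circle (0 to 359) and that distance is measured along the shorter arc. The function names, the example positions, and the name-based tie break are hypothetical illustrations rather than part of the proposal.

```python
def angular_distance(a, b):
    """Return the shorter-arc distance in degrees between two positions."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def closest_exit(client_angle, exit_angles):
    """Return the exit point with the smallest angular distance to the client.

    exit_angles maps an exit point name to its configured position in
    degrees. Ties are broken by name, which is an assumption made only to
    keep the sketch deterministic.
    """
    return min(
        exit_angles,
        key=lambda name: (angular_distance(client_angle, exit_angles[name]), name),
    )


if __name__ == "__main__":
    exits = {"exitA": 10, "exitB": 300}  # hypothetical configured positions
    print(closest_exit(350, exits))
```

A client at 350 degrees is 20 degrees from an exit at 10 degrees and 50 degrees from an exit at 300 degrees, because the distance wraps around the 0 degree mark.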
4.1. Problem statement 4.1. Problem statement
This solution addresses the problem described in earlier sections, This solution addresses the problem described in earlier sections,
while attempting to minimize computational overhead. The aim of the while attempting to minimize computational overhead. The aim of the
proposed solution is to enable a route reflector to provide a route proposed solution is to enable a route reflector to provide a route
reflector client with an exit point for a prefix which is 'closest' reflector client with an exit point for a prefix which is 'closest'
to the client rather than the route-reflector, without having to to the client rather than the route-reflector, without having to
distribute all paths to that client, or having to derive each distribute all paths to that client, or having to derive each
client's view of the network topology. The measure of closest is client's view of the network topology. The measure of closest is
skipping to change at page 15, line 7 skipping to change at page 15, line 27
optimality of routing on the other. The finest granularity possible optimality of routing on the other. The finest granularity possible
will be the relative position of originating clients. will be the relative position of originating clients.
Note that this solution has nothing to do with actual IGP link Note that this solution has nothing to do with actual IGP link
metrics and resulting topology in the network. metrics and resulting topology in the network.
It can be shown that for each network topology, elements such as AS It can be shown that for each network topology, elements such as AS
exit points can be mapped on to a circle. By putting POPs, Regions exit points can be mapped on to a circle. By putting POPs, Regions
or individual clients onto the hypothetical circle we can identify an or individual clients onto the hypothetical circle we can identify an
angular location for each element relative to some fixed direction; angular location for each element relative to some fixed direction;
for example defining the angular north of the circle at 0 degress. for example defining the angular north of the circle at 0 degrees.
The angular position of elements in the network can be conveyed to a The angular position of elements in the network can be conveyed to a
route reflector in a number of ways: route reflector in a number of ways:
Assignment of angular position of each RR client through Assignment of angular position of each RR client through
configuration on the route reflector itself; per client configuration on the route reflector itself; per client
configuration on RR configuration on RR
Assignment of angular position of an RR client at each client, Assignment of angular position of an RR client at each client,
then propagating it to RRs. then propagating it to RRs.
skipping to change at page 16, line 34 skipping to change at page 17, line 11
used to encode and propagate angular position between 0 and 359 of used to encode and propagate angular position between 0 and 359 of
a client. This community is only relevant to the route reflectors a client. This community is only relevant to the route reflectors
of a given BGP domain and should be stripped either at the ASBR of a given BGP domain and should be stripped either at the ASBR
boundary or when propagating updates to BGP peers which are not boundary or when propagating updates to BGP peers which are not
route reflectors. route reflectors.
The angular position marking could also be added by clients and The angular position marking could also be added by clients and
advertised to the route reflector. This would require some advertised to the route reflector. This would require some
configuration effort. configuration effort.
5. Deployment considerations 5. Client's perspective policy based best path selection
There are some deployment scenarios where a service provider wants to
achieve stronger control over traffic exiting the AS (for capacity
planning) rather than using hot potato routing based on IGP metric.
    |      |      |      |
    |      |      |      |
   GW1    GW2    GW3    GW4

        RR1        RR2

     R1      R2      R3
In the figure above, all gateways have iBGP sessions to RR1 and RR2,
and R1, R2 and R3 also have iBGP sessions to RR1 and RR2. Gateway
routers are meshed to an external network (for example, a transit
service provider).
We would like to achieve strong control over the gateway used
(primary and backup) by each router (or each set of routers) in the
network (taking into account that routers do not support ADD PATHs).
For example, R1 using GW1 as primary and GW2 as backup; R2 using GW2
as primary and GW3 as backup; R3 using GW3 as primary and GW4 as
backup.
Today, a prefix P1 is received on each gateway from the external
network. Each gateway sends the prefix to both route reflectors.
Each route reflector receives four paths for P1 and chooses the best
one based on its own decision process. Note that RR1 and RR2 may
choose different paths as best. Each route reflector sends its best
path towards R1, R2 and R3. Each router therefore receives the same
paths from the route reflectors for P1 (at most two gateways are
visible from the Rx routers). So the default behavior does not fit
our requirements in terms of traffic flows.
Using currently available BGP mechanisms, we could meet our
requirements with two solutions:
o Modify the BGP meshing: for example, R1 meshed directly to GW1 and
GW2 with inbound policies applied on R1; R2 meshed directly to GW2
and GW3 with inbound policies applied on R2 ...
o Add more route reflectors (one RR per gateway used as primary),
apply inbound policies on the RRs so that each RR chooses a
different primary gateway, and apply policies on each router to
select its own primary gateway.
These solutions have significant drawbacks: the first one is not
flexible (re-meshing is needed whenever we want to change the gateway
of a router), and the second one requires a lot of CAPEX.
We would like to introduce a solution where a single currently
deployed route-reflector chassis may take a different best path
decision for different sets of clients based on preferences.
It should be noted that in simple scenarios (example: two RRs and two
gateways), [RFC6774] would be able to fulfill service provider needs.
The solution proposed here makes it possible to handle more complex
scenarios and a finer gateway choice per client or group of clients.
5.1. Proposal
Our proposal is to reuse the concept introduced in
[I-D.ietf-idr-ix-bgp-route-server] in an iBGP context. To perform
per-client best path selection, the router should maintain a
per-client BGP LOC-RIB (or Adj-RIB-Out) associated with inbound
policies implemented between the Adj-RIB-In and the client LOC-RIB.
It would not be very scalable to use a per-client policy (considering
hundreds of peers on a route reflector), therefore our proposal is to
group clients sharing common policies inside a client group to
minimize computation/memory overhead. Client grouping could be done
statically (by configuration) or dynamically using the solution
described in section 3.3.1 of this document. Client grouping would
be performed with a per AFI/SAFI granularity, as the gateway/client
mapping may change in each AFI/SAFI context. A route reflector
should be able to implement multiple client groups (with associated
inbound policies) as well as a default client group for clients that
do not require any specific policy decision: in this case, the
overall BGP best path computation would be used.
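The client grouping described above can be pictured with a small
sketch. It is illustrative only and is not part of the proposal: the
class name ClientGroups, the group name "default" and the AFI/SAFI
strings are assumptions made for this example. It shows membership
being keyed per AFI/SAFI, with unassigned clients falling back to the
default client group.

```python
from collections import defaultdict

# Hypothetical name for the default client group (not defined by the draft).
DEFAULT_GROUP = "default"


class ClientGroups:
    """Tracks client group membership per AFI/SAFI on a route reflector."""

    def __init__(self):
        # (afi_safi, client) -> client group name
        self._membership = {}

    def assign(self, afi_safi, client, group):
        """Statically place a client in a group for one AFI/SAFI."""
        self._membership[(afi_safi, client)] = group

    def group_of(self, afi_safi, client):
        """Return the client's group, or the default group if unassigned."""
        return self._membership.get((afi_safi, client), DEFAULT_GROUP)

    def partition(self, afi_safi, clients):
        """Split a list of clients into groups for one AFI/SAFI."""
        groups = defaultdict(list)
        for client in clients:
            groups[self.group_of(afi_safi, client)].append(client)
        return dict(groups)


groups = ClientGroups()
groups.assign("ipv4-unicast", "R1", "CG1")
groups.assign("ipv4-unicast", "R2", "CG1")
ipv4_partition = groups.partition("ipv4-unicast", ["R1", "R2", "R3"])
```

Because membership is keyed by AFI/SAFI, the same client can sit in
CG1 for one address family and in the default group for another,
which matches the per AFI/SAFI granularity described above.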
5.2. Example
   GW1   GW2   GW3
     \    |    /
      \   |   /
         RR1
      /   |   \
    R1    R2    R3
In the above figure GW1, GW2, GW3 and R3 are standard iBGP route
reflector clients. R1 and R2 want to use a special gateway
combination (primary GW3, backup GW2, last resort GW1). R1 and R2
are configured in a specific client group CG1 on the route reflector
while other peers are in the default client group. CG1 is associated
with a policy achieving the expected GW preference for R1 and R2, and
leaving other paths unchanged.
All routes received by RR1 (eBGP, iBGP, iBGP RR client, iBGP RR
client routing context) must be evaluated using the overall BGP best
path computation as well as within the client group; the client group
policy determines whether the route is accepted for evaluation by the
client group's local decision process.
o Paths from GW1, GW2, GW3 are compared within the default client
group, leading to one GW (for example GW1) being selected as best
and installed in the global LOC-RIB. The GW1 path will be
advertised to GW2, GW3 and R3 as they are in the default CG. In
CG1, the preference of GW paths has been modified, leading to GW3
being the best path and installed in the client group LOC-RIB. The
GW3 path will be advertised to R1 and R2, as R1 and R2 are part of
CG1.
o Paths from R3 are compared within default client group and
advertised to GW1, GW2, GW3. Those paths are also compared within
CG1 (as accepted by policy) and advertised to R1 and R2.
o Paths from R1 are compared within default client group and
advertised to GW1, GW2, GW3 and R3. Those paths are also compared
within CG1 (as accepted by policy) and advertised to R2.
o Paths from R2 are compared within default client group and
advertised to GW1, GW2, GW3 and R3. Those paths are also compared
within CG1 (as accepted by policy) and advertised to R1.
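The example above can be traced with a short sketch. It is a
simplification under stated assumptions, not the mechanism specified
here: the client group policy is modeled as a local preference
override (values 100, 200, 300 are invented for illustration), the
remaining BGP tie-breakers are replaced by a lowest-name comparison,
and the helper names best_path and recipients are hypothetical.

```python
def best_path(paths, policy=None):
    """Pick the preferred next hop.

    paths:  dict of next hop -> local preference as received.
    policy: optional dict of next hop -> local preference override,
            standing in for the client group inbound policy.
    """
    policy = policy or {}
    effective = {nh: policy.get(nh, pref) for nh, pref in paths.items()}
    # Highest preference wins; lowest name is a stand-in for the
    # remaining BGP tie-breaking steps.
    return min(effective, key=lambda nh: (-effective[nh], nh))


def recipients(origin, members):
    """A path is advertised to every group member except its originator."""
    return [m for m in members if m != origin]


# Paths for prefix P1 as received by RR1 (assumed equal preference).
paths = {"GW1": 100, "GW2": 100, "GW3": 100}

# Assumed CG1 policy: primary GW3, backup GW2, last resort GW1.
cg1_policy = {"GW3": 300, "GW2": 200, "GW1": 100}

default_members = ["GW1", "GW2", "GW3", "R3"]
cg1_members = ["R1", "R2"]

default_best = best_path(paths)           # global LOC-RIB
cg1_best = best_path(paths, cg1_policy)   # CG1 LOC-RIB

default_adverts = recipients(default_best, default_members)
cg1_adverts = recipients(cg1_best, cg1_members)

# If GW3 withdraws its path, CG1 falls back to its backup gateway.
remaining = {nh: pref for nh, pref in paths.items() if nh != "GW3"}
cg1_backup = best_path(remaining, cg1_policy)
```

With these assumed values the default group selects GW1 and
advertises it to GW2, GW3 and R3, while CG1 selects GW3 and
advertises it to R1 and R2, as in the first bullet above.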
5.3. Avoiding routing loops
Unlike the IGP-based approaches described in this document, policy
based route reflection should be limited to end-to-end encapsulation
environments to avoid intra-domain forwarding loops. Using
end-to-end encapsulation permits edge routers to transport the
traffic to the targeted/preferred ASBR without any loop in the core.
To avoid the ASBR potentially rerouting traffic back into the core
(and a possible loop between edge routers and the ASBR), we must
enforce forwarding at the ASBR to the eBGP peer. This could be done
by:
o implementing policies on the ASBR to prefer the eBGP path and
install it in the FIB.
o implementing tunneling of traffic up to the outside interface
(ASBR action to switch to the outside interface).
The exact choice of encapsulation and of techniques to prevent
transport loops (including potential loops at gateways) is left to
the operator, and its specification is outside the scope of this
document.
6. Deployment considerations
The solutions are primarily intended for end-to-end tunneled The solutions are primarily intended for end-to-end tunneled
environments, i.e. where traffic is label switched or IP tunneled environments, i.e. where traffic is label switched or IP tunneled
across the core. If unencapsulated hop-by-hop forwarding is used, across the core. If unencapsulated hop-by-hop forwarding is used,
either misconfigurations or conflicts between these optimizations and either misconfigurations or conflicts between these optimizations and
classical BGP path selection rules could lead to intra-domain classical BGP path selection rules could lead to intra-domain
forwarding loops. Under certain circumstances the solutions can also forwarding loops. Under certain circumstances the solutions can also
be deployable without end-to-end tunneling. In particular the best be deployable without end-to-end tunneling. In particular the best
path selection based on the client's IGP best-path selection is path selection based on the client's IGP best-path selection is
guaranteed not to cause any forwarding loops (other than micro loops guaranteed not to cause any forwarding loops (other than micro loops
associated with reconvergence) when deployed in a flat IGP area associated with reconvergence) when deployed in a flat IGP area
provided that no distance tolerance value is used so that the path provided that no distance tolerance value is used so that the path
choice is truly made on a per-client basis. choice is truly made on a per-client basis.
It should be self evident that this solution does not interfere with Regarding potential intra-domain forwarding loops at ASBR level, this
policies enforced above IGP tie breaking in the BGP best path could be solved by enforcing external route preference or by
algorithm. performing tunnel to external interface switching action on ASBRs.
Regarding client's IGP best-path selection, it should be self evident
that this solution does not interfere with policies enforced above
IGP tie breaking in the BGP best path algorithm.
The solution applies to NLRIs of all address families which can be The solution applies to NLRIs of all address families which can be
route reflected and which can be tie broken by IGP distance to the route reflected.
nexthop.
It should be noted that customized per-client or group of clients It should be noted that customized per-client or group of clients
best path selection is already in use today in the context of best path selection is already in use today in the context of
Internet Exchange Point (IXP) route servers. In an IXP route server Internet Exchange Point (IXP) route servers. In an IXP route server
the client best path is selected as a result of different policies the client best path is selected as a result of different policies
rather than IGP metric distance to BGP next hop. rather than IGP metric distance to BGP next hop.
A possible scalability impact of optimizing path selection to take A possible scalability impact of optimizing path selection to take
account of the RR client position is that different RR clients account of the RR client position or operator's policy based
receive different paths, and therefore update/peer group efficiency preference is that different RR clients receive different paths, and
diminishes. This cost is imposed by the requirement given the therefore update/peer group efficiency diminishes. This cost is
requirement is to optimize the egress path from the client's imposed by the requirement to optimize the egress path from the
perspective. It is also not unlikely that groups of clients will end client's perspective. It is also likely that groups of clients will
up receiving the same best path/s, in which case, inefficiency of end up receiving the same best path/s, in which case, inefficiency of
update generation will be minimized. It should be noted that in the update generation will be minimized. It should be noted that in the
cases described under flexible router placement where placement is cases described under flexible router placement where placement is
determined on a per update/peer group basis or per route reflector, determined on a per update/peer group basis or per route reflector,
the scale benefits of peer groupings are retained. the scale benefits of peer groupings are retained.
6. Security considerations 7. Security considerations
No new security issues are introduced to the BGP protocol by this No new security issues are introduced to the BGP protocol by this
specification. specification.
7. IANA Considerations 8. IANA Considerations
IANA is requested to allocate a type code for the Standard BGP IANA is requested to allocate a type code for the Standard BGP
Community to be used for inter cluster propagation of angular Community to be used for inter cluster propagation of angular
position of the clients. position of the clients.
IANA is requested to allocate a new type code from BGP OPEN Optional IANA is requested to allocate a new type code from BGP OPEN Optional
Parameter Types registry to be used for Group_ID propagation. Parameter Types registry to be used for Group_ID propagation.
8. Acknowledgments 9. Acknowledgments
Authors would like to thank Eric Rosen, Clarence Filsfils, Uli Authors would like to thank Eric Rosen, Clarence Filsfils, Uli
Bornhauser Russ White, Jakob Heitz and Mike Shand for their valuable Bornhauser Russ White, Jakob Heitz and Mike Shand for their valuable
input. input.
9. References 10. References
9.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006. Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
Communities Attribute", RFC 4360, February 2006. Communities Attribute", RFC 4360, February 2006.
[RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement [RFC5492] Scudder, J. and R. Chandra, "Capabilities Advertisement
with BGP-4", RFC 5492, February 2009. with BGP-4", RFC 5492, February 2009.
9.2. Informative References 10.2. Informative References
[I-D.ietf-grow-diverse-bgp-path-dist]
Raszuk, R., Fernando, R., Patel, K., McPherson, D., and K.
Kumaki, "Distribution of diverse BGP paths.",
draft-ietf-grow-diverse-bgp-path-dist-08 (work in
progress), July 2012.
[I-D.ietf-idr-add-paths] [I-D.ietf-idr-add-paths]
Walton, D., Chen, E., Retana, A., and J. Scudder, Walton, D., Chen, E., Retana, A., and J. Scudder,
"Advertisement of Multiple Paths in BGP", "Advertisement of Multiple Paths in BGP",
draft-ietf-idr-add-paths-07 (work in progress), June 2012. draft-ietf-idr-add-paths-07 (work in progress), June 2012.
[RFC1997] Chandrasekeran, R., Traina, P., and T. Li, "BGP [RFC1997] Chandrasekeran, R., Traina, P., and T. Li, "BGP
Communities Attribute", RFC 1997, August 1996. Communities Attribute", RFC 1997, August 1996.
[RFC1998] Chen, E. and T. Bates, "An Application of the BGP [RFC1998] Chen, E. and T. Bates, "An Application of the BGP
skipping to change at page 19, line 12 skipping to change at page 22, line 45
[RFC5283] Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension [RFC5283] Decraene, B., Le Roux, JL., and I. Minei, "LDP Extension
for Inter-Area Label Switched Paths (LSPs)", RFC 5283, for Inter-Area Label Switched Paths (LSPs)", RFC 5283,
July 2008. July 2008.
[RFC5668] Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS [RFC5668] Rekhter, Y., Sangli, S., and D. Tappan, "4-Octet AS
Specific BGP Extended Community", RFC 5668, October 2009. Specific BGP Extended Community", RFC 5668, October 2009.
[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", [RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework",
RFC 5714, January 2010. RFC 5714, January 2010.
[RFC6774] Raszuk, R., Fernando, R., Patel, K., McPherson, D., and K.
Kumaki, "Distribution of Diverse BGP Paths", RFC 6774,
November 2012.
Authors' Addresses Authors' Addresses
Robert Raszuk Robert Raszuk
NTT MCL NTT MCL
101 S Ellsworth Avenue Suite 350 101 S Ellsworth Avenue Suite 350
San Mateo, CA 94401 San Mateo, CA 94401
US US
Email: robert@raszuk.net Email: robert@raszuk.net
skipping to change at page 19, line 41 skipping to change at page 23, line 34
TeliaSonera TeliaSonera
Marbackagatan 11 Marbackagatan 11
Farsta, SE-123 86 Farsta, SE-123 86
Sweden Sweden
Email: erik.aman@teliasonera.com Email: erik.aman@teliasonera.com
Bruno Decraene Bruno Decraene
France Telecom France Telecom
38-40 rue du General Leclerc 38-40 rue du General Leclerc
Issi Moulineaux cedex 9, 92794 Issy les Moulineaux cedex 9, 92794
France France
Email: bruno.decraene@orange-ftgroup.com Email: bruno.decraene@orange.com
Stephane Litkowski
Orange
9 rue du chene germain
Cesson Sevigne, 35512
France
Email: stephane.litkowski@orange.com
 End of changes. 34 change blocks. 
81 lines changed or deleted 246 lines changed or added
