draft-ietf-idr-route-reflect-00.txt   draft-ietf-idr-route-reflect-01.txt 
INTERNET-DRAFT Tony Bates INTERNET-DRAFT Tony Bates
<draft-ietf-idr-route-reflect-00.txt> MCI <draft-ietf-idr-route-reflect-01.txt> MCI
Ravi Chandra Ravi Chandra
cisco Systems cisco Systems
March 1996 March 1996
BGP Route Reflection BGP Route Reflection
An alternative to full mesh IBGP An alternative to full mesh IBGP
<draft-ietf-idr-route-reflect-00.txt> <draft-ietf-idr-route-reflect-01.txt>
Status of this Memo Status of this Memo
This document is an Internet Draft. Internet Drafts are working This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts. working documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by months. Internet Drafts may be updated, replaced, or obsoleted by
skipping to change at page 2, line 14 skipping to change at page 2, line 14
1. Introduction 1. Introduction
Currently in the Internet today, BGP deployments are configured such Currently in the Internet today, BGP deployments are configured such
that all BGP speakers within a single AS must be fully meshed that all BGP speakers within a single AS must be fully meshed
and any external routing information must be re-distributed to all and any external routing information must be re-distributed to all
other routers within that AS. This "full mesh" requirement clearly other routers within that AS. This "full mesh" requirement clearly
does not scale when there are a large number of IBGP speakers as is does not scale when there are a large number of IBGP speakers as is
common in many of today's internet networks. common in many of today's internet networks.
For n BGP speakers within an AS you must maintain n*n-1/2 unique IBGP For n BGP speakers within an AS you must maintain n*(n-1)/2 unique
sessions. With finite resources in both bandwidth and router CPU this IBGP sessions. With finite resources in both bandwidth and router CPU
clearly does not scale. this clearly does not scale.
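As a rough illustration of the session count above, the following Python sketch evaluates the corrected formula n*(n-1)/2 from the -01 text. The function names are hypothetical, and the single-reflector comparison assumes one RR with every other speaker as its Client (the arrangement of Figure 2), which is only one possible deployment.

```python
def full_mesh_sessions(n: int) -> int:
    """Unique IBGP sessions needed to fully mesh n speakers: n*(n-1)/2."""
    if n < 0:
        raise ValueError("number of speakers must be non-negative")
    return n * (n - 1) // 2


def single_rr_sessions(n: int) -> int:
    """Sessions when one speaker acts as RR and the other n-1 are its Clients.

    Assumption for illustration only: a single cluster with a single RR.
    """
    if n < 1:
        raise ValueError("at least one speaker (the RR) is required")
    return n - 1


if __name__ == "__main__":
    for n in (3, 10, 100):
        print(n, full_mesh_sessions(n), single_rr_sessions(n))
```

For the three routers of Figure 1 this gives three sessions in the full mesh, against two once one router reflects.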
This scaling problem has been well documented and a number of This scaling problem has been well documented and a number of
proposals have been made to alleviate this [2,3]. This document proposals have been made to alleviate this [2,3]. This document
represents another alternative in alleviating the need for a "full represents another alternative in alleviating the need for a "full
mesh" and is known as "Route Reflection". It represents a change in mesh" and is known as "Route Reflection". It represents a change in
the commonly understood concept of IBGP and the addition of two new the commonly understood concept of IBGP and the addition of two new
optional transitive BGP attributes. optional transitive BGP attributes.
2. Design Criteria 2. Design Criteria
skipping to change at page 2, line 43 skipping to change at page 2, line 43
o Easy Migration o Easy Migration
It must be possible to migrate from a full mesh It must be possible to migrate from a full mesh
configuration without the need to change either topology configuration without the need to change either topology
or AS. This is an unfortunate management overhead of the or AS. This is an unfortunate management overhead of the
technique proposed in [3]. technique proposed in [3].
o Compatibility o Compatibility
It must be possible for non compliment IBGP peers It must be possible for non compliant IBGP peers
to continue to be part of the original AS or domain to continue to be part of the original AS or domain
without any loss of BGP routing information. without any loss of BGP routing information.
These criteria were motivated by operational experiences of a very These criteria were motivated by operational experiences of a very
large and topology rich network with many external connections. large and topology rich network with many external connections.
3. Route Reflection 3. Route Reflection
The basic idea of Route Reflection is very simple. Let us consider The basic idea of Route Reflection is very simple. Let us consider
the simple example depicted in Figure 1 below. the simple example depicted in Figure 1 below.
skipping to change at page 3, line 26 skipping to change at page 3, line 26
IBGP \ ASX / IBGP IBGP \ ASX / IBGP
\ / \ /
+-------+ +-------+
| | | |
| RTR-C | | RTR-C |
| | | |
+-------+ +-------+
Figure 1: Full Mesh IBGP Figure 1: Full Mesh IBGP
In ASX there are three IBGP speakers (routers RTR-A, RTR-B and RTR-C) In ASX there are three IBGP speakers (routers RTR-A, RTR-B and RTR-
and each B, C). With the existing BGP model, if RTR-A receives an C). With the existing BGP model, if RTR-A receives an external route
external route, it must advertise it to both RTR-B and RTR-C. RTR-B and it is selected as the best path it must advertise the external
and RTR-C (as IBGP speakers) will not re-advertise these IBGP learned route to both RTR-B and RTR-C. RTR-B and RTR-C (as IBGP speakers)
routes to other IBGP speakers. will not re-advertise these IBGP learned routes to other IBGP
speakers.
If this rule is broken and RTR-C is allowed to reflect IBGP learned If this rule is relaxed and RTR-C is allowed to reflect IBGP learned
routes, then it could re-distribute (or reflect) the IBGP routes routes, then it could re-advertise (or reflect) the IBGP routes
learned from RTR-A to RTR-B and vice versa. This would eliminate the learned from RTR-A to RTR-B and vice versa. This would eliminate the
need for the IBGP session between RTR-A and RTR-C as shown in Figure need for the IBGP session between RTR-A and RTR-C as shown in Figure
2 below. 2 below.
+-------+ +-------+ +-------+ +-------+
| | | | | | | |
| RTR-A | | RTR-B | | RTR-A | | RTR-B |
| | | | | | | |
+-------+ +-------+ +-------+ +-------+
\ / \ /
skipping to change at page 3, line 51 skipping to change at page 4, line 4
| | | | | | | |
+-------+ +-------+ +-------+ +-------+
\ / \ /
IBGP \ ASX / IBGP IBGP \ ASX / IBGP
\ / \ /
+-------+ +-------+
| | | |
| RTR-C | | RTR-C |
| | | |
+-------+ +-------+
Figure 2: Route Reflection IBGP Figure 2: Route Reflection IBGP
The Route Reflection scheme is based upon this principle.
The Route Reflection scheme is based upon this basic principle.
4. Terminology and Concepts 4. Terminology and Concepts
We use the term "Route Reflector" (RR) to represent an IBGP speaker We use the term "Route Reflector" (RR) to represent an IBGP speaker
that that participates in the reflection. The internal peers of a RR are
participates in the reflection. The internal peers of a RR are
divided into two groups: divided into two groups:
1) Client Peers 1) Client Peers
2) Non-Client Peers 2) Non-Client Peers
A RR reflects routes between these groups. A RR along with its A RR reflects routes between these groups. A RR along with its
client peers form a Cluster. The Non-Client peer must be fully meshed client peers form a Cluster. The Non-Client peer must be fully meshed
but the Client peers need not be fully meshed. The Client peers but the Client peers need not be fully meshed. The Client peers
should not peer with internal speakers outside of their cluster. should not peer with internal speakers outside of their cluster.
skipping to change at page 5, line 7 skipping to change at page 5, line 7
+-------+ +-------+ +-------+ +-------+
| RTR-D | IBGP | RTR-E | | RTR-D | IBGP | RTR-E |
| Non- |---------| Non- | | Non- |---------| Non- |
|Client | |Client | |Client | |Client |
+-------+ +-------+ +-------+ +-------+
Figure 3: RR Components Figure 3: RR Components
5. Operation 5. Operation
When a route is received by a RR, it must do the following depending When a route is received by a RR, it selects the best path based on
on the type of the peer it is receiving a route from: its path selection rule. After the best path is selected, it must do
the following depending on the type of the peer it is receiving the
best path from:
1) A Route from a Non-Client peer 1) A Route from a Non-Client peer
Reflect to all other Clients. Reflect to all other Clients.
2) A Route from a Client peer 2) A Route from a Client peer
Reflect to all the Non-Client peers and also to the Reflect to all the Non-Client peers and also to the
Client peers (Hence the Client peers are not required Client peers (Hence the Client peers are not required
to be fully meshed). to be fully meshed).
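The two rules shown above can be sketched in Python as follows. This is a minimal illustration, not the draft's specification: the names PeerType and reflection_targets are hypothetical, only the Client and Non-Client cases visible in this excerpt are modelled, and the sketch assumes the best path is never reflected back to the peer it was received from.

```python
from enum import Enum


class PeerType(Enum):
    CLIENT = "client"
    NON_CLIENT = "non-client"


def reflection_targets(source: str, peers: dict) -> list:
    """Return the internal peers an RR would reflect a best path to.

    peers maps a peer name to its PeerType; source is the peer the
    best path was received from.
    """
    source_type = peers[source]
    targets = []
    for name, peer_type in peers.items():
        if name == source:
            # Assumption: do not send the route back to its sender.
            continue
        if source_type is PeerType.NON_CLIENT:
            # Rule 1: a route from a Non-Client goes to Clients only.
            if peer_type is PeerType.CLIENT:
                targets.append(name)
        else:
            # Rule 2: a route from a Client goes to Non-Clients and Clients.
            targets.append(name)
    return targets


if __name__ == "__main__":
    # Hypothetical peer names; only RTR-D and RTR-E appear in Figure 3.
    peers = {
        "client-1": PeerType.CLIENT,
        "client-2": PeerType.CLIENT,
        "RTR-D": PeerType.NON_CLIENT,
        "RTR-E": PeerType.NON_CLIENT,
    }
    print(reflection_targets("RTR-D", peers))
    print(reflection_targets("client-1", peers))
```

Because rule 1 never forwards between Non-Clients, the Non-Client peers still need the full mesh described in section 4.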
skipping to change at page 6, line 11 skipping to change at page 6, line 13
Usually a cluster of clients will have a single RR. In that case, the Usually a cluster of clients will have a single RR. In that case, the
cluster will be identified by the ROUTER_ID of the RR. However, this cluster will be identified by the ROUTER_ID of the RR. However, this
represents a single point of failure so to make it possible to have represents a single point of failure so to make it possible to have
multiple RRs in the same cluster, all RRs in the same cluster must be multiple RRs in the same cluster, all RRs in the same cluster must be
configured with a 4-byte CLUSTER_ID so that an RR can discern routes configured with a 4-byte CLUSTER_ID so that an RR can discern routes
from other RRs in the same cluster. from other RRs in the same cluster.
7. Avoiding Routing Information Loops 7. Avoiding Routing Information Loops
As IBGP learned routes are reflected, it is possible through mis- As IBGP learned routes are reflected, it is possible through mis-
configuration to form route redistribution loops. The Route configuration to form route re-distribution loops. The Route
Reflection method defines the following attributes to detect and Reflection method defines the following attributes to detect and
avoid routing information loops. avoid routing information loops.
ORIGINATOR_ID ORIGINATOR_ID
ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type ORIGINATOR_ID is a new optional, non-transitive BGP attribute of Type
code 9. This attribute is 4 bytes long and it will be created by a code 9. This attribute is 4 bytes long and it will be created by a
RR. This attribute will carry the ROUTER_ID of the originator of the RR. This attribute will carry the ROUTER_ID of the originator of the
route in the local AS. A BGP speaker should not create an route in the local AS. A BGP speaker should not create an
ORIGINATOR_ID attribute if one already exists. If routing information ORIGINATOR_ID attribute if one already exists. If routing information
comes back to the originator, it must be ignored. comes back to the originator, it must be ignored.
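The ORIGINATOR_ID handling described above might be sketched as below. The dictionary representation of path attributes and both function names are assumptions made for illustration; the draft does not prescribe any data structure.

```python
def set_originator_id(attrs: dict, originator_router_id: str) -> dict:
    """Return a copy of attrs with ORIGINATOR_ID set, unless already present.

    An existing ORIGINATOR_ID is left untouched, matching the rule that a
    speaker should not create the attribute if one already exists.
    """
    updated = dict(attrs)
    updated.setdefault("ORIGINATOR_ID", originator_router_id)
    return updated


def accept_route(attrs: dict, local_router_id: str) -> bool:
    """Return False when the route has come back to its originator."""
    return attrs.get("ORIGINATOR_ID") != local_router_id


if __name__ == "__main__":
    route = set_originator_id({"NEXT_HOP": "192.0.2.1"}, "10.0.0.1")
    print(route["ORIGINATOR_ID"])
    print(accept_route(route, "10.0.0.1"))  # the originator ignores it
    print(accept_route(route, "10.0.0.2"))  # any other speaker may use it
```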
CLUSTER_LIST CLUSTER_LIST
Cluster-list is a new optional, non-transitive BGP attribute of Type Cluster-list is a new optional, non-transitive BGP attribute of Type
code 10. It is a sequence of CLUSTER_ID values representing the code 10. It is a sequence of CLUSTER_ID values representing the
reflection path that the route has passed. reflection path that the route has passed. It is encoded as follows:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Attr. Flags |Attr. Type Code| Length | value ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where Length is the number of octets.
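As an illustration of the layout in the diagram above, the following Python sketch packs and unpacks a CLUSTER_LIST attribute. It assumes the one-octet Length form shown in the diagram and sets only the Optional bit (0x80) in Attr. Flags, which follows from the attribute being optional and non-transitive under the flag definitions of [1]. Writing each 4-byte CLUSTER_ID in dotted-quad notation and the function names are conveniences of this sketch, not part of the draft.

```python
import ipaddress
import struct

ATTR_FLAG_OPTIONAL = 0x80   # high-order bit of Attr. Flags; Transitive bit left clear
CLUSTER_LIST_TYPE = 10      # Type code given in the text above
CLUSTER_ID_LEN = 4          # each CLUSTER_ID is 4 bytes


def encode_cluster_list(cluster_ids: list) -> bytes:
    """Encode CLUSTER_IDs (dotted-quad strings) as flags, type, length, value."""
    value = b"".join(ipaddress.IPv4Address(cid).packed for cid in cluster_ids)
    if len(value) > 255:
        raise ValueError("value too long for a one-octet Length field")
    header = struct.pack("!BBB", ATTR_FLAG_OPTIONAL, CLUSTER_LIST_TYPE, len(value))
    return header + value


def decode_cluster_list(data: bytes) -> list:
    """Decode an attribute produced by encode_cluster_list."""
    if len(data) < 3:
        raise ValueError("attribute header is truncated")
    _flags, type_code, length = struct.unpack("!BBB", data[:3])
    if type_code != CLUSTER_LIST_TYPE:
        raise ValueError("not a CLUSTER_LIST attribute")
    if length % CLUSTER_ID_LEN or len(data) != 3 + length:
        raise ValueError("Length does not match a whole number of CLUSTER_IDs")
    value = data[3:]
    return [
        str(ipaddress.IPv4Address(value[i:i + CLUSTER_ID_LEN]))
        for i in range(0, length, CLUSTER_ID_LEN)
    ]


if __name__ == "__main__":
    wire = encode_cluster_list(["10.0.0.1", "10.0.0.2"])
    print(wire.hex())
    print(decode_cluster_list(wire))
```

Here Length counts the octets of the value, so two CLUSTER_IDs give a Length of 8.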
When a RR reflects a route from its Clients to a Non-Client peer, it When a RR reflects a route from its Clients to a Non-Client peer, it
must append the local CLUSTER_ID to the CLUSTER_LIST. If the must append the local CLUSTER_ID to the CLUSTER_LIST. If the
CLUSTER_LIST is empty, it must create a new one. Using this attribute CLUSTER_LIST is empty, it must create a new one. Using this attribute
an RR can identify if the routing information is looped back to the an RR can identify if the routing information is looped back to the
same cluster due to mis-configuration. If the local CLUSTER_ID is same cluster due to mis-configuration. If the local CLUSTER_ID is
found in the cluster-list, the advertisement will be ignored. found in the cluster-list, the advertisement will be ignored.
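The CLUSTER_LIST processing just described can be sketched as two small Python functions. The list-of-strings representation and the function names are assumptions for illustration; the append order follows the wording of this draft.

```python
from typing import List, Optional


def accept_advertisement(cluster_list: Optional[List[str]], local_cluster_id: str) -> bool:
    """Return False if the local CLUSTER_ID already appears in the list.

    A match means the routing information has looped back to this cluster,
    so the advertisement is ignored.
    """
    return local_cluster_id not in (cluster_list or [])


def reflect_to_non_client(cluster_list: Optional[List[str]], local_cluster_id: str) -> List[str]:
    """Return the CLUSTER_LIST to send when reflecting a Client route.

    An empty or missing list results in a new one holding only the local
    CLUSTER_ID; otherwise the local CLUSTER_ID is appended.
    """
    return list(cluster_list or []) + [local_cluster_id]


if __name__ == "__main__":
    outgoing = reflect_to_non_client(None, "10.0.0.1")
    print(outgoing)
    outgoing = reflect_to_non_client(outgoing, "10.0.0.2")
    print(outgoing)
    print(accept_advertisement(outgoing, "10.0.0.1"))  # looped back: ignored
```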
8. Implementation and Configuration Considerations 8. Implementation and Configuration Considerations
skipping to change at page 7, line 20 skipping to change at page 7, line 20
Non-Clients. This could result in looping of routes. Non-Clients. This could result in looping of routes.
In some implementations, modification of the BGP path attribute, In some implementations, modification of the BGP path attribute,
NEXT_HOP is possible. For example, there could be a need for a RR to NEXT_HOP is possible. For example, there could be a need for a RR to
modify NEXT_HOP for EBGP learned routes sent to its internal peers. modify NEXT_HOP for EBGP learned routes sent to its internal peers.
However, this must not be possible for an RR to set on reflected IBGP However, this must not be possible for an RR to set on reflected IBGP
routes as this breaks the basic principle of Route Reflection and routes as this breaks the basic principle of Route Reflection and
will result in potential black holes. will result in potential black holes.
An RR should not modify any AS-PATH attributes (i.e. LOCAL_PREF, MED, An RR should not modify any AS-PATH attributes (i.e. LOCAL_PREF, MED,
DPA) that could change consistent route selection. THis could DPA) that could change consistent route selection. This could
result in potential loops. result in potential loops.
The BGP protocol provides no way for a Client to identify itself The BGP protocol provides no way for a Client to identify itself
dynamically as a Client to an RR configured BGP speaker and the dynamically as a Client to an RR configured BGP speaker and the
simplest way to achieve this is by manual configuration. simplest way to achieve this is by manual configuration.
9. Security 9. Security
Security considerations are not discussed in this memo. Security considerations are not discussed in this memo.
10. Acknowledgments 10. Acknowledgments
The authors would like to thank Dennis Ferguson, Enke Chen, Paul The authors would like to thank Dennis Ferguson, Enke Chen, John
Traina and Tony Li for the many discussions resulting in this work. Scudder, Paul Traina and Tony Li for the many discussions resulting
This idea was developed from an earlier discussion between Tony Li in this work. This idea was developed from an earlier discussion
and Dimitri Haskin. between Tony Li and Dimitri Haskin.
11. References 11. References
[1] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)", [1] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)",
RFC1771, March 1995. RFC1771, March 1995.
[2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh [2] Haskin, D., "A BGP/IDRP Route Server alternative to a full mesh
routing", RFC1863, October 1995. routing", RFC1863, October 1995.
[3] Traina, P. "Limited Autonomous System Confederations for BGP", [3] Traina, P. "Limited Autonomous System Confederations for BGP",
 End of changes. 
