draft-ietf-idr-bgp4-16.txt   draft-ietf-idr-bgp4-17.txt 
Network Working Group Y. Rekhter Network Working Group Y. Rekhter
INTERNET DRAFT Juniper Networks INTERNET DRAFT Juniper Networks
T. Li T. Li
Procket Networks, Inc. Procket Networks, Inc.
Editors Editors
A Border Gateway Protocol 4 (BGP-4) A Border Gateway Protocol 4 (BGP-4)
<draft-ietf-idr-bgp4-16.txt> <draft-ietf-idr-bgp4-17.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 2, line 25 skipping to change at page 2, line 25
sections of the document borrowed heavily from IDRP [7], which is the sections of the document borrowed heavily from IDRP [7], which is the
OSI counterpart of BGP. For this credit should be given to the ANSI OSI counterpart of BGP. For this credit should be given to the ANSI
X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger who was X3S3.3 group chaired by Lyman Chapin and to Charles Kunzinger who was
the IDRP editor within that group. We would also like to thank Enke the IDRP editor within that group. We would also like to thank Enke
Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey Chen, Edward Crabbe, Mike Craren, Vincent Gillet, Eric Gray, Jeffrey
Haas, Dimitry Haskin, John Krawczyk, David LeRoy, Dan Massey, Dan Haas, Dimitry Haskin, John Krawczyk, David LeRoy, Dan Massey, Dan
Pei, Mathew Richardson, John Scudder, John Stewart III, Dave Thaler, Pei, Mathew Richardson, John Scudder, John Stewart III, Dave Thaler,
Paul Traina, Russ White, Curtis Villamizar, and Alex Zinin for their Paul Traina, Russ White, Curtis Villamizar, and Alex Zinin for their
comments. comments.
Many thanks to Sue Hares for her contributions to the document, and
especially for her work on the BGP Finite State Machine.
We would like to specially acknowledge numerous contributions by We would like to specially acknowledge numerous contributions by
Dennis Ferguson. Dennis Ferguson.
2. Introduction 2. Introduction
The Border Gateway Protocol (BGP) is an inter-Autonomous System The Border Gateway Protocol (BGP) is an inter-Autonomous System
routing protocol. It is built on experience gained with EGP as routing protocol. It is built on experience gained with EGP as
defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as
described in RFC 1092 [2] and RFC 1093 [3]. described in RFC 1092 [2] and RFC 1093 [3].
skipping to change at page 33, line 21 skipping to change at page 33, line 25
8. BGP Finite State machine. 8. BGP Finite State machine.
This section specifies BGP operation in terms of a Finite State This section specifies BGP operation in terms of a Finite State
Machine (FSM). Following is a brief summary and overview of BGP Machine (FSM). Following is a brief summary and overview of BGP
operations by state as determined by this FSM. operations by state as determined by this FSM.
Initially BGP is in the Idle state. Initially BGP is in the Idle state.
Idle state: Idle state:
A manual start event is a start event initiated by an operator.
An automatic start event is a start event generated by the
system.
In this state BGP refuses all incoming BGP connections. No In this state BGP refuses all incoming BGP connections. No
resources are allocated to the peer. In response to the Start resources are allocated to the peer. In response to a Start
event (initiated by either system or operator) the local system event (manual or automatic), the local system:
initializes all BGP resources, starts the ConnectRetry timer,
initiates a transport connection to other BGP peer, while
listening for connection that may be initiated by the remote
BGP peer, and changes its state to Connect. The exact value of
the ConnectRetry timer is a local matter, but should be
sufficiently large to allow TCP initialization.
If a BGP speaker detects an error, it shuts down the connection - initializes all BGP resources,
and changes its state to Idle. Getting out of the Idle state
requires generation of the Start event. If such an event is
generated automatically, then persistent BGP errors may result
in persistent flapping of the speaker. To avoid such a
condition it is recommended that Start events should not be
generated immediately for a peer that was previously
transitioned to Idle due to an error. For a peer that was
previously transitioned to Idle due to an error, the time
between consecutive generation of Start events, if such events
are generated automatically, shall exponentially increase. The
value of the initial timer shall be 60 seconds. The time shall
be doubled for each consecutive retry. An implementation MAY
impose a configurable upper bound on that time. Once the upper
bound is reached, the speaker shall no longer automatically
generate the Start event for the peer.
Any other event received in the Idle state is ignored. - starts the ConnectRetry timer,
Connect state: - initiates a transport connection to the other BGP peer,
In this state BGP is waiting for the transport protocol - listens for a connection that may be initiated by the
remote BGP peer, and
- changes its state to Connect.
The exact value of the ConnectRetry timer is a local matter,
but it should be sufficiently large to allow TCP
initialization.
Any other event received in the Idle state is ignored.
IdleHold state:
The IdleHold state keeps the system in "Idle" mode until a
certain time period has passed or an operator intervenes to
manually restart the connection. This "IdleHold timeout"
prevents persistent flapping of a BGP peering session.
Upon entering the IdleHold state, if the IdleHoldTimer exceeds
the local limit, the "Keep Idle" flag is set.
Upon receiving a manual Start event, the local system:
- clears the IdleHoldTimer,
- clears the "Keep Idle" flag,
- initializes all BGP resources,
- starts the ConnectRetry timer,
- initiates a transport connection to the other BGP peer,
- listens for a connection that may be initiated by the
remote BGP peer, and
- changes its state to Connect.
Upon receiving an IdleHoldTimer expired event, the local system
checks whether the "Keep Idle" flag is set. If the "Keep Idle"
flag is set, the system stays in the IdleHold state.
If the "Keep Idle" flag is not set, the local system:
- clears the IdleHoldTimer, and
- changes its state to Idle.
Getting out of the IdleHold state requires either operator
intervention via a manual Start event, or expiry of the
IdleHoldTimer while the "Keep Idle" flag is clear.
Any other event received in the IdleHold state is ignored.
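The IdleHold rules above can be sketched as a small event handler. This is a minimal illustration, not part of the draft: the names State, Event, Peer, enter_idle_hold and handle_idle_hold are hypothetical, timers are modelled as plain integers in seconds, and resource initialization and transport setup are omitted.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    IDLE_HOLD = auto()
    CONNECT = auto()


class Event(Enum):
    MANUAL_START = auto()
    IDLE_HOLD_TIMER_EXPIRED = auto()
    OTHER = auto()


@dataclass
class Peer:
    state: State = State.IDLE_HOLD
    idle_hold_timer: int = 0   # seconds; 0 means "cleared" (assumption)
    keep_idle: bool = False    # the "Keep Idle" flag


def enter_idle_hold(peer: Peer, idle_hold_seconds: int, local_limit: int) -> None:
    """Enter IdleHold; set "Keep Idle" if the timer exceeds the local limit."""
    peer.state = State.IDLE_HOLD
    peer.idle_hold_timer = idle_hold_seconds
    if idle_hold_seconds > local_limit:
        peer.keep_idle = True


def handle_idle_hold(peer: Peer, event: Event) -> State:
    """Apply one event to a peer that is in the IdleHold state."""
    if event is Event.MANUAL_START:
        peer.idle_hold_timer = 0
        peer.keep_idle = False
        # Resource initialization, ConnectRetry timer start, and the
        # outgoing/listening transport connections are omitted here.
        peer.state = State.CONNECT
    elif event is Event.IDLE_HOLD_TIMER_EXPIRED:
        if not peer.keep_idle:
            peer.idle_hold_timer = 0
            peer.state = State.IDLE
        # With "Keep Idle" set, the peer stays in IdleHold.
    # Any other event is ignored.
    return peer.state
```

In this sketch a peer whose timer exceeded the local limit stays in IdleHold on timer expiry and only leaves through a manual start, which also clears the flag.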
Connect State:
In this state, BGP is waiting for the transport protocol
connection to be completed. connection to be completed.
If the transport protocol connection succeeds, the local system If the transport connection succeeds, the local system:
clears the ConnectRetry timer, completes initialization, sends
an OPEN message to its peer, and changes its state to OpenSent.
If the transport protocol connect fails (e.g., retransmission - clears the ConnectRetry timer,
timeout), the local system restarts the ConnectRetry timer,
continues to listen for a connection that may be initiated by - completes initialization,
the remote BGP peer, and changes its state to Active state.
- sends an OPEN message to its peer,
- sets the Hold timer to a large value, and
- changes its state to OpenSent.
A Hold timer value of 4 minutes is suggested.
If the transport protocol connection fails (e.g.,
retransmission timeout), the local system:
- restarts the ConnectRetry timer,
- continues to listen for a connection that may be initiated
by the remote BGP peer, and
- changes its state to Active.
In response to the ConnectRetry timer expired event, the local In response to the ConnectRetry timer expired event, the local
system restarts the ConnectRetry timer, initiates a transport system:
connection to other BGP peer, continues to listen for a
connection that may be initiated by the remote BGP peer, and
stays in the Connect state.
The Start event is ignored in the Connect state. - restarts the ConnectRetry timer,
In response to any other event (initiated by either system or - initiates a transport connection to the other BGP peer,
operator), the local system releases all BGP resources
associated with this connection and changes its state to Idle.
Active state: - continues to listen for a connection that may be initiated
by the remote BGP peer, and
- stays in Connect state.
The start event (manual or automatic) is ignored in the Connect
state.
In response to any other event (initiated by the system or
operator), the local system:
- sets IdleHoldTimer = 2**(ConnectRetryCnt)*60,
- increments ConnectRetryCnt by 1,
- sets the ConnectRetry timer to zero,
- drops the TCP connection,
- releases all BGP resources, and
- goes to the IdleHold state.
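The timer arithmetic in the list above can be illustrated as follows. This is a sketch under the assumption that the steps are applied in the order listed, so the timer is computed from the counter value before it is incremented; the function names are hypothetical.

```python
def idle_hold_seconds(connect_retry_cnt: int) -> int:
    """IdleHold timer value, in seconds: 2**(ConnectRetryCnt)*60."""
    return (2 ** connect_retry_cnt) * 60


def on_session_failure(connect_retry_cnt: int) -> tuple[int, int]:
    """Return (IdleHold timer in seconds, updated ConnectRetryCnt)."""
    timer = idle_hold_seconds(connect_retry_cnt)  # computed first
    return timer, connect_retry_cnt + 1           # then the counter is incremented


# Four consecutive failures starting from ConnectRetryCnt = 0.
cnt = 0
timers = []
for _ in range(4):
    timer, cnt = on_session_failure(cnt)
    timers.append(timer)
print(timers)  # [60, 120, 240, 480]
```

Under that ordering the first failure holds the peer for 60 seconds and each further failure doubles the hold time.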
Active State:
In this state BGP is trying to acquire a peer by listening for In this state BGP is trying to acquire a peer by listening for
and accepting a transport protocol connection. and accepting a transport protocol connection.
If the transport protocol connection succeeds, the local system If the transport connection succeeds, the local system:
clears the ConnectRetry timer, completes initialization, sends
an OPEN message to its peer, sets its Hold Timer to a large
value, and changes its state to OpenSent. A Hold Timer value of
4 minutes is suggested.
In response to the ConnectRetry timer expired event, the local - clears the ConnectRetry timer,
system restarts the ConnectRetry timer, initiates a transport
connection to the other BGP peer, continues to listen for a
connection that may be initiated by the remote BGP peer, and
changes its state to Connect.
If the local system allows BGP connections with unconfigured - completes the initialization,
peers, then when the local system detects that a remote peer is
trying to establish a BGP connection to it, and the IP address - sends the OPEN message to its peer,
of the remote peer is not a configured one, the local system
creates a temporary peer entry, completes initialization, sends - sets its Hold timer to a large value,
an OPEN message to its peer, sets its Hold Timer to a large
value, and changes its state to OpenSent. - and changes its state to OpenSent.
A Hold timer value of 4 minutes is suggested.
In response to the ConnectRetry timer expired event, the local
system:
- restarts the ConnectRetry timer,
- initiates a transport connection to the other BGP peer,
- continues to listen for a connection that may be initiated
by the remote BGP peer, and
- changes its state to Connect.
If the local system does not allow BGP connections with If the local system does not allow BGP connections with
unconfigured peers, then the local system rejects connections unconfigured peers, then the local system:
from IP addresses that are not configured peers, and remains in
the Active state.
The Start event is ignored in the Active state. - rejects connections from IP addresses that are not
configured peers,
In response to any other event (initiated by either system or - and remains in the Active state.
operator), the local system releases all BGP resources
associated with this connection and changes its state to Idle.
OpenSent state: The start events (initiated by the system or operator) are
ignored in the Active state.
In this state BGP waits for an OPEN message from its peer. In response to any other event (initiated by the system or
When an OPEN message is received, all fields are checked for operator), the local system:
correctness. If the BGP message header checking or OPEN message
checking detects an error (see Section 6.2), or a connection
collision (see Section 6.8) the local system sends a
NOTIFICATION message and changes its state to Idle.
If there are no errors in the OPEN message, BGP sends a - IdleHoldtimer = 2**(ConnectRetryCnt)*60
KEEPALIVE message and sets a KeepAlive timer. The Hold Timer,
which was originally set to a large value (see above), is - Increment ConnectRetryCnt by 1,
replaced with the negotiated Hold Time value (see section 4.2).
If the negotiated Hold Time value is zero, then the Hold Time - Set connect retry timer to zero, and
timer and KeepAlive timers are not started. If the value of the
Autonomous System field is the same as the local Autonomous - Drops TCP connection,
- Releases all BGP resources,
- Goes to IdleHold state.
OpenSent state:
In this state BGP waits for an OPEN message from its peer.
When an OPEN message is received, all fields are checked for
correctness. If the BGP message header checking or OPEN
message checking detects an error (see Section 6.2), or a
connection collision (see Section 6.8), the local system:
- sends a NOTIFICATION message,
- sets IdleHoldTimer = 2**(ConnectRetryCnt)*60,
- increments ConnectRetryCnt by 1,
- sets the ConnectRetry timer to zero,
- drops the TCP connection,
- releases all BGP resources, and
- goes to the IdleHold state.
If there are no errors in the OPEN message, the local system:
- sends a KEEPALIVE message,
- sets a KeepAlive timer (as described below),
- sets the Hold timer according to the negotiated value (see
section 4.2), and
- changes its state to OpenConfirm.
If the negotiated Hold time value is zero, then the Hold Time
timer and KeepAlive timers are not started. If the value of
the Autonomous System field is the same as the local Autonomous
System number, then the connection is an "internal" connection; System number, then the connection is an "internal" connection;
otherwise, it is "external". (This will affect UPDATE otherwise, it is an "external" connection. (This will impact
processing as described below.) Finally, the state is changed UPDATE processing as described below.)
to OpenConfirm.
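The two checks described in the paragraph above (internal versus external connection, and the zero Hold Time case) can be sketched as follows. The function names and the integer representation of AS numbers and Hold Time are assumptions made for illustration only.

```python
def classify_connection(local_as: int, peer_as: int) -> str:
    """Compare the Autonomous System field of the received OPEN with the local AS."""
    return "internal" if peer_as == local_as else "external"


def timers_are_started(negotiated_hold_time: int) -> bool:
    """The Hold and KeepAlive timers are not started when the negotiated
    Hold Time value is zero."""
    return negotiated_hold_time != 0


print(classify_connection(65001, 65001))  # internal
print(classify_connection(65001, 65002))  # external
print(timers_are_started(0))              # False
```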
If a disconnect notification is received from the underlying If a disconnect NOTIFICATION is received from the underlying
transport protocol, the local system closes the BGP connection, transport protocol, the local system:
restarts the ConnectRetry timer, while continue listening for
connection that may be initiated by the remote BGP peer, and
goes into the Active state.
If the Hold Timer expires, the local system sends NOTIFICATION - closes the BGP connection,
message with error code Hold Timer Expired and changes its
state to Idle.
In response to the Stop event (initiated by either system or - restarts the Connect Retry timer,
operator) the local system sends NOTIFICATION message with
Error Code Cease and changes its state to Idle.
The Start event is ignored in the OpenSent state. - and continues to listen for a connection that may be
initiated by the remote BGP peer, and goes into Active
state.
In response to any other event the local system sends If the Hold Timer expires, the local system:
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGP changes its state from OpenSent to Idle, it closes - send a NOTIFICATION message with error code Hold Timer
the BGP (and transport-level) connection and releases all Expired,
resources associated with that connection.
OpenConfirm state: - IdleHoldtimer = 2**(ConnectRetryCnt)*60
- Increment ConnectRetryCnt by 1,
- Set connect retry timer to zero, and
- Drops TCP connection,
- Releases all BGP resources, and
- Goes to IdleHold state.
The Start event (manual or automatic) is ignored in the
OpenSent state.
If a NOTIFICATION message is received with a version error, the
local system:
- closes the transport connection,
- releases all BGP resources,
- sets ConnectRetryCnt to zero,
- sets the ConnectRetry timer to zero, and
- changes its state to Idle.
If any other NOTIFICATION is received, the local system:
- sets IdleHoldTimer = 2**(ConnectRetryCnt)*60,
- increments ConnectRetryCnt by 1,
- sets the ConnectRetry timer to zero,
- drops the TCP connection,
- releases all BGP resources, and
- goes to the IdleHold state.
In response to any other event, the local system:
- sends a NOTIFICATION message with Error Code Finite State
Machine Error,
- sets IdleHoldTimer = 2**(ConnectRetryCnt)*60,
- increments ConnectRetryCnt by 1,
- sets the ConnectRetry timer to zero,
- drops the TCP connection,
- releases all BGP resources, and
- goes to the IdleHold state.
OpenConfirm state:
In this state BGP waits for a KEEPALIVE or NOTIFICATION In this state BGP waits for a KEEPALIVE or NOTIFICATION
message. message.
If the local system receives a KEEPALIVE message, it changes If the local system receives a KEEPALIVE message, it changes
its state to Established. its state to Established.
If the Hold Timer expires before a KEEPALIVE message is If the Hold Timer expires before a KEEPALIVE message is
received, the local system sends NOTIFICATION message with received, the local system:
error code Hold Timer Expired and changes its state to Idle.
If the local system receives a NOTIFICATION message, it changes - send the NOTIFICATION message with the error code Hold
its state to Idle. Timer Expired,
If the KeepAlive timer expires, the local system sends a - sets IdleHoldTimer = 2**(ConnectRetryCnt)*60
KEEPALIVE message and restarts its KeepAlive timer. - Increments ConnectRetryCnt by 1,
If a disconnect notification is received from the underlying - Sets the connect retry timer to zero,
transport protocol, the local system changes its state to Idle.
In response to the Stop event (initiated by either system or - Drop the TCP connection,
operator) the local system sends NOTIFICATION message with
Error Code Cease and changes its state to Idle. - Releases all BGP resources,
- Goes to IdleHoldState.
If the local system receives a NOTIFICATION message or receives
a disconnect notification from the underlying transport
protocol, the local system:
- Sets IdleHold Timer = 2**(ConnectRetryCnt)*60
- Increments ConnectRetryCnt by 1,
- Sets the connect retry timer to zero,
- Drops the TCP connection,
- Releases all BGP resources,
- Goes to IdleHoldstate.
In response to the Stop event initiated by the system, the
local system:
- sends the NOTIFICATION message with Cease,
- sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
- Increments ConnectRetryCnt by 1,
- Sets the Connect retry timer to zero,
- Drops the TCP connection,
- Releases all BGP resources,
- Goes to IdleHoldstate.
In response to a Stop event initiated by the operator, the
local system:
- sends the NOTIFICATION message with Cease,
- releases all BGP resources
- sets the ConnectRetryCnt to zero
- sets the connect retry timer to 0
- transitions to Idle state.
The Start event is ignored in the OpenConfirm state. The Start event is ignored in the OpenConfirm state.
In response to any other event the local system sends In response to any other event, the local system:
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGP changes its state from OpenConfirm to Idle, it - sends a NOTIFICATION with a code of Finite State Machine
closes the BGP (and transport-level) connection and releases Error,
all resources associated with that connection.
Established state: - sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
In the Established state BGP can exchange UPDATE, NOTIFICATION, - Increments ConnectRetryCnt by 1,
- Sets the Connect retry timer to zero,
- Drops the TCP connection,
- Releases all BGP resources,
- Goes to IdleHoldstate.
Established State:
In the Established state BGP can exchange UPDATE, NOTIFICATION,
and KEEPALIVE messages with its peer. and KEEPALIVE messages with its peer.
If the local system receives an UPDATE or KEEPALIVE message, it If the local system receives an UPDATE or KEEPALIVE message, it
restarts its Hold Timer, if the negotiated Hold Time value is restarts its Hold Timer, if the negotiated Hold Time value is
non-zero. non-zero.
If the local system receives a NOTIFICATION message, it changes If the local system receives a NOTIFICATION message or a
its state to Idle. disconnect from the underlying transport protocol, it:
If the local system receives an UPDATE message and the UPDATE - sets IdleHoldtimer = 2**(ConnectRetryCnt)*60,
message error handling procedure (see Section 6.3) detects an
error, the local system sends a NOTIFICATION message and
changes its state to Idle.
If a disconnect notification is received from the underlying - Increments ConnectRetryCnt by 1,
transport protocol, the local system changes its state to Idle.
If the Hold Timer expires, the local system sends a - Sets the Connect retry timer to zero,
NOTIFICATION message with Error Code Hold Timer Expired and
changes its state to Idle. - Drops the TCP connection,
- Releases all BGP resources, and
- Goes to IdleHoldstate.
If the local system receives an UPDATE message, and the UPDATE
message error handling procedure (see Section 6.3) detects an
error, the local system:
- sends a NOTIFICATION message with Update error,
- sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
- Increments ConnectRetryCnt by 1,
- Sets the Connect retry timer to zero,
- Drops the TCP connection,
- Releases all BGP resources, and
- Goes to IdleHoldstate.
If the Hold timer expires, the local system:
- sends a NOTIFICATION message with Error Code Hold Timer
Expired,
- sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
- Increments ConnectRetryCnt by 1,
- Sets the connect retry timer to zero,
- Drops the TCP connection,
- Releases all BGP resources,
- Goes to IdleHold state.
If the KeepAlive timer expires, the local system sends a If the KeepAlive timer expires, the local system sends a
KEEPALIVE message and restarts its KeepAlive timer. KEEPALIVE message and restarts its KeepAlive timer, unless the
negotiated Hold Time value is zero.
Each time the local system sends a KEEPALIVE or UPDATE message, Each time the local system sends a KEEPALIVE or UPDATE
it restarts its KeepAlive timer, unless the negotiated Hold message, it restarts its KeepAlive timer, unless the negotiated
Time value is zero. Hold Time value is zero.
In response to the Stop event (initiated by either system or In response to the Stop event initiated by the system
operator), the local system sends a NOTIFICATION message with (automatic), the local system:
Error Code Cease and changes its state to Idle.
- sends a NOTIFICATION with Cease,
- sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
- increments ConnectRetryCnt by 1,
- sets the connect retry timer to zero,
- drops the TCP connection,
- releases all BGP resources,
- goes to IdleHold state, and
- deletes all routes.
An example of an automatic Stop event is a peer exceeding the
number of prefixes allowed for it, in response to which the
local system automatically disconnects the peer.
In response to a Stop event initiated by an operator, the
local system:
- releases all resources (including deleting all routes),
- sets ConnectRetryCnt to zero (0),
- sets the ConnectRetry timer to zero (0), and
- changes its state to Idle.
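The difference between the two Stop events in the Established state can be sketched as below. This is an illustration only: StopResult and stop_in_established are hypothetical names, states are plain strings, and the sketch models just the bookkeeping (counter, timers, next state), not the connection teardown or route deletion. The operator branch models no NOTIFICATION because the operator list above does not mention one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopResult:
    next_state: str
    connect_retry_cnt: int
    connect_retry_timer: int
    idle_hold_timer: Optional[int]  # seconds; None when the timer is not armed
    sends_cease: bool


def stop_in_established(connect_retry_cnt: int, by_operator: bool) -> StopResult:
    """Bookkeeping for a Stop event received in the Established state."""
    if by_operator:
        # Operator stop: counters are reset and the peer goes straight to Idle.
        return StopResult("Idle", 0, 0, None, False)
    # Automatic stop: back off via IdleHold, with the timer computed
    # before the retry counter is incremented.
    return StopResult(
        next_state="IdleHold",
        connect_retry_cnt=connect_retry_cnt + 1,
        connect_retry_timer=0,
        idle_hold_timer=(2 ** connect_retry_cnt) * 60,
        sends_cease=True,
    )
```

For example, with ConnectRetryCnt at 2, an automatic stop in this sketch yields a 240 second IdleHold timer and a counter of 3, while an operator stop resets the counter to zero.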
The Start event is ignored in the Established state. The Start event is ignored in the Established state.
In response to any other event, the local system sends In response to any other event, the local system:
NOTIFICATION message with Error Code Finite State Machine Error
and changes its state to Idle.
Whenever BGP changes its state from Established to Idle, it - sends a NOTIFICATION message with Error Code Finite State
closes the BGP (and transport-level) connection, releases all Machine Error,
resources associated with that connection, and deletes all
routes derived from that connection. - sets IdleHoldtimer = 2**(ConnectRetryCnt)*60
- increments ConnectRetryCnt by 1,
- sets the connect retry timer to zero,
- drops the TCP connection,
- releases all BGP resources
- goes to IdleHoldstate, and
- deletes all routes.
9. UPDATE Message Handling 9. UPDATE Message Handling
An UPDATE message may be received only in the Established state. An UPDATE message may be received only in the Established state.
When an UPDATE message is received, each field is checked for When an UPDATE message is received, each field is checked for
validity as specified in Section 6.3. validity as specified in Section 6.3.
If an optional non-transitive attribute is unrecognized, it is If an optional non-transitive attribute is unrecognized, it is
quietly ignored. If an optional transitive attribute is unrecognized, quietly ignored. If an optional transitive attribute is unrecognized,
the Partial bit (the third high-order bit) in the attribute flags the Partial bit (the third high-order bit) in the attribute flags
skipping to change at page 44, line 37 skipping to change at page 50, line 42
remove m from consideration remove m from consideration
In the pseudo-code above, cost(n) is a function which returns the In the pseudo-code above, cost(n) is a function which returns the
cost of the path (interior distance) to the address given in the cost of the path (interior distance) to the address given in the
NEXT_HOP attribute of the route. NEXT_HOP attribute of the route.
f) Remove from consideration all routes other than the route that f) Remove from consideration all routes other than the route that
was advertised by the BGP speaker whose BGP Identifier has the was advertised by the BGP speaker whose BGP Identifier has the
lowest value. lowest value.
g) Prefer the route received from the lowest neighbor address.
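The last two tie-breaking steps, f) and g), can be sketched as follows. This is a minimal illustration under stated assumptions: Candidate and tie_break are hypothetical names, the BGP Identifier is given in dotted-quad text form, and both the identifier and the neighbor address are compared as unsigned integers.

```python
import ipaddress
from typing import NamedTuple


class Candidate(NamedTuple):
    prefix: str
    bgp_identifier: str    # BGP Identifier of the advertising speaker
    neighbor_address: str  # address of the neighbor the route came from


def _bgp_id(c: Candidate) -> int:
    return int(ipaddress.IPv4Address(c.bgp_identifier))


def tie_break(candidates: list[Candidate]) -> Candidate:
    """Apply step f) (lowest BGP Identifier), then step g) (lowest neighbor address)."""
    lowest_id = min(_bgp_id(c) for c in candidates)
    remaining = [c for c in candidates if _bgp_id(c) == lowest_id]
    return min(remaining, key=lambda c: int(ipaddress.ip_address(c.neighbor_address)))


routes = [
    Candidate("10.0.0.0/8", "192.0.2.2", "198.51.100.1"),
    Candidate("10.0.0.0/8", "192.0.2.1", "198.51.100.7"),
    Candidate("10.0.0.0/8", "192.0.2.1", "198.51.100.3"),
]
print(tie_break(routes).neighbor_address)  # 198.51.100.3
```

Step g) only matters when more than one route survives step f), for example when the same speaker is reached over several sessions.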
9.1.3 Phase 3: Route Dissemination 9.1.3 Phase 3: Route Dissemination
The Phase 3 decision function shall be invoked on completion of Phase The Phase 3 decision function shall be invoked on completion of Phase
2, or when any of the following events occur: 2, or when any of the following events occur:
a) when routes in the Loc-RIB to local destinations have changed a) when routes in the Loc-RIB to local destinations have changed
b) when locally generated routes learned by means outside of BGP b) when locally generated routes learned by means outside of BGP
have changed have changed
c) when a new BGP speaker - BGP speaker connection has been c) when a new BGP speaker - BGP speaker connection has been
established established
The Phase 3 function is a separate process which completes when it The Phase 3 function is a separate process which completes when it
has no further work to do. The Phase 3 Routing Decision function has no further work to do. The Phase 3 Routing Decision function
shall be blocked from running while the Phase 2 decision function is shall be blocked from running while the Phase 2 decision function is
in process. in process.
All routes in the Loc-RIB shall be processed into Adj-RIBs-Out All routes in the Loc-RIB shall be processed into Adj-RIBs-Out
according to configured policy. This policy may exclude a route in according to configured policy. This policy may exclude a route in
the Loc-RIB from being installed in a particular Adj-RIB-Out. A the Loc-RIB from being installed in a particular Adj-RIB-Out. A
route shall not be installed in the Adj-Rib-Out unless the route shall not be installed in the Adj-Rib-Out unless the
destination and NEXT_HOP described by this route may be forwarded destination and NEXT_HOP described by this route may be forwarded
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/