Network Working Group                                Sira Panduranga Rao
Internet Draft                                                       UTA
Expiration Date: September 2001                               Alex Zinin
File name: draft-ietf-ospf-dc-01.txt                           Abhay Roy
                                                           Cisco Systems

                                                              March 2001

         Detecting Inactive Neighbors over OSPF Demand Circuits
                       draft-ietf-ospf-dc-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups. Note that other
   groups may also distribute working documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months. Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time. It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a "working
   draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract
   OSPF [RFC2328] is a link-state intra-domain routing protocol used in
   IP networks. OSPF behavior over demand circuits is optimized in
   [RFC1793] to minimize the amount of overhead traffic. A part of OSPF
   demand circuit extensions is the Hello suppression mechanism. This
   technique allows a demand circuit to go down when no interesting
   traffic is going through the link. However, it also introduces a
   problem, where it becomes impossible to detect an OSPF-inactive
   neighbor over such a link. This memo addresses the above problem by
   introducing two mechanisms: limitation of the number of LSA
   retransmits and neighbor probing.

1. Motivation

   In some situations, when operating over demand circuits, the remote
   neighbor may be unable to run OSPF, and, as a possible result, unable
   to route application traffic. Possible scenarios include:

   o    The OSPF process might have died on the remote neighbor.

   o    Oversubscription (Section 7 of [RFC1793]) may cause a continuous
        drop of application data at the link level.

   The problem here is that the local router cannot identify problems
   such as these, since Hello exchange is suppressed on demand
   circuits.  If the topology of the network is such that other routers
   cannot communicate their knowledge about the remote neighbor via
   flooding, the local router and all routers behind it will never know
   about the problem, so application traffic may continue being for-
   warded to the OSPF-incapable router.

   This memo describes two techniques that solve the described problem.
   First, the number of LSA retransmit attempts on demand circuits is
   limited. This method addresses most network scenarios, but alone is
   not enough to cover all possible cases, so we extend it by
   introducing a backward-compatible neighbor probing mechanism.

2. Proposed Solution

   The first part of the solution is limiting the number of times LSAs
   can be retransmitted over a demand circuit. If the remote router
   does not acknowledge the LSAs, the neighbor is taken to be no longer
   reachable. See Section 2.1 for more details.

   The second part of the solution this document proposes uses link
   state update packets to detect whether the OSPF process is
   operational on the remote neighbor. We call this process "Neighbor
   probing". The idea behind this technique is to allow either of the
   two neighbors connected over a demand circuit to test the remote
   neighbor at any time (see Section 2.2).

   The routers across the demand circuit can be connected by either a
   point-to-point link, or a virtual link, or a point-to-multipoint
   interface. The case of routers connected by broadcast networks or
   NBMA is not considered, since Hello suppression is not used in these
   cases (Section 3.2 [RFC1793]).

   The neighbor probing mechanism is used as follows. After a router
   has synchronized the LSDB with its neighbor over the demand circuit,
   the demand circuit may be torn down if there is no more application
   traffic. When application traffic starts going over the link, the
   link is brought up, and the routers may probe each other. The routers
   may also probe each other any time the link is up (this could be
   implemented as a configurable option), with the caution that OSPF
   packets sent as a part of neighbor probing are not considered
   interesting traffic and do not cause the demand circuit to remain up
   (relevant details of implementation are outside the scope of this
   document).

   Implementations should consider the case when one or more of the
   router's links are oversubscribed (see Section 7 of [RFC1793]). In
   such a situation, even if the link status is up and application data
   is being sent on the link, only a limited number of neighbors is
   really reachable. To make sure temporarily unreachable neighbors are
   not mistakenly declared down, neighbor probing should be restricted
   to those neighbors that are actually reachable (i.e., there is a
   circuit established with the neighbor at the moment the probing
   procedure needs to be initiated). This check itself is also
   considered an implementation detail.
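
   The reachability check described above could be sketched as follows.
   This is only an illustration: the Neighbor record, its field names,
   and the probe_candidates() helper are assumptions made for the
   example and are not defined by this memo or by [RFC1793].

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """Hypothetical per-neighbor state kept by the local router."""
    router_id: str
    adjacency_full: bool        # LSDB already synchronized with this neighbor
    circuit_established: bool   # a circuit to this neighbor exists right now


def probe_candidates(neighbors):
    """Return only the neighbors that may safely be probed at this moment.

    A neighbor without an established circuit (for example, on an
    oversubscribed interface) is skipped so that it is not mistakenly
    declared down.
    """
    return [n for n in neighbors
            if n.adjacency_full and n.circuit_established]
```

   With such a filter, a neighbor that is only temporarily unreachable
   is simply left out of the current probing round.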

 2.1 Limiting Number of LSA Retransmissions

   In the presence of LSAs that need to be flooded through a
   demand-circuit link, it is possible to identify OSPF-incapable
   neighbors by limiting the number of LSA retransmits over a demand
   circuit. The router should count the number of retransmit attempts
   for each neighbor reachable through a demand-circuit link. When an
   LSA is acknowledged by the neighbor, the router should zero the
   counter. When the counter reaches a predefined (or configured)
   value, a KillNbr event should be generated for the neighbor
   experiencing the problem.

   Note that this method does not require cooperation of the routers on
   both sides of a demand circuit and can be used with already installed
   OSPF routers without requiring them to be upgraded with new software.
   This method was implemented by Cisco Systems in 1998, has been
   widely deployed, and has proven its validity.
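
   The per-neighbor retransmit counter can be illustrated with a
   minimal sketch. The class name, method names, and the default limit
   of 12 are assumptions chosen for the example; this memo only
   requires a predefined or configured value, and kill_nbr() merely
   stands in for the KillNbr event of [RFC2328].

```python
from dataclasses import dataclass

# Hypothetical default; the memo only says "predefined (or configured)".
DEFAULT_RETRANSMIT_LIMIT = 12


@dataclass
class DemandCircuitNeighbor:
    """Per-neighbor retransmission bookkeeping (illustrative only)."""
    router_id: str
    retransmit_limit: int = DEFAULT_RETRANSMIT_LIMIT
    retransmit_count: int = 0
    adjacency_up: bool = True

    def on_lsa_retransmitted(self) -> bool:
        """Count one retransmit attempt; return True if KillNbr fired."""
        if not self.adjacency_up:
            return False
        self.retransmit_count += 1
        if self.retransmit_count >= self.retransmit_limit:
            self.kill_nbr()
            return True
        return False

    def on_lsa_acknowledged(self) -> None:
        """An acknowledgement from the neighbor zeroes the counter."""
        self.retransmit_count = 0

    def kill_nbr(self) -> None:
        """Stand-in for generating the KillNbr neighbor event."""
        self.adjacency_up = False
        self.retransmit_count = 0
```

   Because the counter is reset by any acknowledgement, only a run of
   consecutive unacknowledged retransmissions brings the neighbor down.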

 2.2 Neighbor Probing

   The method described in Section 2.1 is not sufficient to cover the
   situation when the topology of the network is stable and the demand
   circuit link does not change its state (and hence no LSAs are flooded
   through the demand circuit). For this case there must be a mechanism
   to explicitly verify the neighbor's OSPF capability.

   The neighbor probing method described in this section is completely
   compatible with standard OSPF implementations, because it is based on
   standard behavior that must be followed by OSPF implementations in
   order to keep their LSDBs synchronized.

   When a router needs to verify the OSPF capability of a neighbor
   reachable through a demand circuit, it should flood to the neighbor
   any LSA in its LSDB that would normally be sent to the neighbor
   during the initial LSDB synchronization process (in most cases such
   an LSA must have already been flooded to the neighbor by the time the
   probing procedure starts). For example, the router may flood its own
   router-LSA (without originating a new version), or the neighbor's own
   router-LSA. If the neighbor is still alive and OSPF-capable, it
   replies with a link state acknowledgement and the LSA is removed from
   the neighbor's retransmission list. If no acknowledgement is
   received, the mechanism described in Section 2.1 brings the adjacency
   down.
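
   The probing procedure on the sending side can be sketched as shown
   below. The ProbedNeighbor class, its method names, the string LSA
   identifiers, and the limit of 3 are hypothetical and serve only to
   show how a probe reuses the retransmission list and the limit of
   Section 2.1; actual packet transmission is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class ProbedNeighbor:
    """Illustrative sender-side state for one demand-circuit neighbor."""
    router_id: str
    retransmit_limit: int = 3          # hypothetical configured value
    retransmit_count: int = 0
    adjacency_up: bool = True
    # LSAs sent but not yet acknowledged, keyed by a hypothetical LSA id.
    retransmission_list: set = field(default_factory=set)

    def start_probe(self, lsa_id: str) -> None:
        """Re-flood an LSA already in the LSDB (no new instance made)."""
        self.retransmission_list.add(lsa_id)

    def on_ack(self, lsa_id: str) -> None:
        """Neighbor is alive: drop the LSA from the list, zero the counter."""
        self.retransmission_list.discard(lsa_id)
        self.retransmit_count = 0

    def on_rxmt_timer(self) -> None:
        """Retransmit pending LSAs and apply the retransmission limit."""
        if not self.adjacency_up or not self.retransmission_list:
            return
        self.retransmit_count += 1
        if self.retransmit_count >= self.retransmit_limit:
            self.adjacency_up = False          # stands in for KillNbr
            self.retransmission_list.clear()
```

   In this sketch a live neighbor ends the probe with a single
   acknowledgement, while a silent neighbor is brought down after the
   configured number of retransmissions.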

   Note that when the neighbor being probed receives such a link state
   update packet, it acknowledges the LSA but does not flood it further,
   since the received copy of the LSA is considered to be the same as
   the neighbor's database copy. Because of this property, the link
   state update based neighbor probing mechanism is localized to the
   demand circuit and does not increase flooding in the area.
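
   The behavior of the probed neighbor can be illustrated with a
   simplified sketch of the receive-side decision. The LsaInstance
   tuple and handle_received_lsa() are assumptions made for the
   example; instances are compared by sequence number and checksum
   only, and the LS age rules and other steps of the flooding procedure
   in [RFC2328] are deliberately left out.

```python
from typing import NamedTuple


class LsaInstance(NamedTuple):
    lsa_id: str     # hypothetical key for (LS type, LS ID, advertising router)
    seq: int        # LS sequence number
    checksum: int   # LS checksum


def handle_received_lsa(lsdb: dict, received: LsaInstance) -> tuple:
    """Return (acknowledge, flood_further) for one received LSA.

    Simplified comparison: sequence number first, then checksum.
    """
    current = lsdb.get(received.lsa_id)
    new_key = (received.seq, received.checksum)
    if current is not None and (current.seq, current.checksum) == new_key:
        # Same instance as the database copy (the probe case):
        # acknowledge it, but do not flood it any further.
        return True, False
    if current is None or new_key > (current.seq, current.checksum):
        # Newer instance: install it, acknowledge it, and flood it on.
        lsdb[received.lsa_id] = received
        return True, True
    # Database copy is more recent: no acknowledgement in this sketch
    # (a real implementation sends its newer copy back instead).
    return False, False
```

   Only the first branch is exercised by neighbor probing, which is why
   a probe stays local to the demand circuit.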

   Again, the implementation should ensure (through internal mechanisms)
   that OSPF link state update packets sent over the demand circuit for
   the purpose of neighbor probing do not prevent that circuit from
   being torn down.

3. Support of Virtual Links and Point-to-multipoint Interfaces

   Virtual links can be treated analogously to point-to-point links, so
   the techniques described in this memo are applicable to virtual links
   as well. A point-to-multipoint interface running as a demand circuit
   (Section 3.5 of [RFC1793]) can be treated as a set of individual
   point-to-point links, for which the solution has been described in
   Section 2.

4. Compatibility issues

   All mechanisms described in this document are completely
   backward-compatible.

5. Considerations

   In addition to the lost functionality mentioned in Section 6 of
   [RFC1793], there is an added overhead in terms of the amount of data
   (link state updates and acknowledgements) transmitted due to neighbor
   probing whenever the link is up, which increases the overall cost.

6. Acknowledgements
   The authors would like to thank John Moy, Vijayapal Reddy Patil, SVR
   Anand, and Peter Psenak for their comments on this work.

   A significant portion of Sira's work was carried out as part of the
   HFCL-IISc Research Project (HIRP), Bangalore, India. He would like to
   thank the team for their insightful discussions.

7. References

[RFC2328]
     J. Moy, "OSPF Version 2", RFC 2328, Internet Engineering Task
     Force, 1998. ftp://ftp.isi.edu/in-notes/rfc2328.txt

[RFC1793]
     J. Moy, "Extending OSPF to Support Demand Circuits", RFC 1793,
     Internet Engineering Task Force, 1995.
     ftp://ftp.isi.edu/in-notes/rfc1793.txt

[LLS] Zinin, Friedman, Roy, Nguyen, Yeung, "OSPF Link-local Signaling",
     draft-ietf-ospf-lls-00.txt, Work in progress.

[HELLO]
     Zinin, Roy, Nguyen, "OSPF Restart Signaling",
     draft-ietf-ospf-restart-00.txt, Work in progress.

[OOB] Zinin, Roy, Nguyen, "OSPF Out-of-band LSDB resynchronization",
     draft-ietf-ospf-oob-resync-00.txt, Work in progress.

8. Authors' addresses

   Sira Panduranga Rao
   The University of Texas at Arlington
   Arlington, TX 76013
   Email: siraprao@hotmail.com

   Alex Zinin
   Cisco Systems
   150 West Tasman Dr.
   San Jose, CA 95134
   Email: azinin@cisco.com

   Abhay Roy
   Cisco Systems
   170 W. Tasman Dr.
   San Jose, CA 95134
   USA
   E-mail: akr@cisco.com