Network Working Group                                           B. Black
Internet Draft                                           Layer8 Networks
Updates: 3036                                                K. Kompella
Category: Standards Track                               Juniper Networks
Expires: December 2003                                         June 2003

                   MTU Signalling Extensions for LDP
               draft-ietf-mpls-ldp-mtu-extensions-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
              http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
              http://www.ietf.org/shadow.html.

   This Internet-Draft will expire in December 2003.

Copyright Notice

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   Proper functioning of RFC 1191 path MTU discovery requires that IP
   routers have knowledge of the MTU for each link to which they are
   connected.  As currently specified, the Label Distribution Protocol
   (LDP) does not have the ability to signal the MTU for a Label
   Switched Path (LSP) to the ingress Label Switching Router (LSR).  In
   the absence of this functionality, the MTU for each LSP must be
   statically configured by network operators or by equivalent, off-line
   mechanisms.

   This document specifies extensions to the LDP label distribution
   protocol in support of LSP MTU discovery.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [1].

1. Introduction

   As currently specified in [2], the LDP protocol for MPLS does not
   support signalling of the MTU for LSPs to ingress LSRs.  This
   functionality is essential to the proper functioning of RFC 1191 path
   MTU discovery [3].  Without knowledge of the MTU for an LSP, edge
   LSRs may transmit packets along that LSP which are, according to [4],
   too big.  Such packets may be silently discarded by LSRs along the
   LSP, effectively preventing communication between certain end hosts.

   The solution proposed in this document enables automatic
   determination of the MTU for an LSP with the addition of a TLV to
   carry MTU information for a FEC between adjacent LSRs in LDP Label
   Mapping messages.  This information is sufficient for a set of LSRs
   along the path followed by an LSP to discover either the exact MTU
   for that LSP, or an approximation which is no worse than could be
   generated with local information on the ingress LSR.

1.1. Changes from last version

   The biggest change, protocol-wise, is that the notion of 'egress'
   interface has been removed.  The LSP MTU at the egress is now 65535.
   This has repercussions on the processing of the MTU TLV.  Also, the
   MTU TLV now has both the U and F bits set.

   A number of definitions have been introduced to clarify the
   exposition.  Also, the examples have been changed significantly.

2. MTU Signalling

   The signalling procedure described in this document employs the
   addition of a single TLV to LDP Label Mapping messages and a simple
   algorithm for LSP MTU calculation.

2.1. Definitions

   Link MTU: the MTU of a given link.  This size includes the IP header
   and data (or other payload) and the label stack, but does not include
   any lower-level headers.  A link may be an interface (such as
   Ethernet or Packet-over-SONET), a tunnel (such as GRE or IPsec) or an
   LSP.

   Peer LSRs: for LSR A and FEC F, this is the set of LSRs that sent a
   Label Mapping for FEC F to A.

   Downstream LSRs: for LSR A and FEC F, this is the subset of A's peer
   LSRs for FEC F to whom A will forward packets for the FEC.
   Typically, this subset is determined via the routing table.

   Hop MTU: the MTU of an LSP hop between an upstream LSR A and a
   downstream LSR B.  This size includes the IP header and data (or
   other payload) and the part of the label stack that is considered
   payload as far as this LSP goes.  It does not include any lower-level
   headers.  (Note: if there are multiple links between A and B, the Hop
   MTU is the minimum of the MTU of those links used for forwarding.)

   LSP MTU: the MTU of an LSP from a given LSR to the egress(es), over
   each valid (forwarding) path.  This size includes the IP header and
   data (or other payload) and any part of the label stack that was
   received by the ingress LSR before it placed the packet into the LSP
   (this part of the label stack is considered part of the payload for
   this LSP).  The size does not include any lower-level headers.

2.2. Example

   Consider LSRs A-F interconnected as follows:

                 M       P
               _____ C =====
              /      |      \
     A ~~~~~ B ===== D ----- E ----- F
         L       N       Q       R

   Say that the link MTU for link L is 9216, for links M, Q and R is
   4470, and for N and P is 1500.

   Consider a FEC X for which F is the egress, and say that all LSRs
   advertise X to their neighbors.

   C's peers for FEC X are B, D and E.  Say C chooses E as its
   downstream LSR for X.  Similarly, A chooses B, B chooses C and D, D
   chooses E and E chooses F (respectively) as their downstream LSRs.

   C's Hop MTU to E for FEC X is 1496.  B's Hop MTU to C is 4466, and to
   D is 1496.  A's LSP MTU for FEC X is 1496.  If A has another LSP for
   FEC Y to F (learned via targeted LDP) that rides over the LSP for
   FEC X, the MTU for that LSP would be 1492.

   If B had a targeted LDP session to E over which B received a Mapping
   for X, then E would also be B's peer, and E may be chosen as a
   downstream LSR for B.

   This memo describes how A determines its LSP MTU for FECs X and Y.
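
   The Hop MTU figures in this example follow from simple arithmetic.
   The following is a minimal sketch, assuming that each hop pushes a
   single label for the LSP and that a label stack entry occupies 4
   octets as in [4].  The function and variable names are illustrative
   only and are not part of this specification.

```python
LABEL_ENTRY_OCTETS = 4  # size of one MPLS label stack entry


def hop_mtu(link_mtu, labels_pushed=1):
    """Room left for the LSP payload once this hop's labels are pushed."""
    return link_mtu - LABEL_ENTRY_OCTETS * labels_pushed


# Link MTUs taken from the example topology.
link_mtu = {"L": 9216, "M": 4470, "N": 1500, "P": 1500, "Q": 4470, "R": 4470}

c_to_e = hop_mtu(link_mtu["P"])  # C's Hop MTU to E
b_to_c = hop_mtu(link_mtu["M"])  # B's Hop MTU to C
b_to_d = hop_mtu(link_mtu["N"])  # B's Hop MTU to D

# An LSP for FEC Y that rides over the LSP for FEC X treats X as its
# link, so one more label entry comes off X's LSP MTU of 1496.
fec_x_lsp_mtu = 1496
fec_y_lsp_mtu = hop_mtu(fec_x_lsp_mtu)

print(c_to_e, b_to_c, b_to_d, fec_y_lsp_mtu)  # 1496 4466 1496 1492
```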

2.3. Signalling Procedure

   The procedure for signalling the MTU is performed hop-by-hop by each
   LSR L along an LSP for a given FEC F.  The steps are as follows:

   1.  First, L computes its LSP MTU for FEC F:

       A.  If L is the egress for F, L sets the LSP MTU for F to 65535.

       B.  If L is not the egress LSR, L computes the LSP MTU for F as
           follows:

           a)  L determines its downstream neighbors for FEC F.

           b)  For each downstream neighbor Z, L computes the minimum of
               the Hop MTU to Z and the LSP MTU in the MTU TLV that Z
               advertised to L.  If Z did not include the MTU TLV in its
               Label Mapping, then Z's LSP MTU is set to 65535.

           c)  L sets its LSP MTU to the minimum of the MTUs it computed
               for its downstream neighbors.

   2.  For each LDP neighbor (direct or targeted) of L to which L
       decides to send a Mapping for FEC F, L attaches an MTU TLV with
       the MTU that it computed for this FEC.  L MAY (because of policy
       or other reasons) advertise a smaller MTU than it has computed,
       but L MUST NOT advertise a larger MTU.

   3.  When a new MTU is received for FEC F from a downstream LSR, or
       the set of downstream LSRs for F changes, L returns to Step 1.
       If the newly computed LSP MTU is unchanged, L SHOULD NOT
       advertise new information to its neighbors.  Otherwise, L
       readvertises its Mappings for F to all its peers with an updated
       MTU TLV.

       This behavior is standard for attributes such as path vector and
       hop count, and the same rules apply, as specified in [2].

       If the LSP MTU decreases, L SHOULD readvertise the new MTU
       immediately; if the LSP MTU increases, L MAY hold down the
       readvertisement.
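
   The computation in Step 1 and the advertisement limit in Step 2 can
   be sketched as follows.  This is an illustration under stated
   assumptions, not a reference implementation: the function names, the
   dictionary layout and the error raised when a non-egress LSR has no
   downstream neighbors are all hypothetical choices that this document
   does not specify.

```python
from typing import Dict, Optional

NO_LIMIT = 65535  # LSP MTU at the egress, and for a peer that sent no MTU TLV


def compute_lsp_mtu(is_egress: bool,
                    hop_mtu: Dict[str, int],
                    advertised: Dict[str, Optional[int]]) -> int:
    """Step 1: LSP MTU of an LSR for one FEC.

    hop_mtu maps each downstream neighbor to the Hop MTU towards it.
    advertised maps a neighbor to the MTU from its MTU TLV; a missing
    entry or None means the neighbor sent no MTU TLV.
    """
    if is_egress:
        return NO_LIMIT
    if not hop_mtu:
        # Assumption: the document leaves this case open.
        raise ValueError("non-egress LSR has no downstream neighbors")
    per_neighbor = []
    for neighbor, hop in hop_mtu.items():
        received = advertised.get(neighbor)
        if received is None:
            received = NO_LIMIT
        per_neighbor.append(min(hop, received))
    return min(per_neighbor)


def mtu_to_advertise(computed: int, policy_limit: Optional[int] = None) -> int:
    """Step 2: policy may lower the advertised MTU but never raise it."""
    if policy_limit is None:
        return computed
    return min(computed, policy_limit)


def should_readvertise(previous: int, current: int) -> bool:
    """Step 3: readvertise only when the computed LSP MTU changed."""
    return previous != current


# LSR B from the example in section 2.2: downstream neighbors C and D.
b_mtu = compute_lsp_mtu(False, {"C": 4466, "D": 1496}, {"C": 1496, "D": 4466})
print(b_mtu)  # 1496
```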

2.4. MTU TLV

   The MTU TLV encodes information on the maximum transmission unit for
   an LSP, from the advertising LSR to the egress(es) over all valid
   paths.

   The encoding for the MTU TLV is:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|1|      MTU TLV (0x0XXX)     |            Length             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              MTU              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   MTU

   This is a 16-bit unsigned integer that represents the MTU in octets
   for an LSP or segment of an LSP.

   Note that the U and F bits are set.  An LSR that doesn't recognize
   the MTU TLV MUST ignore it when it processes the Label Mapping
   message, and forward the TLV to its peers.  This may result in the
   incorrect computation of the LSP MTU; however, silently forwarding
   the MTU TLV preserves the maximal amount of information about the
   LSP MTU.
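
   The encoding above can be sketched as follows, assuming the usual
   LDP TLV layout of a 14-bit type preceded by the U and F bits and
   followed by a 16-bit length.  The TLV type has not been assigned
   (0x0XXX above), so it is passed as a parameter; the value 0x0123
   used in the usage lines is purely hypothetical, as are the function
   names.

```python
import struct

U_BIT = 0x8000          # unknown TLV bit, set for the MTU TLV
F_BIT = 0x4000          # forward unknown TLV bit, set for the MTU TLV
TYPE_MASK = 0x3FFF      # the TLV type occupies the remaining 14 bits
MTU_VALUE_LENGTH = 2    # the value is one 16-bit unsigned integer


def encode_mtu_tlv(tlv_type: int, mtu: int) -> bytes:
    """Build the 6-octet MTU TLV in network byte order."""
    if not 0 <= tlv_type <= TYPE_MASK:
        raise ValueError("TLV type must fit in 14 bits")
    if not 0 <= mtu <= 0xFFFF:
        raise ValueError("MTU must fit in 16 bits")
    return struct.pack("!HHH", U_BIT | F_BIT | tlv_type,
                       MTU_VALUE_LENGTH, mtu)


def decode_mtu_tlv(data: bytes) -> dict:
    """Parse an MTU TLV and return its U bit, F bit, type and MTU."""
    if len(data) < 6:
        raise ValueError("MTU TLV is 6 octets long")
    first, length, mtu = struct.unpack("!HHH", data[:6])
    if length != MTU_VALUE_LENGTH:
        raise ValueError("unexpected MTU TLV length")
    return {"u": bool(first & U_BIT), "f": bool(first & F_BIT),
            "type": first & TYPE_MASK, "mtu": mtu}


EXAMPLE_TYPE = 0x0123  # hypothetical; the real type is for IANA to allocate
wire = encode_mtu_tlv(EXAMPLE_TYPE, 1496)
print(wire.hex())  # c123000205d8
```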

3. Example of Operation

   Consider the example network in section 2.2.  The following table
   describes, for each LSR, the links to its downstream LSRs, the Hop
   MTU for the peer, the LSP MTU received from the peer, and the LSR's
   computed LSP MTU.

        LSR  |  Link  |  Hop MTU  |  Recvd MTU  |  LSP MTU
        --------------------------------------------------
         F   |    -   |    65535  |      -      |    65535
        --------------------------------------------------
         E   |    R   |     4466  |  F:  65535  |     4466
        --------------------------------------------------
         D   |    Q   |     4466  |  E:   4466  |     4466
        --------------------------------------------------
         C   |    P   |     1496  |  E:   4466  |     1496
        --------------------------------------------------
         B   |    M   |     4466  |  C:   1496  |
             |    N   |     1496  |  D:   4466  |     1496
        --------------------------------------------------
         A   |    L   |     9212  |  B:   1496  |     1496
        --------------------------------------------------

   Now consider the same network with the following changes: there is an
   LSP X from B to E, and a targeted LDP session from B to E.  B's peer
   LSRs are A, C, D and E; B's downstream LSRs are D and E; to reach E,
   B chooses to go over X.  The LSP MTU for LSP X is 1496.

        LSR  |  Link  |  Hop MTU  |  Recvd MTU  |  LSP MTU
        --------------------------------------------------
         F   |    -   |    65535  |      -      |    65535
        --------------------------------------------------
         E   |    R   |     4466  |  F:  65535  |     4466
        --------------------------------------------------
         D   |    Q   |     4466  |  E:   4466  |     4466
        --------------------------------------------------
         C   |    P   |     1496  |  E:   4466  |     1496
        --------------------------------------------------
         B   |    X   |     1492  |  E:   4466  |
             |    N   |     1496  |  D:   4466  |     1492
        --------------------------------------------------
         A   |    L   |     9212  |  B:   1492  |     1492
        --------------------------------------------------
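
   The LSP MTU column of both tables can be reproduced with a short
   sketch.  It assumes that every LSR includes the MTU TLV in its
   Mappings and that LSRs are processed in order from the egress
   towards the ingress, which stands in for the event-driven
   readvertisement of section 2.3.  The data layout and names are
   illustrative only.

```python
NO_LIMIT = 65535  # LSP MTU at the egress


def lsp_mtus(downstream, order):
    """Compute the LSP MTU of every LSR.

    downstream maps an LSR to {downstream neighbor: Hop MTU}; an empty
    mapping marks the egress.  order lists the LSRs egress first, so a
    neighbor's LSP MTU is known before it is needed.
    """
    result = {}
    for lsr in order:
        hops = downstream[lsr]
        if not hops:
            result[lsr] = NO_LIMIT
        else:
            result[lsr] = min(min(hop, result[z]) for z, hop in hops.items())
    return result


order = ["F", "E", "D", "C", "B", "A"]

# First table: B forwards to C (link M) and D (link N).
base = {
    "F": {},
    "E": {"F": 4466},
    "D": {"E": 4466},
    "C": {"E": 1496},
    "B": {"C": 4466, "D": 1496},
    "A": {"B": 9212},
}
first = lsp_mtus(base, order)

# Second table: B forwards to E over LSP X (Hop MTU 1492) and to D.
changed = dict(base, B={"E": 1492, "D": 1496})
second = lsp_mtus(changed, order)

print(first["B"], first["A"])    # 1496 1496
print(second["B"], second["A"])  # 1492 1492
```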

4. Using the MTU

   An ingress LSR that forwards an IP packet into an LSP whose MTU it
   knows MUST either fragment the IP packet to the LSP's MTU (if the
   Don't Fragment bit is clear) or drop the packet and respond with an
   ICMP Destination Unreachable message to the source of the packet,
   with the Code indicating "fragmentation needed and DF set", and the
   Next-Hop MTU set to the LSP MTU.  In other words, the LSR behaves as
   RFC 1191 says, except it treats the LSP as the next hop "network".

   If the payload for the LSP is not an IP packet, the LSR MUST forward
   the packet if it fits (size <= LSP MTU), and SHOULD drop it if it
   doesn't fit.
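
   The decision made by the ingress LSR can be sketched as follows.
   The enumeration and function names are hypothetical, and the sketch
   covers only the choice of action, not fragmentation or ICMP message
   construction.

```python
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    FORWARD = "forward into the LSP"
    FRAGMENT = "fragment to the LSP MTU, then forward"
    DROP_ICMP = "drop and send ICMP fragmentation needed and DF set"
    DROP = "drop"


def ingress_action(size: int, lsp_mtu: int, is_ip: bool,
                   dont_fragment: bool = False) -> Tuple[Action, Optional[int]]:
    """Return the action and, for DROP_ICMP, the Next-Hop MTU to report."""
    if size <= lsp_mtu:
        return Action.FORWARD, None
    if not is_ip:
        # Non-IP payload that does not fit SHOULD be dropped.
        return Action.DROP, None
    if dont_fragment:
        # The LSP is treated as the next hop "network" of RFC 1191.
        return Action.DROP_ICMP, lsp_mtu
    return Action.FRAGMENT, None


print(ingress_action(1500, 1496, is_ip=True, dont_fragment=True))
```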

5. Protocol Interaction

5.1. Interaction With LSRs Which Do Not Support MTU Signalling

   Changes in MTU for sections of an LSP may cause intermediate LSRs to
   generate unsolicited label Mapping messages to advertise the new MTU.
   LSRs which do not support MTU signalling MUST accept these messages,
   but MAY ignore them (see Section 2.4).

5.2. Interaction with CR-LDP and RSVP-TE

   The MTU TLV can be used to discover the Path MTU of both LDP LSPs and
   CR-LDP LSPs.  This proposal is not impacted in the presence of LSPs
   created using CR-LDP, as specified in [5].

   Note that LDP/CR-LDP LSPs may tunnel through other LSPs signalled
   using LDP, CR-LDP or RSVP-TE [6]; the mechanism suggested here
   applies in all these cases, essentially by treating the tunnel LSPs
   as links.

Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Andersson, L., Doolan, P., Feldman, N., Fredette, A. and B.
        Thomas, "LDP Specification", RFC 3036, January 2001.

   [3]  Mogul, J. and S. Deering, "Path MTU Discovery", RFC 1191,
        November 1990.

   [4]  Rosen, E., Tappan, D., Federkow, G., Rekhter, Y., Farinacci, D.,
        Li, T. and A. Conta, "MPLS Label Stack Encoding", RFC 3032,
        January 2001.

   [5]  Jamoussi, J., "Constraint-Based LSP Setup Using LDP", July 2000.

   [6]  Awduche, D., Berger, L. and D. Gan, "RSVP-TE: Extensions to RSVP
        for LSP Tunnels", February 2001.

Security Considerations

   This mechanism does not introduce any new weaknesses in LDP.  It is
   possible to spoof TCP packets belonging to an LDP session to
   manipulate the LSP MTU, but this sort of attack is not new to LDP.

IANA Considerations

   A new LDP TLV Type is defined in section 2.4.  A Type has to be
   allocated by IANA; a number from the range 0x0000 - 0x3DFF is
   requested.

Acknowledgments

   We would like to thank Andre Fredette for a number of detailed
   comments on earlier versions of the signalling mechanism.  Eric Gray
   and Giles Heron have contributed numerous useful suggestions.

Authors' Addresses

   Benjamin Black
   Layer8 Networks

   EMail: ben@layer8.net

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave
   Sunnyvale, CA  94089
   US

   EMail: kireeti@juniper.net

IPR Notice

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights.  Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11.  Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard.  Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.