draft-ietf-6man-rfc1981bis-06.txt   draft-ietf-6man-rfc1981bis-07.txt 
Network Working Group J. McCann Network Working Group J. McCann
Internet-Draft Digital Equipment Corporation Internet-Draft Digital Equipment Corporation
Obsoletes: 1981 (if approved) S. Deering Obsoletes: 1981 (if approved) S. Deering
Intended status: Standards Track Retired Intended status: Standards Track Retired
Expires: October 9, 2017 J. Mogul Expires: November 17, 2017 J. Mogul
Digital Equipment Corporation Digital Equipment Corporation
R. Hinden, Ed. R. Hinden, Ed.
Check Point Software Check Point Software
April 7, 2017 May 16, 2017
Path MTU Discovery for IP version 6 Path MTU Discovery for IP version 6
draft-ietf-6man-rfc1981bis-06 draft-ietf-6man-rfc1981bis-07
Abstract Abstract
This document describes Path MTU Discovery for IP version 6. It is This document describes Path MTU Discovery for IP version 6. It is
largely derived from RFC 1191, which describes Path MTU Discovery for largely derived from RFC 1191, which describes Path MTU Discovery for
IP version 4. It obsoletes RFC1981. IP version 4. It obsoletes RFC1981.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 9, 2017. This Internet-Draft will expire on November 17, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 39 skipping to change at page 2, line 39
5.3. Purging stale PMTU information . . . . . . . . . . . . . 10 5.3. Purging stale PMTU information . . . . . . . . . . . . . 10
5.4. Packetization layer actions . . . . . . . . . . . . . . . 11 5.4. Packetization layer actions . . . . . . . . . . . . . . . 11
5.5. Issues for other transport protocols . . . . . . . . . . 12 5.5. Issues for other transport protocols . . . . . . . . . . 12
5.6. Management interface . . . . . . . . . . . . . . . . . . 13 5.6. Management interface . . . . . . . . . . . . . . . . . . 13
6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14
9.1. Normative References . . . . . . . . . . . . . . . . . . 14 9.1. Normative References . . . . . . . . . . . . . . . . . . 14
9.2. Informative References . . . . . . . . . . . . . . . . . 14 9.2. Informative References . . . . . . . . . . . . . . . . . 14
Appendix A. Comparison to RFC 1191 . . . . . . . . . . . . . . . 15 Appendix A. Comparison to RFC 1191 . . . . . . . . . . . . . . . 16
Appendix B. Changes Since RFC 1981 . . . . . . . . . . . . . . . 16 Appendix B. Changes Since RFC 1981 . . . . . . . . . . . . . . . 16
B.1. Change History Since RFC1981 . . . . . . . . . . . . . . 17 B.1. Change History Since RFC1981 . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21
1. Introduction 1. Introduction
When one IPv6 node has a large amount of data to send to another When one IPv6 node has a large amount of data to send to another
node, the data is transmitted in a series of IPv6 packets. These node, the data is transmitted in a series of IPv6 packets. These
packets can have a size less than or equal to the Path MTU (PMTU). packets can have a size less than or equal to the Path MTU (PMTU).
Alternatively, they can be larger packets that are fragmented into a Alternatively, they can be larger packets that are fragmented into a
series of fragments each with a size less than or equal to the PMTU. series of fragments each with a size less than or equal to the PMTU.
It is usually preferable that these packets be of the largest size It is usually preferable that these packets be of the largest size
that can successfully traverse the path from the source node to the that can successfully traverse the path from the source node to the
destination node without the need for IPv6 fragmentation. This destination node without the need for IPv6 fragmentation. This
packet size is referred to as the Path MTU, and it is equal to the packet size is referred to as the Path MTU, and it is equal to the
minimum link MTU of all the links in a path. This document defines a minimum link MTU of all the links in a path. This document defines a
standard mechanism for a node to discover the PMTU of an arbitrary standard mechanism for a node to discover the PMTU of an arbitrary
path. path.
IPv6 nodes SHOULD implement Path MTU Discovery in order to discover IPv6 nodes should implement Path MTU Discovery in order to discover
and take advantage of paths with PMTU greater than the IPv6 minimum and take advantage of paths with PMTU greater than the IPv6 minimum
link MTU [I-D.ietf-6man-rfc2460bis]. A minimal IPv6 implementation link MTU [I-D.ietf-6man-rfc2460bis]. A minimal IPv6 implementation
(e.g., in a boot ROM) may choose to omit implementation of Path MTU (e.g., in a boot ROM) may choose to omit implementation of Path MTU
Discovery. Discovery.
Nodes not implementing Path MTU Discovery MUST use the IPv6 minimum Nodes not implementing Path MTU Discovery must use the IPv6 minimum
link MTU defined in [I-D.ietf-6man-rfc2460bis] as the maximum packet link MTU defined in [I-D.ietf-6man-rfc2460bis] as the maximum packet
size. In most cases, this will result in the use of smaller packets size. In most cases, this will result in the use of smaller packets
than necessary, because most paths have a PMTU greater than the IPv6 than necessary, because most paths have a PMTU greater than the IPv6
minimum link MTU. A node sending packets much smaller than the Path minimum link MTU. A node sending packets much smaller than the Path
MTU allows is wasting network resources and probably getting MTU allows is wasting network resources and probably getting
suboptimal throughput. suboptimal throughput.
Nodes implementing Path MTU Discovery and sending packets larger than Nodes implementing Path MTU Discovery and sending packets larger than
the IPv6 minimum link MTU are susceptible to problematic connectivity the IPv6 minimum link MTU are susceptible to problematic connectivity
if ICMPv6 [ICMPv6] messages are blocked or not transmitted. For if ICMPv6 [ICMPv6] messages are blocked or not transmitted. For
example, this will result in connections that complete the TCP three- example, this will result in connections that complete the TCP three-
way handshake correctly but then hang when data is transferred. This way handshake correctly but then hang when data is transferred. This
state is referred to as a black hole connection. Path MTU Discovery state is referred to as a black hole connection [RFC2923]. Path MTU
relies on such messages to determine the MTU of the path. Discovery relies on such messages to determine the MTU of the path.
An extension to Path MTU Discovery defined in this document can be An extension to Path MTU Discovery defined in this document can be
found in [RFC4821]. RFC4821 defines a method for Packetization Layer found in [RFC4821]. RFC4821 defines a method for Packetization Layer
Path MTU Discovery (PLPMTUD) designed for use over paths where Path MTU Discovery (PLPMTUD) designed for use over paths where
delivery of ICMPv6 messages to a host is not assured. delivery of ICMPv6 messages to a host is not assured.
Note: This document is an update to [RFC1981] that was published
prior to [RFC2119] being published. Consequently although RFC1981
used the "should/must" style language in upper and lower case, the
document does not cite the RFC2119 definitions and only uses lower
case for these words.
2. Terminology 2. Terminology
node a device that implements IPv6. node a device that implements IPv6.
router a node that forwards IPv6 packets not explicitly router a node that forwards IPv6 packets not explicitly
addressed to itself. addressed to itself.
host any node that is not a router. host any node that is not a router.
upper layer a protocol layer immediately above IPv6. upper layer a protocol layer immediately above IPv6.
skipping to change at page 4, line 43 skipping to change at page 4, line 50
path MTU the minimum link MTU of all the links in a path path MTU the minimum link MTU of all the links in a path
between a source node and a destination node. between a source node and a destination node.
PMTU path MTU PMTU path MTU
Path MTU Discovery process by which a node learns the PMTU of a path Path MTU Discovery process by which a node learns the PMTU of a path
EMTU_S Effective MTU for sending, used by upper layer EMTU_S Effective MTU for sending, used by upper layer
protocols to limit the size of IP packets they protocols to limit the size of IP packets they
queue for sending [RFC6691]. queue for sending [RFC6691] [RFC1122].
EMTU_R Effective MTU for receiving, the largest packet EMTU_R Effective MTU for receiving, the largest packet
that can be reassembled at the receiver. that can be reassembled at the receiver
[RFC1122].
flow a sequence of packets sent from a particular flow a sequence of packets sent from a particular
source to a particular (unicast or multicast) source to a particular (unicast or multicast)
destination for which the source desires special destination for which the source desires special
handling by the intervening routers. handling by the intervening routers.
flow id a combination of a source address and a non-zero flow id a combination of a source address and a non-zero
flow label. flow label.
3. Protocol Overview 3. Protocol Overview
This memo describes a technique to dynamically discover the PMTU of a This memo describes a technique to dynamically discover the PMTU of a
path. The basic idea is that a source node initially assumes that path. The basic idea is that a source node initially assumes that
the PMTU of a path is the (known) MTU of the first hop in the path. the PMTU of a path is the (known) MTU of the first hop in the path.
If any of the packets sent on that path are too large to be forwarded If any of the packets sent on that path are too large to be forwarded
by some node along the path, that node will discard them and return by some node along the path, that node will discard them and return
ICMPv6 Packet Too Big messages. Upon receipt of such a message, the ICMPv6 Packet Too Big (PTB) messages. Upon receipt of such a
source node reduces its assumed PMTU for the path based on the MTU of message, the source node reduces its assumed PMTU for the path based
the constricting hop as reported in the Packet Too Big message. The on the MTU of the constricting hop as reported in the Packet Too Big
decreased PMTU causes the source to send smaller fragments or change message. The decreased PMTU causes the source to send smaller
EMTU_S to cause the upper layer to reduce the size of IP packets it packets or change EMTU_S to cause the upper layer to reduce the size of
sends. IP packets it sends.
The Path MTU Discovery process ends when the node's estimate of the The Path MTU Discovery process ends when the source node's estimate
PMTU is less than or equal to the actual PMTU. Note that several of the PMTU is less than or equal to the actual PMTU. Note that
iterations of the packet-sent/Packet-Too-Big-message-received cycle several iterations of the packet-sent/Packet-Too-Big-message-received
may occur before the Path MTU Discovery process ends, as there may be cycle may occur before the Path MTU Discovery process ends, as there
links with smaller MTUs further along the path. may be links with smaller MTUs further along the path.
Alternatively, the node may elect to end the discovery process by Alternatively, the node may elect to end the discovery process by
ceasing to send packets larger than the IPv6 minimum link MTU. ceasing to send packets larger than the IPv6 minimum link MTU.
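
The cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the path is modeled as a list of link MTUs, every constricting hop is assumed to return a Packet Too Big message that reaches the source, and the name discover_pmtu is hypothetical, not taken from the draft.

```python
IPV6_MIN_LINK_MTU = 1280  # IPv6 minimum link MTU, in octets


def discover_pmtu(link_mtus):
    """Simulate the packet-sent / Packet-Too-Big-received cycle.

    link_mtus lists the MTU of each link from source to destination
    (a hypothetical model of the path). Returns a tuple of
    (final PMTU estimate, number of Packet Too Big messages received).
    """
    if not link_mtus or min(link_mtus) < IPV6_MIN_LINK_MTU:
        raise ValueError("every IPv6 link must have an MTU of at least 1280")

    estimate = link_mtus[0]  # initial assumption: MTU of the first hop
    ptb_count = 0
    while True:
        # The first link too small for a packet of size `estimate`
        # discards the packet and reports its own MTU.
        constricting = next((m for m in link_mtus if m < estimate), None)
        if constricting is None:
            # The estimate is now <= the actual PMTU, so discovery ends.
            return estimate, ptb_count
        ptb_count += 1
        estimate = constricting
```

For a path such as [1500, 1500, 1400, 1480, 1300], two iterations are needed, because the 1300-octet link lies beyond the 1400-octet link and is only discovered after the first reduction.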
The PMTU of a path may change over time, due to changes in the The PMTU of a path may change over time, due to changes in the
routing topology. Reductions of the PMTU are detected by Packet Too routing topology. Reductions of the PMTU are detected by Packet Too
Big messages. To detect increases in a path's PMTU, a node Big messages. To detect increases in a path's PMTU, a node
periodically increases its assumed PMTU. This will almost always periodically increases its assumed PMTU. This will almost always
result in packets being discarded and Packet Too Big messages being result in packets being discarded and Packet Too Big messages being
generated, because in most cases the PMTU of the path will not have generated, because in most cases the PMTU of the path will not have
skipping to change at page 6, line 4 skipping to change at page 6, line 12
packet may traverse many different paths to many different nodes. packet may traverse many different paths to many different nodes.
Each path may have a different PMTU, and a single multicast packet Each path may have a different PMTU, and a single multicast packet
may result in multiple Packet Too Big messages, each reporting a may result in multiple Packet Too Big messages, each reporting a
different next-hop MTU. The minimum PMTU value across the set of different next-hop MTU. The minimum PMTU value across the set of
paths in use determines the size of subsequent packets sent to the paths in use determines the size of subsequent packets sent to the
multicast destination. multicast destination.
Note that Path MTU Discovery must be performed even in cases where a Note that Path MTU Discovery must be performed even in cases where a
node "thinks" a destination is attached to the same link as itself. node "thinks" a destination is attached to the same link as itself.
In a situation such as when a neighboring router acts as proxy [ND] In a situation such as when a neighboring router acts as proxy [ND]
for some destination, the destination can to appear to be directly for some destination, the destination can appear to be directly
connected but is in fact more than one hop away. connected but it is in fact more than one hop away.
4. Protocol Requirements 4. Protocol Requirements
As discussed in Section 1, IPv6 nodes are not required to implement As discussed in Section 1, IPv6 nodes are not required to implement
Path MTU Discovery. The requirements in this section apply only to Path MTU Discovery. The requirements in this section apply only to
those implementations that include Path MTU Discovery. those implementations that include Path MTU Discovery.
Nodes SHOULD appropriately validate the payload of ICMPv6 PTB Nodes should appropriately validate the payload of ICMPv6 PTB
messages to ensure these are received in response to transmitted messages to ensure these are received in response to transmitted
traffic (i.e., a reported error condition that corresponds to an IPv6 traffic (i.e., a reported error condition that corresponds to an IPv6
packet actually sent by the application) per [ICMPv6]. packet actually sent by the application) per [ICMPv6].
If a node receives a Packet Too Big message reporting a next-hop MTU If a node receives a Packet Too Big message reporting a next-hop MTU
that is less than the IPv6 minimum link MTU, it MUST discard it. A that is less than the IPv6 minimum link MTU, it must discard it. A
node MUST NOT reduce its estimate of the Path MTU below the IPv6 node must not reduce its estimate of the Path MTU below the IPv6
minimum link MTU. minimum link MTU on receipt of a Packet Too Big message.
When a node receives a Packet Too Big message, it MUST reduce its When a node receives a Packet Too Big message, it must reduce its
estimate of the PMTU for the relevant path, based on the value of the estimate of the PMTU for the relevant path, based on the value of the
MTU field in the message. The precise behavior of a node in this MTU field in the message. The precise behavior of a node in this
circumstance is not specified, since different applications may have circumstance is not specified, since different applications may have
different requirements, and since different implementation different requirements, and since different implementation
architectures may favor different strategies. architectures may favor different strategies.
After receiving a Packet Too Big message, a node MUST attempt to After receiving a Packet Too Big message, a node must attempt to
avoid eliciting more such messages in the near future. The node MUST avoid eliciting more such messages in the near future. The node must
reduce the size of the packets it is sending along the path. Using a reduce the size of the packets it is sending along the path. Using a
PMTU estimate larger than the IPv6 minimum link MTU may continue to PMTU estimate larger than the IPv6 minimum link MTU may continue to
elicit Packet Too Big messages. Since each of these messages (and elicit Packet Too Big messages. Because each of these messages (and
the dropped packets they respond to) consume network resources, the the dropped packets they respond to) consume network resources, nodes
node MUST force the Path MTU Discovery process to end. using Path MTU Discovery must detect decreases in PMTU as fast as
possible.
Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast Nodes may detect increases in PMTU, but because doing so requires
as possible. Nodes MAY detect increases in PMTU, but because doing sending packets larger than the current estimated PMTU, and because
so requires sending packets larger than the current estimated PMTU, the likelihood is that the PMTU will not have increased, this must be
and because the likelihood is that the PMTU will not have increased, done at infrequent intervals. An attempt to detect an increase (by
this MUST be done at infrequent intervals. An attempt to detect an sending a packet larger than the current estimate) must not be done
increase (by sending a packet larger than the current estimate) MUST less than 5 minutes after a Packet Too Big message has been received
NOT be done less than 5 minutes after a Packet Too Big message has for the given path. The recommended setting for this timer is twice
been received for the given path. The recommended setting for this its minimum value (10 minutes).
timer is twice its minimum value (10 minutes).
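
The timer rule above might be checked as in the following sketch. The function name, the use of seconds, and the convention that None means "no Packet Too Big message received yet" are assumptions made for illustration only.

```python
MIN_PROBE_DELAY = 5 * 60           # minimum: 5 minutes, in seconds
RECOMMENDED_PROBE_DELAY = 10 * 60  # recommended: twice the minimum


def may_probe_for_increase(now, last_ptb_time, delay=RECOMMENDED_PROBE_DELAY):
    """Return True if a packet larger than the current estimate may be sent.

    now and last_ptb_time are timestamps in seconds; last_ptb_time is None
    when no Packet Too Big message has been received for the path
    (an assumed convention for this sketch).
    """
    if delay < MIN_PROBE_DELAY:
        raise ValueError("probe delay must not be shorter than 5 minutes")
    if last_ptb_time is None:
        return True
    return now - last_ptb_time >= delay
```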
A node MUST NOT increase its estimate of the Path MTU in response to A node must not increase its estimate of the Path MTU in response to
the contents of a Packet Too Big message. A message purporting to the contents of a Packet Too Big message. A message purporting to
announce an increase in the Path MTU might be a stale packet that has announce an increase in the Path MTU might be a stale packet that has
been floating around in the network, a false packet injected as part been floating around in the network, a false packet injected as part
of a denial-of-service attack, or the result of having multiple paths of a denial-of-service attack, or the result of having multiple paths
to the destination, each with a different PMTU. to the destination, each with a different PMTU.
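
Taken together, the rules in this section for updating the estimate could be sketched as below. This is an illustration, not a specified algorithm: the draft leaves the precise behavior open, the function name is hypothetical, and validation of the ICMPv6 payload is assumed to have happened before this function is called.

```python
IPV6_MIN_LINK_MTU = 1280  # IPv6 minimum link MTU, in octets


def apply_packet_too_big(current_pmtu, reported_mtu):
    """Return the new PMTU estimate after a validated Packet Too Big message.

    current_pmtu is the node's present estimate for the path and
    reported_mtu is the value of the MTU field in the message.
    """
    if reported_mtu < IPV6_MIN_LINK_MTU:
        # A message reporting less than the minimum link MTU is discarded.
        return current_pmtu
    if reported_mtu >= current_pmtu:
        # The estimate is never increased in response to a PTB message.
        return current_pmtu
    return reported_mtu
```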
5. Implementation Issues 5. Implementation Issues
This section discusses a number of issues related to the This section discusses a number of issues related to the
implementation of Path MTU Discovery. This is not a specification, implementation of Path MTU Discovery. This is not a specification,
skipping to change at page 7, line 44 skipping to change at page 8, line 4
protocol, it becomes hard to share PMTU information between different protocol, it becomes hard to share PMTU information between different
packetization layers, and the connection-oriented state maintained by packetization layers, and the connection-oriented state maintained by
some packetization layers may not easily extend to save PMTU some packetization layers may not easily extend to save PMTU
information for long periods. information for long periods.
It is therefore suggested that the IP layer store PMTU information It is therefore suggested that the IP layer store PMTU information
and that the ICMPv6 layer process received Packet Too Big messages. and that the ICMPv6 layer process received Packet Too Big messages.
The packetization layers may respond to changes in the PMTU by The packetization layers may respond to changes in the PMTU by
changing the size of the messages they send. To support this changing the size of the messages they send. To support this
layering, packetization layers require a way to learn of changes in layering, packetization layers require a way to learn of changes in
the value of MMS_S, the "maximum send transport-message size". the value of MMS_S, the "maximum send transport-message size"
[RFC1122].
MMS_S is a transport message size calculated by subtracting the size MMS_S is a transport message size calculated by subtracting the size
of the IPv6 header (including IPv6 extension headers) from the of the IPv6 header (including IPv6 extension headers) from the
largest IP packet that can be sent, EMTU_S. MMS_S is limited by a largest IP packet that can be sent, EMTU_S. MMS_S is limited by a
combination of factors, including the PMTU, support for packet combination of factors, including the PMTU, support for packet
fragmentation and reassembly, and the packet reassembly limit (see fragmentation and reassembly, and the packet reassembly limit (see
[I-D.ietf-6man-rfc2460bis] section "Fragment Header"). When source [I-D.ietf-6man-rfc2460bis] section "Fragment Header"). When source
fragmentation is available, EMTU_S is set to EMTU_R, as indicated by fragmentation is available, EMTU_S is set to EMTU_R, as indicated by
the receiver using an upper layer protocol or based on protocol the receiver using an upper layer protocol or based on protocol
requirements (1500 octets for IPv6). When a message larger than PMTU requirements (1500 octets for IPv6). When a message larger than PMTU
skipping to change at page 8, line 25 skipping to change at page 8, line 34
fragmentation, see [FRAG]). fragmentation, see [FRAG]).
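
As a worked example of the MMS_S definition above, the subtraction might look like the following sketch. The function and parameter names are hypothetical; the 40-octet figure is the size of the fixed IPv6 header.

```python
IPV6_FIXED_HEADER_OCTETS = 40  # size of the fixed IPv6 header


def mms_s(emtu_s, extension_header_octets=0):
    """Return MMS_S: EMTU_S minus the IPv6 header and any extension headers."""
    return emtu_s - IPV6_FIXED_HEADER_OCTETS - extension_header_octets
```

With EMTU_S equal to 1500 octets and no extension headers, this gives an MMS_S of 1460 octets; with EMTU_S equal to the minimum link MTU of 1280 octets and an 8-octet extension header, it gives 1232 octets.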
5.2. Storing PMTU information 5.2. Storing PMTU information
Ideally, a PMTU value should be associated with a specific path Ideally, a PMTU value should be associated with a specific path
traversed by packets exchanged between the source and destination traversed by packets exchanged between the source and destination
nodes. However, in most cases a node will not have enough nodes. However, in most cases a node will not have enough
information to completely and accurately identify such a path. information to completely and accurately identify such a path.
Rather, a node must associate a PMTU value with some local Rather, a node must associate a PMTU value with some local
representation of a path. It is left to the implementation to select representation of a path. It is left to the implementation to select
the local representation of a path. the local representation of a path. For nodes with multiple
interfaces, Path MTU information should be maintained for each IPv6
link.
In the case of a multicast destination address, copies of a packet In the case of a multicast destination address, copies of a packet
may traverse many different paths to reach many different nodes. The may traverse many different paths to reach many different nodes. The
local representation of the "path" to a multicast destination must local representation of the "path" to a multicast destination must
represent a potentially large set of paths. represent a potentially large set of paths.
Minimally, an implementation could maintain a single PMTU value to be Minimally, an implementation could maintain a single PMTU value to be
used for all packets originated from the node. This PMTU value would used for all packets originated from the node. This PMTU value would
be the minimum PMTU learned across the set of all paths in use by the be the minimum PMTU learned across the set of all paths in use by the
node. This approach is likely to result in the use of smaller node. This approach is likely to result in the use of smaller
skipping to change at page 10, line 24 skipping to change at page 10, line 35
normal timeout-based retransmission mechanisms would be used to normal timeout-based retransmission mechanisms would be used to
recover from the dropped packets. recover from the dropped packets.
It is important to understand that the notification of the It is important to understand that the notification of the
packetization layer instances using the path about the change in the packetization layer instances using the path about the change in the
PMTU is distinct from the notification of a specific instance that a PMTU is distinct from the notification of a specific instance that a
packet has been dropped. The latter should be done as soon as packet has been dropped. The latter should be done as soon as
practical (i.e., asynchronously from the point of view of the practical (i.e., asynchronously from the point of view of the
packetization layer instance), while the former may be delayed until packetization layer instance), while the former may be delayed until
a packetization layer instance wants to create a packet. a packetization layer instance wants to create a packet.
Retransmission should be done only for those packets that are
known to be dropped, as indicated by a Packet Too Big message.
5.3. Purging stale PMTU information 5.3. Purging stale PMTU information
Internetwork topology is dynamic; routes change over time. While the Internetwork topology is dynamic; routes change over time. While the
local representation of a path may remain constant, the actual local representation of a path may remain constant, the actual
path(s) in use may change. Thus, PMTU information cached by a node path(s) in use may change. Thus, PMTU information cached by a node
can become stale. can become stale.
If the stale PMTU value is too large, this will be discovered almost If the stale PMTU value is too large, this will be discovered almost
immediately once a large enough packet is sent on the path. No such immediately once a large enough packet is sent on the path. No such
mechanism exists for realizing that a stale PMTU value is too small, mechanism exists for realizing that a stale PMTU value is too small,
so an implementation SHOULD "age" cached values. When a PMTU value so an implementation should "age" cached values. When a PMTU value
has not been decreased for a while (on the order of 10 minutes), the has not been decreased for a while (on the order of 10 minutes), the
PMTU estimate should be set to the MTU of the first-hop link, and the PMTU estimate should be set to the MTU of the first-hop link, and the
packetization layers should be notified of the change. This will packetization layers should be notified of the change. This will
cause the complete Path MTU Discovery process to take place again. cause the complete Path MTU Discovery process to take place again.
Note: an implementation should provide a means for changing the Note: an implementation should provide a means for changing the
timeout duration, including setting it to "infinity". For timeout duration, including setting it to "infinity". For
example, nodes attached to an FDDI link which is then attached to example, nodes attached to an FDDI link which is then attached to
the rest of the Internet via a small MTU serial line are never the rest of the Internet via a small MTU serial line are never
going to discover a new non-local PMTU, so they should not have to going to discover a new non-local PMTU, so they should not have to
put up with dropped packets every 10 minutes. put up with dropped packets every 10 minutes.
An upper layer must not retransmit data in response to an increase in
the PMTU estimate, since this increase never comes in response to an
indication of a dropped packet.
One approach to implementing PMTU aging is to associate a timestamp One approach to implementing PMTU aging is to associate a timestamp
field with a PMTU value. This field is initialized to a "reserved" field with a PMTU value. This field is initialized to a "reserved"
value, indicating that the PMTU is equal to the MTU of the first hop value, indicating that the PMTU is equal to the MTU of the first hop
link. Whenever the PMTU is decreased in response to a Packet Too Big link. Whenever the PMTU is decreased in response to a Packet Too Big
message, the timestamp is set to the current time. message, the timestamp is set to the current time.
Once a minute, a timer-driven procedure runs through all cached PMTU Once a minute, a timer-driven procedure runs through all cached PMTU
values, and for each PMTU whose timestamp is not "reserved" and is values, and for each PMTU whose timestamp is not "reserved" and is
older than the timeout interval: older than the timeout interval:
- The PMTU estimate is set to the MTU of the first hop link. - The PMTU estimate is set to the MTU of the first hop link.
- The timestamp is set to the "reserved" value. - The timestamp is set to the "reserved" value.
- Packetization layers using this path are notified of the increase. - Packetization layers using this path are notified of the increase.
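
The timestamp approach above could be sketched as follows. The data layout is an assumption for illustration: None stands in for the "reserved" timestamp value, a timeout of None stands in for "infinity", and the cache is a plain dictionary keyed by whatever local representation of a path the implementation has chosen.

```python
from dataclasses import dataclass
from typing import Optional

PMTU_TIMEOUT = 10 * 60  # seconds; "on the order of 10 minutes"


@dataclass
class PmtuEntry:
    first_hop_mtu: int
    pmtu: int
    decreased_at: Optional[float] = None  # None plays the "reserved" role


def record_decrease(entry, new_pmtu, now):
    """Store a PMTU lowered by a Packet Too Big message and stamp the time."""
    entry.pmtu = new_pmtu
    entry.decreased_at = now


def age_entries(cache, now, timeout=PMTU_TIMEOUT, notify=None):
    """Timer-driven pass over all cached PMTU values.

    notify, if given, is called as notify(key, new_pmtu) for each entry
    that is reset, standing in for the packetization layer notification.
    """
    if timeout is None:  # timeout set to "infinity": never age entries
        return
    for key, entry in cache.items():
        if entry.decreased_at is None:
            continue  # PMTU already equals the first-hop MTU
        if now - entry.decreased_at > timeout:
            entry.pmtu = entry.first_hop_mtu
            entry.decreased_at = None
            if notify is not None:
                notify(key, entry.pmtu)
```

In this sketch the once-a-minute timer would simply call age_entries with the current time.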
5.4. Packetization layer actions 5.4. Packetization layer actions
A packetization layer (e.g., TCP) must track the PMTU for the path(s) A packetization layer (e.g., TCP) must use the PMTU for the path(s)
in use by a connection; it should not send segments that would result in use by a connection; it should not send segments that would result
in packets larger than the PMTU, except to probe during PMTU in packets larger than the PMTU, except to probe during PMTU
discovery (this probe packet must not be fragmented to the PMTU). A discovery (this probe packet must not be fragmented to the PMTU). A
simple implementation could ask the IP layer for this value each time simple implementation could ask the IP layer for this value each time
it created a new segment, but this could be inefficient. An it created a new segment, but this could be inefficient. An
implementation typically caches other values derived from the PMTU. implementation typically caches other values derived from the PMTU.
It may be simpler to receive asynchronous notification when the PMTU It may be simpler to receive asynchronous notification when the PMTU
changes, so that these variables may also be updated. changes, so that these variables may also be updated.
A TCP implementation must also store the Maximum Segment Size (MSS) A TCP implementation must also store the Maximum Segment Size (MSS)
skipping to change at page 11, line 45 skipping to change at page 12, line 5
largest packet that can be reassembled by the receiver, and must not largest packet that can be reassembled by the receiver, and must not
send any segment larger than this MSS, regardless of the PMTU. send any segment larger than this MSS, regardless of the PMTU.
The value sent in the TCP MSS option is independent of the PMTU; it The value sent in the TCP MSS option is independent of the PMTU; it
is determined by the receiver reassembly limit EMTU_R. This MSS is determined by the receiver reassembly limit EMTU_R. This MSS
option value is used by the other end of the connection, which may be option value is used by the other end of the connection, which may be
using an unrelated PMTU value. See [I-D.ietf-6man-rfc2460bis] using an unrelated PMTU value. See [I-D.ietf-6man-rfc2460bis]
sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size" sections "Packet Size Issues" and "Maximum Upper-Layer Payload Size"
for information on selecting a value for the TCP MSS option. for information on selecting a value for the TCP MSS option.
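As an illustration of how the two limits combine, the sketch below computes the largest TCP payload a sender might use for one segment. It is a simplified example under stated assumptions: the function name and parameters are hypothetical, the 40-octet IPv6 header and 20-octet TCP header are the fixed header sizes, and any extension headers or TCP options are supplied by the caller.

```python
IPV6_HEADER_LEN = 40  # fixed IPv6 header, in octets
TCP_HEADER_LEN = 20   # TCP header without options, in octets


def max_segment_payload(pmtu: int, peer_mss: int,
                        tcp_options_len: int = 0,
                        ext_headers_len: int = 0) -> int:
    """Largest TCP payload that respects both the PMTU and the peer's MSS.

    peer_mss is the value received in the peer's TCP MSS option; it caps
    the segment regardless of how large the PMTU is.
    """
    # Room left for TCP payload in a packet that just fits the PMTU.
    pmtu_limit = pmtu - IPV6_HEADER_LEN - ext_headers_len - TCP_HEADER_LEN
    # The smaller limit wins; TCP options then reduce the usable payload.
    return min(pmtu_limit, peer_mss) - tcp_options_len
```

With a PMTU of 1280 and a peer MSS of 1440, the PMTU is the binding limit; with a PMTU of 9000 and the same peer MSS, the MSS is.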
When a Packet Too Big message is received, it implies that a packet Reception of a Packet Too Big message implies that a packet was
was dropped by the node that sent the ICMPv6 message. It is dropped by the node that sent the ICMPv6 message. A reliable upper
sufficient to treat this in the same way as any other dropped layer protocol will detect this loss by its own means, and recover it
segment, and will be recovered by normal retransmission methods. If by its normal retransmission methods. The retransmission could
the Path MTU Discovery process requires several steps to find the result in delay, depending on the loss detection method used by the
PMTU of the full path, this could delay the connection by many round- upper layer protocol. If the Path MTU Discovery process requires
trip times. several steps to find the PMTU of the full path, this could finally
delay the retransmission by many round-trip times.
Alternatively, the retransmission could be done in immediate response Alternatively, the retransmission could be done in immediate response
to a notification that the Path MTU has changed, but only for the to a notification that the Path MTU was decreased, but only for the
specific connection specified by the Packet Too Big message. The specific connection specified by the Packet Too Big message, but only
packet size used in the retransmission should be no larger than the based on the message and connection. The packet size used in the
new PMTU. retransmission should be no larger than the new PMTU.
Note: A packetization layer must not retransmit in response to Note: A packetization layer must not retransmit in response to
every Packet Too Big message, since a burst of several oversized every Packet Too Big message, since a burst of several oversized
segments will give rise to several such messages and hence several segments will give rise to several such messages and hence several
retransmissions of the same data. If the new estimated PMTU is retransmissions of the same data. If the new estimated PMTU is
still wrong, the process repeats, and there is an exponential still wrong, the process repeats, and there is an exponential
growth in the number of superfluous segments sent. growth in the number of superfluous segments sent.
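One way to avoid retransmitting once per Packet Too Big message is to act only on messages that actually lower the PMTU the connection is using. The sketch below is a hypothetical illustration: the Connection class and its fields are invented for the example, and it assumes the message has already been validated and matched to this connection.

```python
class Connection:
    """Minimal per-connection PMTU state (hypothetical)."""

    def __init__(self, pmtu: int) -> None:
        self.pmtu = pmtu
        self.retransmits = 0

    def on_packet_too_big(self, reported_mtu: int) -> bool:
        """Return True if this message triggered a retransmission."""
        if reported_mtu >= self.pmtu:
            # Duplicate or stale report: the PMTU was already lowered,
            # so retransmitting again would only resend the same data.
            return False
        self.pmtu = reported_mtu
        self.retransmits += 1  # stand-in for the actual retransmission
        return True
```

A burst of oversized segments that produces several messages reporting the same MTU then causes a single retransmission, because only the first message lowers the stored PMTU.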
Retransmissions can increase network load in response to Retransmissions can increase network load in response to
congestion, worsening that congestion. Any packetization layer congestion, worsening that congestion. Any packetization layer
that uses retransmission is responsible for congestion control of that uses retransmission is responsible for congestion control of
its retransmissions. See [RFC8085] for more information. its retransmissions. See [RFC8085] for more information.
This means that the TCP layer must be able to recognize when a A loss caused by a PMTU probe indicated by the reception of a Packet
Packet Too Big notification actually decreases the PMTU that it Too Big message must not be considered as a congestion notification
has already used to send a packet on the given connection, and and hence the congestion window may not change.
should ignore any other notifications.
Many TCP implementations incorporate "congestion avoidance" and
"slow-start" algorithms to improve performance [CONG]. Unlike a
retransmission caused by a TCP retransmission timeout, a
retransmission caused by a Packet Too Big message should not change
the congestion window. It should, however, trigger the slow-start
mechanism (i.e., only one segment should be retransmitted until
acknowledgements begin to arrive again).
TCP performance can be reduced if the sender's maximum window size is
not an exact multiple of the segment size in use (this is not the
congestion window size).
5.5. Issues for other transport protocols 5.5. Issues for other transport protocols
Some transport protocols are not allowed to repacketize when doing a Some transport protocols are not allowed to repacketize when doing a
retransmission. That is, once an attempt is made to transmit a retransmission. That is, once an attempt is made to transmit a
segment of a certain size, the transport cannot split the contents of segment of a certain size, the transport cannot split the contents of
the segment into smaller segments for retransmission. In such a the segment into smaller segments for retransmission. In such a
case, the original segment can be fragmented by the IP layer during case, the original segment can be fragmented by the IP layer during
retransmission. Subsequent segments, when transmitted for the first retransmission. Subsequent segments, when transmitted for the first
time, should be no larger than allowed by the Path MTU. time, should be no larger than allowed by the Path MTU.
Path MTU Discovery for IPv4 [RFC1191] used NFS as an example of a Path MTU Discovery for IPv4 [RFC1191] used NFS as an example of a
UDP-based application that benefits from PMTU discovery. Since then, UDP-based application that benefits from PMTU discovery. Since then,
[RFC7530] states that the supported transport layer between NFS and IP [RFC7530] states that the supported transport layer between NFS and IP
must be an IETF standardized transport protocol that is specified to must be an IETF standardized transport protocol that is specified to
avoid network congestion; such transports include TCP and the Stream avoid network congestion; such transports include TCP, Stream Control
Control Transmission Protocol (SCTP). In this case, the transport is Transmission Protocol (SCTP) [RFC4960], and the Datagram Congestion
itself responsible for determining and using an effective Path MTU, Control Protocol (DCCP) [RFC4340]. In this case, the transport is
including implementing PMTU discovery when this is needed. responsible for ensuring that transmitted segments (except probes)
conform to the Path MTU, including supporting PMTU discovery
probe transmissions as needed.
5.6. Management interface 5.6. Management interface
It is suggested that an implementation provide a way for a system It is suggested that an implementation provide a way for a system
utility program to: utility program to:
- Specify that Path MTU Discovery not be done on a given path. - Specify that Path MTU Discovery not be done on a given path.
- Change the PMTU value associated with a given path. - Change the PMTU value associated with a given path.
skipping to change at page 13, line 47 skipping to change at page 13, line 45
set its PMTU estimate below the IPv6 minimum link MTU. A sender set its PMTU estimate below the IPv6 minimum link MTU. A sender
that falsely reduces to this MTU would observe suboptimal that falsely reduces to this MTU would observe suboptimal
performance. performance.
In the second attack, the false message indicates a PMTU larger In the second attack, the false message indicates a PMTU larger
than reality. If believed, this could cause temporary blockage as than reality. If believed, this could cause temporary blockage as
the victim sends packets that will be dropped by some router. the victim sends packets that will be dropped by some router.
Within one round-trip time, the node would discover its mistake Within one round-trip time, the node would discover its mistake
(receiving Packet Too Big messages from that router), but frequent (receiving Packet Too Big messages from that router), but frequent
repetition of this attack could cause lots of packets to be repetition of this attack could cause lots of packets to be
dropped. A node, however, should never raise its estimate of the dropped. A node, however, must not raise its estimate of the PMTU
PMTU based on a Packet Too Big message, so should not be based on a Packet Too Big message, so should not be vulnerable to
vulnerable to this attack. this attack.
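The two defenses described above (ignoring a reported MTU below the IPv6 minimum link MTU, and never raising the estimate from a Packet Too Big message) can be sketched as a single update rule. The function name is hypothetical; 1280 octets is the IPv6 minimum link MTU.

```python
IPV6_MIN_LINK_MTU = 1280  # octets


def apply_packet_too_big(current_pmtu: int, reported_mtu: int) -> int:
    """Return the PMTU estimate after processing one Packet Too Big message."""
    if reported_mtu < IPV6_MIN_LINK_MTU:
        # First attack: a falsely small MTU. Discard the message.
        return current_pmtu
    if reported_mtu >= current_pmtu:
        # Second attack: a falsely large MTU. Never raise the estimate
        # on the strength of a Packet Too Big message.
        return current_pmtu
    return reported_mtu
```

This sketch only bounds the damage a forged message can do; it does not verify that the message came from a router on the path.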
Both of these attacks can cause a black hole connection, that is, the
TCP three-way handshake completes correctly but the connection hangs
when data is transferred.
A malicious party could also cause problems if it could stop a victim A malicious party could also cause problems if it could stop a victim
from receiving legitimate Packet Too Big messages, but in this case from receiving legitimate Packet Too Big messages, but in this case
there are simpler denial-of-service attacks available. there are simpler denial-of-service attacks available.
If ICMPv6 filtering prevents reception of ICMPv6 Packet Too Big If ICMPv6 filtering prevents reception of ICMPv6 Packet Too Big
messages, the source will not learn the actual path MTU. messages, the source will not learn the actual path MTU.
Packetization Layer Path MTU Discovery [RFC4821] does not rely upon Packetization Layer Path MTU Discovery [RFC4821] does not rely upon
network support for ICMPv6 messages and is therefore considered more network support for ICMPv6 messages and is therefore considered more
robust than standard PMTUD. It is not susceptible to "black holing" robust than standard PMTUD. It is not susceptible to "black holed"
of ICMPv6 messages. See [RFC4890] for recommendations regarding connections caused by filtering of ICMPv6 messages. See [RFC4890] for
filtering ICMPv6 messages. recommendations regarding filtering ICMPv6 messages.
7. Acknowledgements 7. Acknowledgements
We would like to acknowledge the authors of and contributors to We would like to acknowledge the authors of and contributors to
[RFC1191], from which the majority of this document was derived. We [RFC1191], from which the majority of this document was derived. We
would also like to acknowledge the members of the IPng working group would also like to acknowledge the members of the IPng working group
for their careful review and constructive criticisms. for their careful review and constructive criticisms.
8. IANA Considerations 8. IANA Considerations
This document does not have any IANA actions. This document does not have any IANA actions.
9. References 9. References
9.1. Normative References 9.1. Normative References
[I-D.ietf-6man-rfc2460bis] [I-D.ietf-6man-rfc2460bis]
Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6)
Specification", draft-ietf-6man-rfc2460bis-09 (work in Specification", draft-ietf-6man-rfc2460bis-11 (work in
progress), March 2017. progress), April 2017.
[ICMPv6] Conta, A., Deering, S., and M. Gupta, Ed., "Internet [ICMPv6] Conta, A., Deering, S., and M. Gupta, Ed., "Internet
Control Message Protocol (ICMPv6) for the Internet Control Message Protocol (ICMPv6) for the Internet
Protocol Version 6 (IPv6) Specification", RFC 4443, DOI Protocol Version 6 (IPv6) Specification", RFC 4443, DOI
10.17487/RFC4443, March 2006, 10.17487/RFC4443, March 2006,
<http://www.rfc-editor.org/info/rfc4443>. <http://www.rfc-editor.org/info/rfc4443>.
9.2. Informative References 9.2. Informative References
[CONG] Jacobson, V., "Congestion Avoidance and Control", Proc.
SIGCOMM '88 Symposium on Communications Architectures and
Protocols , August 1988.
[FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful", [FRAG] Kent, C. and J. Mogul, "Fragmentation Considered Harmful",
In Proc. SIGCOMM '87 Workshop on Frontiers in Computer In Proc. SIGCOMM '87 Workshop on Frontiers in Computer
Communications Technology , August 1987. Communications Technology , August 1987.
[ND] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, [ND] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
"Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
DOI 10.17487/RFC4861, September 2007, DOI 10.17487/RFC4861, September 2007,
<http://www.rfc-editor.org/info/rfc4861>. <http://www.rfc-editor.org/info/rfc4861>.
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, DOI 10.17487/
RFC1122, October 1989,
<http://www.rfc-editor.org/info/rfc1122>.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
DOI 10.17487/RFC1191, November 1990, DOI 10.17487/RFC1191, November 1990,
<http://www.rfc-editor.org/info/rfc1191>. <http://www.rfc-editor.org/info/rfc1191>.
[RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
for IP version 6", RFC 1981, DOI 10.17487/RFC1981, August
1996, <http://www.rfc-editor.org/info/rfc1981>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/
RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", RFC
2923, DOI 10.17487/RFC2923, September 2000,
<http://www.rfc-editor.org/info/rfc2923>.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, DOI
10.17487/RFC4340, March 2006,
<http://www.rfc-editor.org/info/rfc4340>.
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007,
<http://www.rfc-editor.org/info/rfc4821>. <http://www.rfc-editor.org/info/rfc4821>.
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering
ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/ ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/
RFC4890, May 2007, RFC4890, May 2007,
<http://www.rfc-editor.org/info/rfc4890>. <http://www.rfc-editor.org/info/rfc4890>.
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol",
RFC 4960, DOI 10.17487/RFC4960, September 2007,
<http://www.rfc-editor.org/info/rfc4960>.
[RFC6691] Borman, D., "TCP Options and Maximum Segment Size (MSS)", [RFC6691] Borman, D., "TCP Options and Maximum Segment Size (MSS)",
RFC 6691, DOI 10.17487/RFC6691, July 2012, RFC 6691, DOI 10.17487/RFC6691, July 2012,
<http://www.rfc-editor.org/info/rfc6691>. <http://www.rfc-editor.org/info/rfc6691>.
[RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System [RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System
(NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
March 2015, <http://www.rfc-editor.org/info/rfc7530>. March 2015, <http://www.rfc-editor.org/info/rfc7530>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
skipping to change at page 16, line 16 skipping to change at page 16, line 37
messages messages
Appendix B. Changes Since RFC 1981 Appendix B. Changes Since RFC 1981
This document is based on RFC1981 and has the following changes from This document is based on RFC1981 and has the following changes from
RFC1981: RFC1981:
o Clarified Section 1 "Introduction" that the purpose of PMTUD is to o Clarified Section 1 "Introduction" that the purpose of PMTUD is to
reduce the need for IPv6 fragmentation. reduce the need for IPv6 fragmentation.
o Added text to Section 1 "Introduction" and Section 6 "Security o Added text to Section 1 "Introduction" about the effects on PMTUD
Considerations" about the effects on PMTUD when ICMPv6 messages when ICMPv6 messages are blocked.
are blocked.
o Added Note to Introduction stating that this document
doesn't cite RFC2119 and only uses lower case "should/must"
language. Changed all upper case "should/must" to lower case.
o Added a short summary to the Section 1 "Introduction" of o Added a short summary to the Section 1 "Introduction" of
Packetization Layer Path MTU Discovery (PLPMTUD) and a reference Packetization Layer Path MTU Discovery (PLPMTUD) and a reference
to RFC4821 that defines it. to RFC4821 that defines it.
o Aligned text in Section 2 "Terminology" to match current o Aligned text in Section 2 "Terminology" to match current
packetization layer terminology. packetization layer terminology.
o Added clarification in Section 4 "Protocol Requirements" that o Added clarification in Section 4 "Protocol Requirements" that
nodes should validate the payload of ICMP PTB message per RFC4443. nodes should validate the payload of ICMP PTB message per RFC4443,
and that nodes should detect decreases in PMTU as fast as
possible.
o Remove Note from Section 4 "Protocol Requirements" about a Packet o Remove Note from Section 4 "Protocol Requirements" about a Packet
Too Big message reporting a next-hop MTU that is less than the Too Big message reporting a next-hop MTU that is less than the
IPv6 minimum link MTU because this was removed from IPv6 minimum link MTU because this was removed from
[I-D.ietf-6man-rfc2460bis]. [I-D.ietf-6man-rfc2460bis].
o Added clarification in Section 5.2 "Storing PMTU information" to o Added clarification in Section 5.2 "Storing PMTU information" to
discard an ICMPv6 Packet Too Big message if it contains a MTU less discard an ICMPv6 Packet Too Big message if it contains a MTU less
than the IPv6 minimum link MTU. than the IPv6 minimum link MTU.
o Added clarification in Section 5.2 "Storing PMTU information" that,
for nodes with multiple interfaces, Path MTU information should be
stored for each link.
o Removed text in Section 5.2 "Storing PMTU information" about the o Removed text in Section 5.2 "Storing PMTU information" about the
RH0 routing header because it was deprecated by RFC5095. RH0 routing header because it was deprecated by RFC5095.
o Removed text about obsolete security classification from o Removed text about obsolete security classification from
Section 5.2 "Storing PMTU information". Section 5.2 "Storing PMTU information".
o Changed title of Section 5.4 to "Packetization Layer actions" and o Changed title of Section 5.4 to "Packetization Layer actions" and
changed the text in the first paragraph to generalize this changed the text in the first paragraph to generalize this
section to cover all packetization layers, not just TCP. section to cover all packetization layers, not just TCP.
skipping to change at page 17, line 13 skipping to change at page 17, line 41
normal packetization layer retransmission methods. normal packetization layer retransmission methods.
o Removed text in Section 5.4 "Packetization Layer actions" that o Removed text in Section 5.4 "Packetization Layer actions" that
described 4.2 BSD because it is obsolete, and removed reference to described 4.2 BSD because it is obsolete, and removed reference to
TP4. TP4.
o Updated text in Section 5.5 "Issues for other transport protocols" o Updated text in Section 5.5 "Issues for other transport protocols"
about NFS including adding a current reference to NFS and removing about NFS including adding a current reference to NFS and removing
obsolete text. obsolete text.
o Added paragraph to Section 6 "Security Considerations" about black
hole connections if PTB messages are not received, and comparison
to PLPMTUD.
o Editorial Changes. o Editorial Changes.
B.1. Change History Since RFC1981 B.1. Change History Since RFC1981
NOTE TO RFC EDITOR: Please remove this subsection prior to RFC NOTE TO RFC EDITOR: Please remove this subsection prior to RFC
Publication Publication
This section describes change history made in each Internet Draft This section describes change history made in each Internet Draft
that went into producing this version. The numbers identify the that went into producing this version. The numbers identify the
Internet-Draft version in which the change was made. Internet-Draft version in which the change was made.
Working Group Internet Drafts Working Group Internet Drafts
07) Changes based on the Discuss comments from IESG reviews.
The changes include:
o Added Note to Introduction stating that this
document doesn't cite RFC2119 and only uses lower case
"should/must" language. Changed all upper case "should/
must" to lower case.
o Added references for EMTU_S and EMTU_R.
o Added clarification to Section 4 "Protocol Requirements"
that nodes should detect decreases in PMTU as fast as
possible.
o Added clarification in Section 5.2 "Storing PMTU information"
that, for nodes with multiple interfaces, Path MTU information
should be stored for each link.
o Removed text in Section 5.2 about Retransmission because
it was unneeded.
o Removed text in Section 5.3 about Retransmission because
it was unneeded.
o Rewrote text in Section 5.4 "Packetization Layer actions"
regarding reception to make it clearer.
o Rewrote the text at the end of Section 5.4 to remove
unnecessary details and clarify that the congestion
window is not changed.
o Added references in Section 5.5 for SCTP and added DCCP
(and reference) to the list of examples.
o Added paragraph to Section 6 "Security Considerations"
about black hole connections if PTB messages are not
received, and comparison to PLPMTUD.
07) Editorial changes.
06) Revised Appendix B "Changes since RFC1981" to have a summary 06) Revised Appendix B "Changes since RFC1981" to have a summary
of changes since RFC1981 and a separate subsection with a of changes since RFC1981 and a separate subsection with a
change history of each Internet Draft. This subsection will change history of each Internet Draft. This subsection will
be removed when the RFC is published. be removed when the RFC is published.
06) Editorial changes based on comments received after publishing 06) Editorial changes based on comments received after publishing
the -05 draft. the -05 draft.
05) Changes based on IETF last call reviews by Gorry Fairhurst, 05) Changes based on IETF last call reviews by Gorry Fairhurst,
Joe Touch, Susan Hares, Stewart Bryant, Rifaat Shekh-Yusef, Joe Touch, Susan Hares, Stewart Bryant, Rifaat Shekh-Yusef,
 End of changes. 45 change blocks. 
103 lines changed or deleted 176 lines changed or added
