draft-ietf-tcpm-tcp-lcd-00.txt   draft-ietf-tcpm-tcp-lcd-01.txt 
TCP Maintenance and Minor A. Zimmermann TCP Maintenance and Minor A. Zimmermann
Extensions (TCPM) WG A. Hannemann Extensions (TCPM) WG A. Hannemann
Internet-Draft RWTH Aachen University Internet-Draft RWTH Aachen University
Intended status: Experimental November 17, 2009 Intended status: Experimental March 30, 2010
Expires: May 21, 2010 Expires: October 1, 2010
Making TCP more Robust to Long Connectivity Disruptions (TCP-LCD) Making TCP more Robust to Long Connectivity Disruptions (TCP-LCD)
draft-ietf-tcpm-tcp-lcd-00 draft-ietf-tcpm-tcp-lcd-01
Abstract Abstract
Disruptions in end-to-end path connectivity, which last longer than Disruptions in end-to-end path connectivity, which last longer than
one retransmission timeout cause suboptimal TCP performance. The one retransmission timeout, cause suboptimal TCP performance. The
reason for the performance degradation is that TCP interprets segment reason for this performance degradation is that TCP interprets
loss induced by long connectivity disruptions as a sign of segment loss induced by long connectivity disruptions as a sign of
congestion, resulting in repeated retransmission timer backoffs. congestion, resulting in repeated retransmission timer backoffs.
This leads in turn to a deferred detection of the re-establishment of This, in turn, leads to a delayed detection of the re-establishment
the connection since TCP waits until the next retransmission timeout of the connection since TCP waits for the next retransmission timeout
occurs before attempting the retransmission. before it attempts a retransmission.
This document proposes a algorithm for making TCP more robust to long This document proposes an algorithm to make TCP more robust to long
connectivity disruptions (TCP-LCD). The memo describes how standard connectivity disruptions (TCP-LCD). It describes how standard ICMP
ICMP messages can be exploited during timeout-based loss recovery to messages can be exploited during timeout-based loss recovery to
disambiguate true congestion loss from non-congestion loss caused by disambiguate true congestion loss from non-congestion loss caused by
connectivity disruptions. Moreover, a revert strategy of the connectivity disruptions. Moreover, a revert strategy of the
retransmission timer is specified that enables a more prompt retransmission timer is specified that enables a more prompt
detection of whether the connectivity to a previously disconnected detection of whether or not the connectivity to a previously
peer node has been restored or not. TCP-LCD is a TCP sender-only disconnected peer node has been restored. TCP-LCD is a TCP sender-
modification that effectively improves TCP performance in presence of only modification that effectively improves TCP performance in case
connectivity disruptions. of connectivity disruptions.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 2, line 9 skipping to change at page 2, line 9
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 21, 2010. This Internet-Draft will expire on October 1, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 3, line 18 skipping to change at page 3, line 18
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Connectivity Disruption Indication . . . . . . . . . . . . . . 6 3. Connectivity Disruption Indication . . . . . . . . . . . . . . 6
4. Connectivity Disruption Reaction . . . . . . . . . . . . . . . 8 4. Connectivity Disruption Reaction . . . . . . . . . . . . . . . 8
4.1. Basic Idea . . . . . . . . . . . . . . . . . . . . . . . . 8 4.1. Basic Idea . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 8 4.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 8
5. Discussion of TCP-LCD . . . . . . . . . . . . . . . . . . . . 11 5. Discussion of TCP-LCD . . . . . . . . . . . . . . . . . . . . 11
5.1. Retransmission Ambiguity . . . . . . . . . . . . . . . . . 12 5.1. Retransmission Ambiguity . . . . . . . . . . . . . . . . . 12
5.2. Wrapped Sequence Numbers . . . . . . . . . . . . . . . . . 13 5.2. Wrapped Sequence Numbers . . . . . . . . . . . . . . . . . 13
5.3. Packet Duplication . . . . . . . . . . . . . . . . . . . . 14 5.3. Packet Duplication . . . . . . . . . . . . . . . . . . . . 14
5.4. Probing Frequency . . . . . . . . . . . . . . . . . . . . 14 5.4. Probing Frequency . . . . . . . . . . . . . . . . . . . . 14
5.5. Reaction in Steady-State . . . . . . . . . . . . . . . . . 14 5.5. Reaction during Connection Establishment . . . . . . . . . 14
5.6. Reaction in Steady-State . . . . . . . . . . . . . . . . . 15
6. Dissolving Ambiguity Issues (the Safe Variant) . . . . . . . . 15 6. Dissolving Ambiguity Issues (the Safe Variant) . . . . . . . . 15
7. Interoperability Issues . . . . . . . . . . . . . . . . . . . 17 7. Interoperability Issues . . . . . . . . . . . . . . . . . . . 17
7.1. Detection of TCP Connection Failures . . . . . . . . . . . 17 7.1. Detection of TCP Connection Failures . . . . . . . . . . . 17
7.2. Explicit Congestion Notification . . . . . . . . . . . . . 17 7.2. Explicit Congestion Notification . . . . . . . . . . . . . 17
7.3. ICMP for IP version 6 . . . . . . . . . . . . . . . . . . 17 7.3. ICMP for IP version 6 . . . . . . . . . . . . . . . . . . 18
7.4. TCP-LCD and IP Tunnels . . . . . . . . . . . . . . . . . . 18 7.4. TCP-LCD and IP Tunnels . . . . . . . . . . . . . . . . . . 18
8. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 18 8. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 19
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
10. Security Considerations . . . . . . . . . . . . . . . . . . . 20 10. Security Considerations . . . . . . . . . . . . . . . . . . . 20
11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 20 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 21
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 21
12.1. Normative References . . . . . . . . . . . . . . . . . . . 21 12.1. Normative References . . . . . . . . . . . . . . . . . . . 21
12.2. Informative References . . . . . . . . . . . . . . . . . . 21 12.2. Informative References . . . . . . . . . . . . . . . . . . 21
Appendix A. Changes from previous versions of the draft . . . . . 23 Appendix A. Changes from previous versions of the draft . . . . . 24
A.1. Changes from draft-zimmermann-tcp-lcd-02 . . . . . . . . . 23 A.1. Changes from draft-ietf-tcpm-tcp-lcd-00 . . . . . . . . . 24
A.2. Changes from draft-zimmermann-tcp-lcd-01 . . . . . . . . . 24 A.2. Changes from draft-zimmermann-tcp-lcd-02 . . . . . . . . . 24
A.3. Changes from draft-zimmermann-tcp-lcd-00 . . . . . . . . . 24 A.3. Changes from draft-zimmermann-tcp-lcd-01 . . . . . . . . . 25
A.4. Changes from draft-zimmermann-tcp-lcd-00 . . . . . . . . . 25
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25
1. Terminology 1. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
The reader should be familiar with the algorithm and terminology from The reader should be familiar with the algorithm and terminology from
[RFC2988], which defines the standard algorithm Transmission Control [RFC2988], which defines the standard algorithm Transmission Control
skipping to change at page 4, line 43 skipping to change at page 4, line 43
interpretation of the term "timeout-based loss recovery". For interpretation of the term "timeout-based loss recovery". For
example the NewReno modification to TCP's Fast Recovery algorithm example the NewReno modification to TCP's Fast Recovery algorithm
[RFC3782] extends the period a TCP sender remains in timeout-based [RFC3782] extends the period a TCP sender remains in timeout-based
loss recovery compared to the one defined in this document. This is loss recovery compared to the one defined in this document. This is
because [RFC3782] attempts to avoid unnecessary multiple Fast because [RFC3782] attempts to avoid unnecessary multiple Fast
Retransmits that can occur after an RTO. Retransmits that can occur after an RTO.
2. Introduction 2. Introduction
Connectivity disruptions can occur in many different situations. The Connectivity disruptions can occur in many different situations. The
frequency of the connectivity disruptions depends thereby on the frequency of connectivity disruptions depends on the property of the
property of the end-to-end path between the communicating hosts. end-to-end path between the communicating hosts. While connectivity
While connectivity disruptions can occur in traditional wired disruptions can occur in traditional wired networks too, e.g., caused
networks too, e.g., simply due to an unplugged network cable, the by an unplugged network cable, the likelihood of occurrence is
likelihood of occurrence is significantly higher in wireless (multi- significantly higher in wireless (multi-hop) networks. Especially,
hop) networks. Especially, end-host mobility, network topology end-host mobility, network topology changes, and wireless
changes and wireless interferences are crucial factors. In the case interferences are crucial factors. In the case of the Transmission
of the Transmission Control Protocol (TCP) [RFC0793], the performance Control Protocol (TCP) [RFC0793], the performance of the connection
of the connection can exhibit a significant reduction compared to a can experience a significant reduction compared to a permanently
permanently connected path [SESB05]. This is because TCP, which was connected path [SESB05]. This is because TCP, which was originally
originally designed to operate in fixed and wired networks, generally designed to operate in fixed and wired networks, generally assumes
assumes that the end-to-end path connectivity is relatively stable that the end-to-end path connectivity is relatively stable over the
over the connection's lifetime. connection's lifetime.
Depending on their duration connectivity disruptions can be Depending on their duration connectivity disruptions can be
classified into two groups [I-D.schuetz-tcpm-tcp-rlci]: "short" and classified into two groups [I-D.schuetz-tcpm-tcp-rlci]: "short" and
"long" connectivity disruptions. A connectivity disruption is "long". A connectivity disruption is "short" if connectivity returns
"short" if connectivity returns before the retransmission timer fires before the retransmission timer fires for the first time. In this
for the first time. In this case, TCP recovers lost data segments case, TCP recovers lost data segments through Fast Retransmit and
through Fast Retransmit and lost acknowledgments (ACK) through lost acknowledgments (ACK) through successfully delivered later ACKs.
successfully delivered later ACKs. Connectivity disruptions are Connectivity disruptions are declared as "long" for a given TCP
declared as "long" for a given TCP connection, if the retransmission connection if the retransmission timer fires at least once before
timer fires at least once before connectivity returns. Whether or connectivity is resumed. Whether or not path characteristics, like
not path characteristics, like the round trip time (RTT) or the the round trip time (RTT) or the available bandwidth, have changed
available bandwidth have changed when the connectivity returns after when connectivity resumes after a disruption is another important
a disruption is another important aspect for TCP's retransmission aspect for TCP's retransmission scheme [I-D.schuetz-tcpm-tcp-rlci].
scheme [I-D.schuetz-tcpm-tcp-rlci].
This document improves TCP's behavior in case of "long connectivity This document improves TCP's behavior in case of "long connectivity
disruptions". In particular, it focuses on the period "prior" to the disruptions". In particular, it focuses on the period "prior" to the
re-establishment of the connectivity to a previously disconnected re-establishment of the connectivity to a previously disconnected
peer node. The document does not describe any additional peer node. The document does not describe any modifications of TCP's
modification to detect whenever the path characteristics remain behavior and its congestion control mechanisms [RFC5681] "after"
unchanged in order to improve TCP's behavior once connectivity has connectivity has been restored.
been restored. Hence, TCP's basic congestion control mechanisms
[RFC5681] will be unchanged.
When a long connectivity disruption occurs on a TCP connection, the When a long connectivity disruption occurs on a TCP connection the
TCP sender stops receiving acknowledgments. After the retransmission TCP sender eventually does not receive any more acknowledgments.
timer expires, the TCP sender enters the timeout-based loss recovery After the retransmission timer expires, the TCP sender enters the
and declares the oldest outstanding segment (SND.UNA) as lost. Since timeout-based loss recovery and declares the oldest outstanding
TCP tightly couples reliability and congestion control, the segment (SND.UNA) as lost. Since TCP tightly couples reliability and
retransmission of SND.UNA is triggered together with the reduction of congestion control, the retransmission of SND.UNA is triggered
sending rate, which is based on the assumption that segment loss is together with the reduction of the transmission rate. This is based
indication of congestion [RFC5681]. As long as the connectivity on the assumption that segment loss is an indication of congestion
disruption persists, TCP will repeat this procedure until the oldest [RFC5681]. As long as the connectivity disruption persists, TCP will
outstanding segment is successfully acknowledged, or the connection repeat this procedure until the oldest outstanding segment has
times out. TCP implementations that follow the recommended successfully been acknowledged, or until the connection has timed
retransmission timeout (RTO) management of RFC 2988 [RFC2988] double out. TCP implementations that follow the recommended retransmission
the RTO after each retransmission attempt. However, the RTO growth timeout (RTO) management of RFC 2988 [RFC2988] double the RTO after
may be bounded by an upper limit, the maximum RTO, which is at least each retransmission attempt. However, the RTO's growth may be
60s, but may be longer: Linux for example uses 120s. If the bounded by an upper limit, the maximum RTO, which is at least 60s,
connectivity is restored between two retransmission attempts, TCP but may be longer: Linux, for example, uses 120s. If connectivity is
still has to wait until the retransmission timer expires before restored between two retransmission attempts, TCP still has to wait
resuming transmission, since it simply does not have any means to until the retransmission timer expires before resuming transmission,
know when the connectivity is re-established. Therefore, depending since it simply does not have any means to know if the connectivity
on when connectivity becomes available again, this can waste up to has been re-established. Therefore, depending on when connectivity
maximum RTO of possible transmission time. becomes available again, this can waste up to a maximum RTO of
possible transmission time.
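The cost of this backoff behavior can be illustrated with a small calculation. The following sketch is not part of the draft: the function names, the initial RTO of 1 s, and the convention that time is measured from the moment the first retransmission timer is armed are all assumptions made for illustration. It models the RTO doubling of RFC 2988 with an optional upper bound and computes how long a sender would sit idle after connectivity has returned.

```python
def backoff_schedule(initial_rto, max_rto, attempts):
    """Return the RTO value armed before each of `attempts` timeouts.

    Models plain exponential backoff: the RTO doubles after every
    timeout and is clamped to max_rto (an assumed upper bound).
    """
    rtos = []
    rto = initial_rto
    for _ in range(attempts):
        rtos.append(rto)
        rto = min(rto * 2, max_rto)
    return rtos


def wasted_time(initial_rto, max_rto, restored_at):
    """Return the idle time between connectivity returning at
    `restored_at` and the next retransmission timer expiry.

    Time starts at zero when the first timer is armed (an assumption
    of this sketch, not something the draft specifies).
    """
    elapsed = 0.0
    rto = initial_rto
    while True:
        elapsed += rto  # the timer fires at this instant
        if elapsed >= restored_at:
            return elapsed - restored_at
        rto = min(rto * 2, max_rto)
```

With an initial RTO of 1 s and a 60 s bound, the timers in this model fire at 1, 3, 7, 15, 31, 63, and 123 s, so connectivity that returns at 64 s leaves 59 s unused, which is close to one full maximum RTO.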
This retransmission behavior is not efficient, especially in This retransmission behavior is not efficient, especially in
scenarios where long connectivity disruptions are frequent. In the scenarios with long connectivity disruptions. In the ideal case, TCP
ideal case, a TCP would attempt a retransmission as soon as would attempt a retransmission as soon as connectivity to its peer
connectivity to its peer is re-established. In this document, we has been re-established. In this document, we specify a TCP sender-
specify a TCP sender-only modification to provide robustness to long only modification to provide robustness to long connectivity
connectivity disruptions (TCP-LCD). The memo describes how the disruptions (TCP-LCD). The memo describes how the standard Internet
standard Internet Control Message Protocol (ICMP) can be exploited Control Message Protocol (ICMP) can be exploited during timeout-based
during timeout-based loss recovery to identify non-congestion loss loss recovery to identify non-congestion loss caused by long
caused by long connectivity disruptions. TCP-LCD's revert strategy connectivity disruptions. TCP-LCD's revert strategy of the
of the retransmission timer enables, due to higher-frequency retransmission timer enables higher-frequency retransmissions and
retransmissions, a prompt detection when the connectivity to a thereby a prompt detection when connectivity to a previously
previously disconnected peer node has been restored. In the case the disconnected peer node has been restored. If no congestion is
network allows, i.e., no congestion is present, TCP-LCD approaches present, TCP-LCD approaches the ideal behavior.
the ideal behavior.
3. Connectivity Disruption Indication 3. Connectivity Disruption Indication
As long as the queue of an intermediate router experiencing a link If the queue of an intermediate router experiencing a link outage can
outage is deep enough, i.e., it can buffer all incoming packets, a buffer all incoming packets, a connectivity disruption will only
connectivity disruption will only cause variation in delay, which is cause a variation in delay, which is handled well by TCP
handled well by contemporary TCP implementations with the help of implementations using either Eifel [RFC3522], [RFC4015] or Forward
Eifel [RFC3522], [RFC4015] or Forward RTO-Recovery (F-RTO) [RFC5682]. RTO-Recovery (F-RTO) [RFC5682]. However, if the link outage lasts
However, if the link outage lasts too long, the router experiencing for too long, the router experiencing the link outage is forced to
the link outage is forced to drop packets and finally to discard the drop packets, and finally to discard the according route. Means to
according route. Means to detect such link outages comprise reacting detect such link outages include reacting on failed address
on failed address resolution protocol (ARP) [RFC0826] queries, resolution protocol (ARP) [RFC0826] queries, unsuccessful link
unsuccessful link sensing, and the like. However, this is solely in sensing, and the like. However, this is solely in the responsibility
the responsibility of the respective router. of the respective router.
Note: The focus of this memo is on introducing a method how ICMP Note: The focus of this memo is on introducing a method how ICMP
messages may be exploited to improve TCP's performance; how messages may be exploited to improve TCP's performance; how
different physical and link layer mechanisms underneath the different physical and link layer mechanisms below the network
network layer may trigger ICMP destination unreachable messages layer may trigger ICMP destination unreachable messages are out of
are out of scope of this memo. scope of this memo.
Provided that no other route (including no default route) to the Provided that no other route to the specific destination exists the
specific destination exists, the removal of the route goes along with router will notify the corresponding sending host about the dropped
a notification to the corresponding sending host about the dropped
packets via ICMP destination unreachable messages of code 0 (net packets via ICMP destination unreachable messages of code 0 (net
unreachable) or code 1 (host unreachable) [RFC1812]. Therefore, unreachable) or code 1 (host unreachable) [RFC1812]. Therefore, the
since the reception of ICMP destination unreachable messages of these sending host can use the ICMP destination unreachable messages of
codes provide evidence that packets were dropped due to a link these codes as an indication for a connectivity disruption, since the
outage, the sending host can use them as an indication for a reception of these messages provide evidence that packets were
connectivity disruption. dropped due to a link outage.
Note that there are also other ICMP destination unreachable messages Note that there are also other ICMP destination unreachable messages
with different codes. Some of them are candidates for connectivity with different codes. Some of them are candidates for connectivity
disruption indications too, but need further investigation. For disruption indications, too, but need further investigation. For
example ICMP destination unreachable messages with code 5 (source example, ICMP destination unreachable messages with code 5 (source
route failed), code 11 (net unreachable for TOS), or code 12 (host route failed), code 11 (net unreachable for TOS), or code 12 (host
unreachable for TOS) [RFC1812]. On the other hand codes that flag unreachable for TOS) [RFC1812]. On the other hand, codes that flag
hard errors are of no use for the proposed scheme, since TCP should hard errors are of no use for the proposed scheme, since TCP should
abort the connection when those are received [RFC1122]. In the abort the connection when those are received [RFC1122]. In the
following, the term "ICMP unreachable message" is used as synonym for following, the term "ICMP unreachable message" is used as synonym for
ICMP destination unreachable messages of code 0 or code 1. ICMP destination unreachable messages of code 0 or code 1.
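The grouping of codes described above could be expressed as a simple lookup. This is a minimal sketch, not text from the draft: the names Verdict and classify are hypothetical, and the concrete set of hard-error codes (2, 3, and 4) is taken from the TCP requirements in RFC 1122 rather than from this document, which only refers to hard errors in general.

```python
from enum import Enum

ICMP_DEST_UNREACH = 3  # ICMP type "destination unreachable"


class Verdict(Enum):
    DISRUPTION = "connectivity disruption indication"
    CANDIDATE = "possible indication, needs further investigation"
    HARD_ERROR = "hard error, of no use for the scheme"
    IGNORE = "not considered"


# Codes 0 (net unreachable) and 1 (host unreachable).
DISRUPTION_CODES = {0, 1}
# Codes 5, 11, and 12 are named in the text as open candidates.
CANDIDATE_CODES = {5, 11, 12}
# Assumption: hard-error codes as listed for TCP in RFC 1122
# (protocol unreachable, port unreachable, fragmentation needed).
HARD_ERROR_CODES = {2, 3, 4}


def classify(icmp_type, icmp_code):
    """Map an ICMP type/code pair to one of the groups above."""
    if icmp_type != ICMP_DEST_UNREACH:
        return Verdict.IGNORE
    if icmp_code in DISRUPTION_CODES:
        return Verdict.DISRUPTION
    if icmp_code in CANDIDATE_CODES:
        return Verdict.CANDIDATE
    if icmp_code in HARD_ERROR_CODES:
        return Verdict.HARD_ERROR
    return Verdict.IGNORE
```

Only messages classified as DISRUPTION correspond to what the text calls an "ICMP unreachable message".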
The accurate interpretation of ICMP unreachable messages as a The accurate interpretation of ICMP unreachable messages as a
connectivity disruption indication is complicated by the following connectivity disruption indication is complicated by the following
two peculiarities of ICMP messages. Firstly, they do not necessarily two peculiarities of ICMP messages. Firstly, they do not necessarily
operate on the same timescale as the packets, i.e., in the given case operate on the same timescale as the packets, i.e., TCP segments that
TCP segments that elicited them. When a router drops a packet due to elicited them. When a router drops a packet due to a missing route
a missing route it will not necessarily send an ICMP unreachable it will not necessarily send an ICMP unreachable message immediately,
message immediately, but rather queues it for later delivery. but will rather queue it for later delivery. Secondly, ICMP messages
Secondly, ICMP messages are subject to rate limiting, e.g., when a are subject to rate limiting, e.g., when a router drops a whole
router drops a whole window of data due to a link outage, it will window of data due to a link outage, it will hardly send as many ICMP
hardly send as many ICMP unreachable messages as it dropped TCP unreachable messages as it dropped TCP segments. Depending on the
segments. Depending on the load of the router it may even send no load of the router it may even send no ICMP unreachable messages at
ICMP unreachable messages at all. Both peculiarities originate from all. Both peculiarities originate from [RFC1812].
[RFC1812].
Fortunately, according to [RFC0792] ICMP unreachable messages are Fortunately, according to [RFC0792], ICMP unreachable messages have
obliged to contain in their body the entire Internet Protocol (IP) to contain in their body the entire Internet Protocol (IP) header
header [RFC0791] of the datagram eliciting the ICMP unreachable [RFC0791] of the datagram eliciting the ICMP unreachable message,
messages plus the first 64 bits of the payload of that datagram. plus the first 64 bits of the payload of that datagram. This allows
This allows the sending host to match the ICMP error message to the the sending host to match the ICMP error message to the transport
transport that elicited it. RFC 1812 [RFC1812] augments the that elicited it. RFC 1812 [RFC1812] augments the requirements and
requirements and states that ICMP messages should contain as much of states that ICMP messages should contain as much of the original
the original datagram as possible without the length of the ICMP datagram as possible without the length of the ICMP datagram
datagram exceeding 576 bytes. Therefore, in case of TCP, at least exceeding 576 bytes. Therefore, in case of TCP, at least the source
the source port number, the destination port number, and the 32-bit port number, the destination port number, and the 32-bit TCP sequence
TCP sequence number are included. Thus, this allows the originating number are included. This allows the originating TCP to demultiplex
TCP to demultiplex the received ICMP message and to identify the the received ICMP message and to identify the faulty connection.
connection which an ICMP unreachable message is reporting an error Moreover, it can identify which segment of the respective connection
about. Moreover, it can identify which segment of the respective triggered the ICMP unreachable message, unless there are several
connection triggered the ICMP unreachable message, provided that segments in-flight with the same sequence number (see Section 5.1).
there are not several segments in-flight with the same sequence
number (see Section 5.1).
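The demultiplexing step described above can be sketched as follows. The function name parse_quoted_tcp and its return layout are hypothetical, and the sketch assumes an IPv4 datagram with the ICMP header already stripped, so that the input starts with the quoted IP header. Because only the first 64 bits of the TCP header are guaranteed to be present, the code reads nothing beyond the two port numbers and the sequence number.

```python
import ipaddress
import struct

IPPROTO_TCP = 6


def parse_quoted_tcp(quoted):
    """Extract (src_ip, src_port, dst_ip, dst_port, seq) from the
    datagram quoted in an ICMP unreachable message.

    `quoted` is the ICMP message body after the 8-byte ICMP header:
    the original IPv4 header followed by at least the first 8 bytes
    of the original payload.
    """
    if len(quoted) < 20:
        raise ValueError("quoted IP header is truncated")
    if quoted[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    ihl = (quoted[0] & 0x0F) * 4  # IP header length in bytes
    if ihl < 20 or len(quoted) < ihl + 8:
        raise ValueError("fewer than 64 bits of payload quoted")
    if quoted[9] != IPPROTO_TCP:
        raise ValueError("quoted datagram does not carry TCP")
    src_ip = str(ipaddress.IPv4Address(quoted[12:16]))
    dst_ip = str(ipaddress.IPv4Address(quoted[16:20]))
    # First 64 bits of the TCP header: ports and sequence number.
    src_port, dst_port, seq = struct.unpack_from("!HHI", quoted, ihl)
    return src_ip, src_port, dst_ip, dst_port, seq
```

The returned address and port pairs identify the connection, and the sequence number can then be compared against the segments that connection has outstanding.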
A connectivity disruption indication in form of an ICMP unreachable A connectivity disruption indication in form of an ICMP unreachable
message associated with a presumably lost TCP segment provides strong message associated with a presumably lost TCP segment provides strong
evidence that the segment was not dropped due to congestion but evidence that the segment was not dropped due to congestion, but was
instead was successfully delivered to the temporary end-point of the successfully delivered to the temporary end-point of the employed
employed path, i.e., the reporting router. It therefore did not path, i.e., the reporting router. It therefore did not witness any
witness any congestion at least on that very part of the path that congestion at least on that part of the path that was traversed by
was traveled by both, the TCP segment eliciting the ICMP unreachable both the TCP segment eliciting the ICMP unreachable message as well
message as well as the ICMP unreachable message itself. as the ICMP unreachable message itself.
4. Connectivity Disruption Reaction 4. Connectivity Disruption Reaction
Section 4.1 gives the basic idea of TCP-LCD. The complete algorithm Section 4.1 introduces the basic idea of TCP-LCD. The complete
is specified in Section 4.2. algorithm is specified in Section 4.2.
4.1. Basic Idea 4.1. Basic Idea
The goal of the algorithm is the prompt detection when the The goal of the algorithm is to promptly detect when connectivity to
connectivity to a previously disconnected peer node has been restored a previously disconnected peer node has been restored after a long
after a long connectivity disruption while retaining appropriate connectivity disruption, while retaining appropriate behavior in case
behavior in case of congestion. TCP-LCD exploits standard ICMP of congestion. TCP-LCD exploits standard ICMP unreachable messages
unreachable messages during timeout-based loss recovery to increase during timeout-based loss recovery. This increases TCP's
the TCP's retransmission frequency by undoing one retransmission retransmission frequency by undoing one retransmission timer backoff
timer backoff whenever an ICMP unreachable message reports on the whenever an ICMP unreachable message reports on the sequence number
sequence number of a presumably lost retransmission. of a presumably lost retransmission.
This approach has the advantage of appropriately reducing the probing This approach has the advantage of appropriately reducing the probing
rate in case of congestion. If either the retransmission itself, or rate in case of congestion. If either the retransmission itself, or
the corresponding ICMP message is dropped the previously performed the corresponding ICMP message, is dropped the previously performed
retransmission timer backoff is not undone, which effectively halves retransmission timer backoff is not undone, which effectively halves
the probing rate. the probing rate.
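The basic idea can be illustrated with a deliberately simplified model. The sketch below is not the algorithm specified in Section 4.2: the class RetransmitTimer, its method names, and the choice to derive the RTO from a backoff counter are assumptions made only to show how undoing one backoff per matching ICMP unreachable message keeps the probing interval short, while a missing ICMP message lets the usual doubling take effect.

```python
class RetransmitTimer:
    """Toy model of a retransmission timer with reversible backoff."""

    def __init__(self, base_rto, max_rto=float("inf")):
        self.base_rto = base_rto  # RTO before any backoff
        self.max_rto = max_rto    # assumed upper bound on the RTO
        self.backoff_cnt = 0      # backoffs currently in effect

    @property
    def rto(self):
        """Current RTO: base value doubled once per backoff, capped."""
        return min(self.base_rto * 2 ** self.backoff_cnt, self.max_rto)

    def on_timeout(self):
        """Back off once, unless the RTO has already reached the cap."""
        if self.base_rto * 2 ** self.backoff_cnt < self.max_rto:
            self.backoff_cnt += 1

    def on_icmp_unreachable(self, icmp_seq, retransmitted_seq):
        """Undo one backoff if the ICMP message reports on the
        sequence number of the retransmission. Returns True if a
        backoff was undone."""
        if icmp_seq == retransmitted_seq and self.backoff_cnt > 0:
            self.backoff_cnt -= 1
            return True
        return False
```

In this model, a timeout followed by a matching ICMP unreachable message leaves the RTO unchanged, so the sender keeps probing at a constant interval. If the retransmission or the ICMP message is lost, only the timeout is seen, the RTO doubles, and the probing rate halves.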
4.2. Algorithm Details 4.2. Algorithm Details
A TCP sender using RFC 2988 [RFC2988] to compute TCP's retransmission A TCP sender using RFC 2988 [RFC2988] to compute TCP's retransmission
timer MAY employ the following scheme to avoid over-conservative timer MAY employ the following scheme to avoid over-conservative
retransmission timer backoffs in case of long connectivity retransmission timer backoffs in case of long connectivity
disruptions. If a TCP sender does implement the following steps, the disruptions. If a TCP sender does implement the following steps, the
algorithm MUST be initiated upon the first timeout of the oldest algorithm MUST be initiated upon the first timeout of the oldest
outstanding segment (SND.UNA) and MUST be stopped upon the arrival of outstanding segment (SND.UNA) and MUST be stopped upon the arrival of
the first acceptable ACK. The algorithm MUST NOT be re-initiated the first acceptable ACK. The algorithm MUST NOT be re-initiated
upon subsequent timeouts for the same segment. upon subsequent timeouts for the same segment. The scheme SHOULD NOT
be used in SYN-SENT or SYN-RECEIVED states [RFC0793] (i.e., during
connection establishment).
A TCP sender that does not employ RFC 2988 [RFC2988] to compute TCP's A TCP sender that does not employ RFC 2988 [RFC2988] to compute TCP's
retransmission timer SHOULD NOT use TCP-LCD. We envision that the retransmission timer SHOULD NOT use TCP-LCD. We envision that the
scheme could be easily adapted to other algorithms than RFC 2988. scheme could be easily adapted to algorithms other than RFC 2988.
However, we leave this as future work. However, we leave this as future work.
In rule (2.5) RFC 2988 [RFC2988] provides the option to place a In rule (2.5) RFC 2988 [RFC2988] provides the option to place a
maximum value on the RTO. When a TCP implements this rule to provide maximum value on the RTO. When a TCP implements this rule to provide
an upper bound for the RTO, the rule SHOULD also be used in the an upper bound for the RTO, it SHOULD also be used in the following
following algorithm. In particular, if the RTO is bounded by an algorithm. In particular, if the RTO is bounded by an upper limit
upper limit (maximum RTO), the "MAX_RTO" variable used in this scheme (maximum RTO), the "MAX_RTO" variable used in this scheme SHOULD be
SHOULD be initialized with this upper limit. Otherwise, if the RTO initialized with this upper limit. Otherwise, if the RTO is
is unbounded, the "MAX_RTO" variable SHOULD be set to infinity. unbounded, the "MAX_RTO" variable SHOULD be set to infinity.
The scheme specified in this document uses the "BACKOFF_CNT" The scheme specified in this document uses the "BACKOFF_CNT"
variable, whose initial value is zero. The variable is used to count variable, whose initial value is zero. The variable is used to count
the number of performed retransmission timer backoffs during one the number of performed retransmission timer backoffs during one
timeout-based loss recovery. Moreover, the "RTO_BASE" variable is timeout-based loss recovery. Moreover, the "RTO_BASE" variable is
used to recover the previous RTO in case the retransmission timer used to recover the previous RTO if the retransmission timer backoff
backoff was unnecessary. The variable is initialized with the RTO was unnecessary. The variable is initialized with the RTO upon
upon initiation of timeout-based loss recovery. initiation of timeout-based loss recovery.
(1) Before TCP updates the variable "RTO" when it initiates timeout- (1) Before TCP updates the variable "RTO" when it initiates timeout-
based loss recovery, set the variables "BACKOFF_CNT" and based loss recovery, set the variables "BACKOFF_CNT" and
"RTO_BASE" as follows: "RTO_BASE" as follows:
BACKOFF_CNT := 0; BACKOFF_CNT := 0;
RTO_BASE := RTO. RTO_BASE := RTO.
Proceed to step (R). Proceed to step (R).
(R) This is a placeholder for the behavior that a standard TCP must (R) This is a placeholder for standard TCP's behavior in case the
execute at this point in case the retransmission timer is retransmission timer has expired. In particular, if RFC 2988
expired. In particular if RFC 2988 [RFC2988] is used, steps [RFC2988] is used, steps (5.4) - (5.6) of that algorithm go
(5.4) - (5.6) of that algorithm go here. Proceed to step (2). here. Proceed to step (2).
(2) To account for the expiration of the retransmission timer in the (2) To account for the expiration of the retransmission timer in the
previous step (R), increment the "BACKOFF_CNT" variable by one: previous step (R), increment the "BACKOFF_CNT" variable by one:
BACKOFF_CNT := BACKOFF_CNT + 1. BACKOFF_CNT := BACKOFF_CNT + 1.
(3) Wait either (3) Wait either
for the expiration of the retransmission timer. When the for the expiration of the retransmission timer. When the
retransmission timer expires, proceed to step (R); retransmission timer expires, proceed to step (R);
skipping to change at page 10, line 35 skipping to change at page 10, line 34
(8) If the retransmission timer expires due to the undoing in the (8) If the retransmission timer expires due to the undoing in the
previous step (7), then previous step (7), then
proceed to step (R); proceed to step (R);
else else
proceed to step (3). proceed to step (3).
(A) This is a placeholder for the standard TCP behavior that must be (A) This is a placeholder for standard TCP's behavior in case an
executed at this point in the case an acceptable ACK has acceptable ACK has arrived. No further processing.
arrived. No further processing.
When a TCP in steady-state detects a segment loss using the When a TCP in steady-state detects a segment loss using the
retransmission timer it enters the timeout-based loss recovery and retransmission timer it enters the timeout-based loss recovery and
initiates the algorithm (step 1). It adjusts the slow start initiates the algorithm (step 1). It adjusts the slow start
threshold (ssthresh), sets the congestion window (CWND) to one threshold (ssthresh), sets the congestion window (CWND) to one
segment, backs off the retransmission timer and retransmits the first segment, backs off the retransmission timer, and retransmits the
unacknowledged segment (step R) [RFC5681], [RFC2988]. To account for first unacknowledged segment (step R) [RFC5681], [RFC2988]. To
the expiration of the retransmission timer the TCP sender increments account for the expiration of the retransmission timer the TCP sender
the "BACKOFF_CNT" variable by one (step 2). increments the "BACKOFF_CNT" variable by one (step 2).
In case the retransmission timer expires again (step 3a) a TCP will In case the retransmission timer expires again (step 3a) a TCP will
repeat the retransmission of the first unacknowledged segment and repeat the retransmission of the first unacknowledged segment and
back off the retransmission timer once more (step R) [RFC2988] as back off the retransmission timer once more (step R) [RFC2988] as
well as increment the "BACKOFF_CNT" variable by one (step 2). Note well as increment the "BACKOFF_CNT" variable by one (step 2). Note
that a TCP may implement RFC 2988's [RFC2988] option to place a that a TCP may implement RFC 2988's [RFC2988] option to place a
maximum value on the RTO that may result in not performing the maximum value on the RTO that may result in not performing the
retransmission timer backoff. However, step (2) MUST always and retransmission timer backoff. However, step (2) MUST always and
unconditionally be applied, no matter whether the retransmission unconditionally be applied, no matter whether or not the
timer is actually backed off or not. In other words, each time the retransmission timer is actually backed off. In other words, each
retransmission timer expires, the "BACKOFF_CNT" variable MUST be time the retransmission timer expires, the "BACKOFF_CNT" variable
incremented by one. MUST be incremented by one.
If the first received packet after the retransmission(s) is an If the first received packet after the retransmission(s) is an
acceptable ACK (step 3b), a TCP will proceed as normal, i.e., slow acceptable ACK (step 3b), a TCP will proceed as normal, i.e., slow
start the connection and terminate the algorithm (step A). Later start the connection and terminate the algorithm (step A). Later
ICMP unreachable messages from the just terminated timeout-based loss ICMP unreachable messages from the just terminated timeout-based loss
recovery are of no use and therefore ignored since the ACK clock is recovery are ignored since the ACK clock is already restarting due to
already restarting due to the successful retransmission. the successful retransmission.
On the other hand, if the first received packet after the On the other hand, if the first received packet after the
retransmission(s) is an ICMP unreachable message (step 3c) and if retransmission(s) is an ICMP unreachable message (step 3c), and if
step (4) allows, a TCP SHOULD undo one backoff for each ICMP step (4) permits it, a TCP SHOULD undo one backoff for each ICMP
unreachable message reporting an error on a retransmission. To unreachable message reporting an error on a retransmission. To
decide if an ICMP unreachable message reports on a retransmission, decide if an ICMP unreachable message reports on a retransmission,
the sequence number therein is exploited (step 5, step 6). The undo the sequence number therein is exploited (step 5, step 6). The undo
is performed by re-calculating the RTO with the decremented is performed by re-calculating the RTO with the decremented
"BACKOFF_CNT" variable (step 7). This calculation explicitly matches "BACKOFF_CNT" variable (step 7). This calculation explicitly matches
the (bounded) exponential backoff specified in rule (5.5) of the (bounded) exponential backoff specified in rule (5.5) of
[RFC2988]. [RFC2988].
Upon receipt of an ICMP unreachable message that legitimately undoes Upon receipt of an ICMP unreachable message that legitimately undoes
one backoff there is the possibility that the shortened one backoff there is the possibility that the shortened
retransmission timer has expired already (step 8). Then, a TCP retransmission timer has already expired (step 8). Then, a TCP
SHOULD retransmit immediately, i.e., an ICMP message clocked SHOULD retransmit immediately, i.e., an ICMP message clocked
retransmission. In case the shortened retransmission timer has not retransmission. In case the shortened retransmission timer has not
expired yet, TCP MUST wait accordingly. yet expired, TCP MUST wait accordingly.
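The bookkeeping described above can be illustrated with a minimal Python sketch. It is not part of the draft and not a reference implementation. It models only the variables "RTO", "RTO_BASE", "BACKOFF_CNT" and "MAX_RTO". Because steps (4) - (7) are not reproduced in this excerpt, the sketch assumes that step (4) permits an undo only while "BACKOFF_CNT" is greater than zero, and that step (7) recomputes the RTO as min(RTO_BASE * 2^BACKOFF_CNT, MAX_RTO). The class and method names are hypothetical.

```python
import math


class LcdSender:
    """Illustrative TCP-LCD timer bookkeeping (hypothetical names)."""

    def __init__(self, rto, max_rto=math.inf):
        self.rto = rto              # current retransmission timeout
        self.max_rto = max_rto      # infinity if the RTO is unbounded
        self.backoff_cnt = 0
        self.rto_base = rto
        self.active = False         # True during timeout-based loss recovery

    def on_timeout(self):
        if not self.active:
            # Step (1): remember the RTO before it is backed off.
            self.backoff_cnt = 0
            self.rto_base = self.rto
            self.active = True
        # Step (R): bounded exponential backoff, RFC 2988 rule (5.5).
        self.rto = min(self.rto * 2, self.max_rto)
        # Step (2): count the timeout even if MAX_RTO prevented a backoff.
        self.backoff_cnt += 1

    def on_icmp_unreachable(self, seg_seq, snd_una):
        """Return True if one backoff was undone."""
        if not self.active:
            return False            # steady state: message is ignored
        if self.backoff_cnt == 0:
            return False            # assumed step (4): nothing left to undo
        if seg_seq != snd_una:
            return False            # steps (5)/(6): not a retransmission
        # Assumed step (7): undo one backoff and recompute the RTO.
        self.backoff_cnt -= 1
        self.rto = min(self.rto_base * 2 ** self.backoff_cnt, self.max_rto)
        return True

    def on_acceptable_ack(self):
        # Step (A): the algorithm stops; later ICMP messages are ignored.
        self.active = False


sender = LcdSender(rto=1.0, max_rto=60.0)
sender.on_timeout()                                     # RTO 2.0, count 1
sender.on_timeout()                                     # RTO 4.0, count 2
sender.on_icmp_unreachable(seg_seq=1000, snd_una=1000)  # RTO 2.0, count 1
```

In this sketch a lost retransmission or a lost ICMP message simply means on_icmp_unreachable() is never called for that timeout, so the backoff stays in place. Because an undo is refused once the counter reaches zero, duplicated ICMP messages cannot undo more backoffs than were performed.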
5. Discussion of TCP-LCD 5. Discussion of TCP-LCD
TCP-LCD takes caution to only react to connectivity disruption TCP-LCD takes caution to only react to connectivity disruption
indications in form of ICMP unreachable messages during timeout-based indications in form of ICMP unreachable messages during timeout-based
loss recovery. Therefore, TCP's behavior is not altered when either loss recovery. Therefore, TCP's behavior is not altered when either
no ICMP unreachable messages are received, or the retransmission no ICMP unreachable messages are received, or the retransmission
timer of the TCP sender did not yet expire since the last received timer of the TCP sender did not expire since the last received
acceptable ACK. Thereby, the algorithm triggers by definition only acceptable ACK. Thus, by definition the algorithm triggers only in
in the case of long connectivity disruptions. case of long connectivity disruptions.
Only such ICMP unreachable messages that report on the sequence Only such ICMP unreachable messages that report on the sequence
number of a retransmission, i.e., report on SND.UNA, are evaluated by number of a retransmission, i.e., report on SND.UNA, are evaluated by
TCP-LCD. All other ICMP unreachable messages are ignored. The TCP-LCD. All other ICMP unreachable messages are ignored. The
arrival of those ICMP unreachable messages provides strong evidence arrival of those ICMP unreachable messages provides strong evidence
that the retransmissions were not dropped due to congestion but that the retransmissions were not dropped due to congestion but were
instead were successfully delivered to the temporary end-point of the successfully delivered to the temporary end-point of the employed
employed path, i.e., the reporting router. In other words, there is path, i.e., the reporting router. In other words, there is no
no witness for any congestion at least on that very part of the path evidence for any congestion at least on that very part of the path
that was traveled by both, the TCP segment eliciting the ICMP that was traveled by both, the TCP segment eliciting the ICMP
unreachable message as well as the ICMP unreachable message itself. unreachable message as well as the ICMP unreachable message itself.
However, there are some situations where TCP-LCD makes a false However, there are some situations where TCP-LCD makes a false
decision and undoes a retransmission timer backoff wrongly. This can decision and incorrectly undoes a retransmission timer backoff. This
happen, even though the received ICMP unreachable message reports on the can happen, even though the received ICMP unreachable message reports on
segment number of a retransmission (SND.UNA), because the TCP segment the segment number of a retransmission (SND.UNA) because the TCP
that elicited the ICMP unreachable message may either not be a segment that elicited the ICMP unreachable message may either not be
retransmission (Section 5.1), or does not belong to the current a retransmission (Section 5.1), or does not belong to the current
timeout-based loss recovery (Section 5.2). Finally, packet timeout-based loss recovery (Section 5.2). Finally, packet
duplication (Section 5.3) can also spuriously trigger the algorithm. duplication (Section 5.3) can also spuriously trigger the algorithm.
Section 5.4 discusses possible probing frequencies, while Section 5.5 Section 5.4 discusses possible probing frequencies, while Section 5.6
describes the motivation for not reacting to ICMP unreachable describes the motivation for not reacting to ICMP unreachable
messages while TCP is in steady-state. messages while TCP is in steady-state.
5.1. Retransmission Ambiguity 5.1. Retransmission Ambiguity
Historically, the retransmission ambiguity problem [Zh86], [KP87] is Historically, the retransmission ambiguity problem [Zh86], [KP87] is
the TCP sender's inability to distinguish whether the first the TCP sender's inability to distinguish whether the first
acceptable ACK after a retransmission refers to the original acceptable ACK after a retransmission refers to the original
transmission or the retransmission. This problem occurs after both a transmission or to the retransmission. This problem occurs after
Fast Retransmit and a timeout-based retransmit. However, modern TCP both a Fast Retransmit and a timeout-based retransmit. However,
implementations can eliminate the retransmission ambiguity with modern TCP implementations can eliminate the retransmission ambiguity
either the help of Eifel [RFC3522], [RFC4015] or Forward RTO-Recovery with either the help of Eifel [RFC3522], [RFC4015] or Forward RTO-
(F-RTO) [RFC5682]. Recovery (F-RTO) [RFC5682].
The revert strategy of the given algorithm suffers from a form of The revert strategy of the given algorithm suffers from a form of
retransmission ambiguity, too. In contrast to the aforementioned retransmission ambiguity, too. In contrast to the above case, TCP
case, TCP suffers from ambiguity regarding ICMP unreachable messages suffers from ambiguity regarding ICMP unreachable messages received
received during timeout-based loss recovery. With the TCP segment during timeout-based loss recovery. With the TCP segment number
number included in the ICMP unreachable message, a TCP sender is not included in the ICMP unreachable message, a TCP sender is not able to
able to determine if the ICMP unreachable message refers to the determine if the ICMP unreachable message refers to the original
original transmission or to any of the timeout-based retransmissions. transmission or to any of the timeout-based retransmissions. That
That is, there is an ambiguity which TCP segment, i.e., the original is, there is an ambiguity which TCP segment an ICMP unreachable
transmission or any of the retransmissions an ICMP unreachable
message reports on. message reports on.
However, for the algorithm the ambiguity is not considered to be a However, for the algorithm this ambiguity is not considered to be a
problem. The assumption that a received ICMP message provides problem. The assumption that a received ICMP message provides
evidence that one non-congestion loss caused by the connectivity evidence that a non-congestion loss caused by the connectivity
disruption was wrongly considered a congestion loss still holds, disruption was wrongly considered a congestion loss still holds,
regardless of which TCP segment, transmission or retransmission the regardless of which TCP segment, transmission or retransmission, the
message refers. message refers.
5.2. Wrapped Sequence Numbers 5.2. Wrapped Sequence Numbers
Besides the ambiguity if a received ICMP unreachable message refers Besides the ambiguity whether a received ICMP unreachable message
to the original transmission or to any of the retransmissions, there refers to the original transmission or to any of the retransmissions,
is another source of ambiguity about the TCP sequence numbers there is another source of ambiguity about the TCP sequence numbers
contained in ICMP unreachable messages. For high bandwidth paths contained in ICMP unreachable messages. For high bandwidth paths
like modern gigabit links the sequence space may wrap rather quickly, like modern gigabit links the sequence space may wrap rather quickly,
thereby allowing the possibility that delayed ICMP unreachable thereby allowing the possibility that delayed ICMP unreachable
messages - a router dropping packets due to a link outage is not messages - a router dropping packets due to a link outage is not
obliged to send ICMP unreachable messages in a timely manner obliged to send ICMP unreachable messages in a timely manner
[RFC1812] - may coincidentally fit as valid input in the proposed [RFC1812] - may coincidentally fit as valid input in the proposed
scheme. As a result, the scheme may undo retransmission timer scheme. As a result, the scheme may incorrectly undo retransmission
backoffs wrongly. Chances for this to happen are minuscule, since a timer backoffs. Chances for this to happen are minuscule, since a
particular ICMP message would need to contain the exact sequence particular ICMP message would need to contain the exact sequence
number of the current oldest outstanding segment (SND.UNA), while at number of the current oldest outstanding segment (SND.UNA), while at
the same time TCP is in timeout-based loss recovery. However, two the same time TCP is in timeout-based loss recovery. However, two
"worst case" scenarios for the algorithm are possible: "worst case" scenarios for the algorithm are possible:
For instance, consider a steady state TCP connection, which will be For instance, consider a steady state TCP connection, which will be
disrupted at an intermediate router R due to a link outage. Upon the disrupted at an intermediate router R due to a link outage. Upon the
expiration of the RTO, the TCP sender enters the timeout-based loss expiration of the RTO, the TCP sender enters the timeout-based loss
recovery and starts to retransmit the earliest segment that has not recovery and starts to retransmit the earliest segment that has not
been acknowledged (SND.UNA). For any reason, router R delays all been acknowledged (SND.UNA). For some reason, router R delays all
corresponding ICMP unreachable messages, so that the TCP sender corresponding ICMP unreachable messages so that the TCP sender
backs off the retransmission timer normally without any undoing. At backs off the retransmission timer normally without any undoing. At
the end of the connectivity disruption, the TCP sender eventually the end of the connectivity disruption, the TCP sender eventually
detects the re-establishment, leaves the scheme and finally the detects the re-establishment, leaves the scheme and finally the
timeout-based loss recovery, too. A sequence number wrap-around timeout-based loss recovery, too. A sequence number wrap-around
later, the connectivity between the two peers is disrupted again, but later, the connectivity between the two peers is disrupted again, but
this time due to congestion and exactly at the time at which the this time due to congestion and exactly at the time at which the
current SND.UNA matches the SND.UNA from the previous cycle. If current SND.UNA matches the SND.UNA from the previous cycle. If
router R emits the delayed ICMP unreachable messages now, the TCP router R emits the delayed ICMP unreachable messages now, the TCP
sender would undo retransmission timer backoffs wrongly. As the TCP sender would incorrectly undo retransmission timer backoffs. As the
sequence number contains 32 bits, the probability of this scenario is TCP sequence number contains 32 bits, the probability of this
at most 1/2^32. Given sufficiently many retransmissions in the first scenario is at most 1/2^32. Given sufficiently many retransmissions
timeout-based loss recovery, the corresponding ICMP unreachable in the first timeout-based loss recovery, the corresponding ICMP
messages could reduce the RTO in the second recovery at most to unreachable messages could reduce the RTO in the second recovery at
"RTO_BASE". However, once the ICMP unreachable messages are most to "RTO_BASE". However, once the ICMP unreachable messages are
depleted, the standard exponential backoff will be performed. Thus, depleted, the standard exponential backoff will be performed. Thus,
the congestion response will only be delayed by some false the congestion response will only be delayed by some false
retransmissions. retransmissions.
Similar to the above, consider the case where a steady state TCP Similar to the above, consider the case where a steady state TCP
connection with n segments in-flight will be disrupted at some point connection with n segments in-flight will be disrupted at some point
by an intermediate router R due to a link outage. For each segment due to a link outage by an intermediate router R. For each segment
in-flight, router R may generates an ICMP unreachable message, in-flight, router R may generate an ICMP unreachable message.
however for some reason it delays them. Once the link outage is However, for some reason it delays them. Once the link outage is
over and the connection is re-established, the TCP sender leaves the over and the connection has been re-established, the TCP sender
scheme and slow-starts the connection. Following a sequence number leaves the scheme and slow-starts the connection. Following a
wrap-around, a retransmission timeout occurs, just at the moment the sequence number wrap-around, a retransmission timeout occurs, just at
TCP sender's current window of data reaches the previous range of the the moment the TCP sender's current window of data reaches the
sequence number space again. In case router R emits the delayed ICMP previous range of the sequence number space again. In case router R
unreachable messages now, one spurious undoing of the retransmission emits the delayed ICMP unreachable messages now, one spurious undoing
timer backoff is possible, if firstly the TCP segment number of the retransmission timer backoff is possible, if the TCP segment
contained in ICMP unreachable messages matches the current SND.UNA, number contained in ICMP unreachable messages matches the current
and secondly the timeout was a result of congestion. In the case of SND.UNA, and the timeout was a result of congestion. In the case of
another connectivity disruption, the additional undoing of the another connectivity disruption, the additional undoing of the
retransmission timer backoff has no impact. The probability of this retransmission timer backoff has no impact. The probability of this
scenario is at most n/2^32. scenario is at most n/2^32.
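To give a rough sense of how quickly the sequence space can wrap, and of the bounds stated above, the following Python sketch does the arithmetic. It is not part of the draft. It assumes a sender that transmits payload continuously at the given rate, which is an idealization, and the function names are hypothetical.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits, one per payload byte


def wrap_time_seconds(rate_bits_per_second):
    """Time to consume the whole sequence space at a constant send rate."""
    return SEQ_SPACE * 8 / rate_bits_per_second


def spurious_match_bound(segments_in_flight=1):
    """Upper bound from the text: n/2^32 for n delayed ICMP messages."""
    return segments_in_flight / SEQ_SPACE


gigabit_wrap = wrap_time_seconds(1e9)       # about 34.4 seconds
single_bound = spurious_match_bound()       # 1/2^32
window_bound = spurious_match_bound(100)    # 100/2^32
```

Under these assumptions a sender saturating a 1 Gbit/s path wraps the sequence space in roughly 34 seconds, so an ICMP message delayed by that order of magnitude could refer to a reused sequence number. Even then, the chance that it matches the current SND.UNA remains bounded as described above.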
5.3. Packet Duplication 5.3. Packet Duplication
In the case an intermediate router duplicates packets, a TCP sender In case an intermediate router duplicates packets, a TCP sender may
may receive more ICMP unreachable messages during timeout-based loss receive more ICMP unreachable messages during timeout-based loss
recovery than it actually has sent timeout-based retransmissions. recovery than it actually has sent timeout-based retransmissions.
However, since TCP-LCD keeps track of the number of performed However, since TCP-LCD keeps track of the number of performed
retransmission timer backoffs in the "BACKOFF_CNT" variable, it will retransmission timer backoffs in the "BACKOFF_CNT" variable, it will
not undo more retransmission timer backoffs than were actually not undo more retransmission timer backoffs than were actually
performed. Nevertheless, if packet duplication and congestion performed. Nevertheless, if packet duplication and congestion
coincide on the path between the two communicating hosts, duplicated coincide on the path between the two communicating hosts, duplicated
ICMP messages could hide the congestion loss of some retransmissions ICMP messages could hide the congestion loss of some retransmissions
or ICMP messages and the algorithm may undo retransmission timer or ICMP messages, and the algorithm may incorrectly undo
backoffs wrongly. Considering the overall impact of a router that retransmission timer backoffs. Considering the overall impact of a
duplicates packets, the additional load induced by some spurious router that duplicates packets, the additional load induced by some
timeout-based retransmits can probably be neglected. spurious timeout-based retransmits can probably be neglected.
5.4. Probing Frequency 5.4. Probing Frequency
One could argue that if an ICMP unreachable message arrives for a One could argue that if an ICMP unreachable message arrives for a
timeout-based retransmission, the RTO should be reset or recalculated timeout-based retransmission, the RTO shall be reset or recalculated,
similar to what is done when an ACK arrives during timeout-based loss similar to what is done when an ACK arrives during timeout-based loss
recovery (see Karn's algorithm [KP87], [RFC2988]), and a new recovery (see Karn's algorithm [KP87], [RFC2988]), and a new
retransmission should be sent immediately. Generally, this would retransmission should be sent immediately. Generally, this would
allow for a much higher probing frequency based on the round trip allow for a much higher probing frequency based on the round trip
time up to the router where the connectivity is disrupted. However, time up to the router where connectivity has been disrupted.
we believe the current scheme provides a good trade-off between However, we believe the current scheme provides a good trade-off
conservative behavior and fast detection of connectivity re- between conservative behavior and fast detection of connectivity re-
establishment. establishment.
5.5. Reaction in Steady-State 5.5. Reaction during Connection Establishment
It is possible that a TCP sender enters timeout-based loss recovery
while the connection is in SYN-SENT or SYN-RECEIVED states [RFC0793].
The algorithm described in this document could also be used for
faster connection establishment in networks with connectivity
disruptions. However, because existing TCP implementations [RFC5461]
already interpret ICMP unreachable messages during connection
establishment and abort the corresponding connection, we refrain from
suggesting this.
5.6. Reaction in Steady-State
Another exploitation of ICMP unreachable messages in the context of Another exploitation of ICMP unreachable messages in the context of
TCP congestion control might seem appropriate in case the ICMP TCP congestion control might seem appropriate in case the ICMP
unreachable message is received while TCP is in steady-state and the unreachable message is received while TCP is in steady-state, and the
message refers to a segment from within the current window of data. message refers to a segment from within the current window of data.
As the RTT up to the router, which generates the ICMP unreachable As the RTT up to the router that generated the ICMP unreachable
message is likely to be substantially shorter than the overall RTT to message is likely to be substantially shorter than the overall RTT to
the destination, the ICMP unreachable message may very well reach the the destination, the ICMP unreachable message may very well reach the
originating TCP while it is transmitting the current window of data. originating TCP while it is transmitting the current window of data.
In case the remaining window is large, it might seem appropriate to In case the remaining window is large, it might seem appropriate to
refrain from transmitting the remaining window as there is timely refrain from transmitting the remaining window as there is timely
evidence that it will only trigger further ICMP unreachable messages evidence that it will only trigger further ICMP unreachable messages
at the very router. Although this promises improvement from a at the very router. Although this promises improvement from a
wastage perspective, it may be counterproductive from a security wastage perspective, it may be counterproductive from a security
perspective. An attacker could forge such ICMP messages, thereby perspective. An attacker could forge such ICMP messages, thereby
forcing the originating TCP to stop sending data, very similar to the forcing the originating TCP to stop sending data, very similar to the
blind throughput-reduction attack mentioned in blind throughput-reduction attack mentioned in
[I-D.ietf-tcpm-icmp-attacks]. [I-D.ietf-tcpm-icmp-attacks].
An additional consideration is the following: in the presence of An additional consideration is the following: in the presence of
multi-path routing even the receipt of a legitimate ICMP unreachable multi-path routing even the receipt of a legitimate ICMP unreachable
message cannot be exploited accurately because there is the option message cannot be exploited accurately because there is the option
that only one of the multiple paths to the destination is suffering that only one of the multiple paths to the destination is suffering
from a connectivity disruption, which causes ICMP unreachable from a connectivity disruption, which causes ICMP unreachable
messages to be sent. Then however, there is the possibility that the messages to be sent. Then, however, there is the possibility that
path along which the connectivity disruption occurred contributed the path along which the connectivity disruption occurred contributed
considerably to the overall bandwidth, such that a congestion considerably to the overall bandwidth, such that a congestion
response is very well reasonable. However, this is not necessarily response is very well reasonable. However, this is not necessarily
the case. Therefore, a TCP has no means except for its inherent the case. Therefore, a TCP has no means except for its inherent
congestion control to decide on this matter. All in all, it seems congestion control to decide on this matter. All in all, it seems
that for a connection in steady-state, i.e., not in timeout-based that for a connection in steady-state, i.e., not in timeout-based
loss recovery, reacting to ICMP unreachable messages in regard to loss recovery, reacting to ICMP unreachable messages in regard to
congestion control is not appropriate. For the case of timeout-based congestion control is not appropriate. For the case of timeout-based
retransmissions, however, there is a reasonable congestion response, retransmissions, however, there is a reasonable congestion response,
which is skipping further retransmission timer backoffs because there which is skipping further retransmission timer backoffs because there
is no congestion indication - as described above. is no congestion indication - as described above.
6. Dissolving Ambiguity Issues (the Safe Variant) 6. Dissolving Ambiguity Issues (the Safe Variant)
Given that the TCP Timestamps option [I-D.ietf-tcpm-1323bis] is Given that the TCP Timestamps option [RFC1323] is enabled for a
enabled for a connection, a TCP sender MAY use the following connection, a TCP sender MAY use the following algorithm to dissolve
algorithm to dissolve the ambiguity issues mentioned in Sections 5.1, the ambiguity issues mentioned in Sections 5.1, 5.2, and 5.3. In
5.2, and 5.3. In particular both the retransmission ambiguity and particular, both the retransmission ambiguity and the packet
the packet duplication problems are prevented by the following TCP- duplication problems are prevented by the following TCP-LCD variant.
LCD variant. On the other hand, the false positives caused by On the other hand, the false positives caused by wrapped sequence
wrapped sequence numbers can not be completely avoided, but the numbers cannot be completely avoided, but the likelihood is further
likelihood is further reduced by a factor of 1/2^32 since the reduced by a factor of 1/2^32 since the Timestamp Value field (TSval)
Timestamp Value field (TSval) of the TCP Timestamps Option contains of the TCP Timestamps Option contains 32 bits.
32 bits.
Hence, implementers may choose to implement the TCP-LCD with the Hence, implementers may choose to implement the TCP-LCD with the
following modifications. following modifications.
Step (1) is replaced by step (1'): Step (1) is replaced by step (1'):
(1') Before TCP updates the variable "RTO" when it initiates (1') Before TCP updates the variable "RTO" when it initiates
timeout-based loss recovery, set the variables "BACKOFF_CNT" timeout-based loss recovery, set the variables "BACKOFF_CNT"
and "RTO_BASE" and the data structure "RETRANS_TS" as follows: and "RTO_BASE" and the data structure "RETRANS_TS" as follows:
skipping to change at page 16, line 29 skipping to change at page 16, line 38
(2b) Store the value of the Timestamp Value field (TSval) of the TCP (2b) Store the value of the Timestamp Value field (TSval) of the TCP
Timestamps option included in the retransmission "RET" sent in Timestamps option included in the retransmission "RET" sent in
step (R) into the "RETRANS_TS" data structure: step (R) into the "RETRANS_TS" data structure:
RETRANS_TS.add(RET.TSval) RETRANS_TS.add(RET.TSval)
Step (6) is replaced by step (6'): Step (6) is replaced by step (6'):
(6') If "SEG.SEQ == SND.UNA && RETRANS_TS.exists(SEG.TSval)", i.e., (6') If "SEG.SEQ == SND.UNA && RETRANS_TS.exists(SEG.TSval)", i.e.,
if the TCP segment "SEG" eliciting the ICMP unreachable message if the TCP segment "SEG" eliciting the ICMP unreachable message
"ICMP_DU" carries the sequence number of a retransmission and "ICMP_DU" carries the sequence number of a retransmission, and
the value in its Timestamp Value field (TSval) is valid, then the value in its Timestamp Value field (TSval) is valid, then
proceed to step (7'); proceed to step (7');
else else
proceed to step (3). proceed to step (3).
Step (7) is replaced by step (7'): Step (7) is replaced by step (7'):
(7') Undo the last retransmission timer backoff: (7') Undo the last retransmission timer backoff:
RETRANS_TS.remove(SEG.TSval); RETRANS_TS.remove(SEG.TSval);
BACKOFF_CNT := BACKOFF_CNT - 1; BACKOFF_CNT := BACKOFF_CNT - 1;
RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO). RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO).
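The bookkeeping of steps (2b), (6') and (7') can be sketched as follows. This is only an illustration, not a reference implementation. The initial values set in step (1') and the backoff step itself are not shown in this excerpt, so the sketch assumes that "BACKOFF_CNT" starts at zero, that "RETRANS_TS" starts empty, and that every timeout-based retransmission increments "BACKOFF_CNT" by one. The class name, the method names and the 60 second "MAX_RTO" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SafeLcdState:
    """Per-connection state of the safe variant (illustrative names)."""
    rto_base: float        # RTO_BASE: RTO when timeout-based loss recovery began
    max_rto: float = 60.0  # MAX_RTO: assumed upper bound in seconds
    backoff_cnt: int = 0   # BACKOFF_CNT: assumed to start at zero
    retrans_ts: set = field(default_factory=set)  # RETRANS_TS: assumed empty

    @property
    def rto(self) -> float:
        # RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO)
        return min(self.rto_base * 2 ** self.backoff_cnt, self.max_rto)

    def on_retransmission(self, tsval: int) -> None:
        """Assumed backoff, then step (2b): remember the TSval of "RET"."""
        self.backoff_cnt += 1
        self.retrans_ts.add(tsval)

    def on_icmp_unreachable(self, seg_seq: int, seg_tsval: int,
                            snd_una: int) -> bool:
        """Steps (6') and (7'); returns True if a backoff was undone."""
        # Step (6'): the quoted segment must carry the sequence number of
        # the retransmission and a TSval that was actually sent.
        if seg_seq != snd_una or seg_tsval not in self.retrans_ts:
            return False
        # Step (7'): undo the last retransmission timer backoff.
        self.retrans_ts.remove(seg_tsval)
        self.backoff_cnt -= 1
        return True
```

Because each stored TSval is removed once it has been matched, a duplicated ICMP unreachable message that quotes the same retransmission does not undo a second backoff in this sketch.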
The downside of the safe variant is twofold. Firstly, the The downside of the safe variant is twofold. Firstly, the
modifications come at a cost: the TCP sender is required to store the modifications come at a cost: the TCP sender is required to store the
timestamps of all retransmissions sent during one timeout-based loss timestamps of all retransmissions sent during one timeout-based loss
recovery. Secondly, the safe variant can only undo a retransmission recovery. Second, the safe variant can only undo a retransmission
timer backoff, if the intermediate router experiencing the link timer backoff if the intermediate router experiencing the link outage
outage implements [RFC1812] and chooses to include as many more than implements [RFC1812] and chooses to include as many more than the
the first 64 bits of the payload of the triggering datagram, as are first 64 bits of the payload of the triggering datagram, as are
needed to include the TCP Timestamps option in the ICMP unreachable needed to include the TCP Timestamps option in the ICMP unreachable
message. message.
7. Interoperability Issues 7. Interoperability Issues
This section discusses interoperability issues related to introducing This section discusses interoperability issues related to introducing
TCP-LCD. TCP-LCD.
7.1. Detection of TCP Connection Failures 7.1. Detection of TCP Connection Failures
TCP-LCD may have side-effects on TCP implementations, which attempt TCP-LCD may have side-effects on TCP implementations that attempt to
to detect TCP connection failures by counting timeout-based detect TCP connection failures by counting timeout-based
retransmissions. RFC 1122 [RFC1122] states in Section 4.2.3.5 that a retransmissions. RFC 1122 [RFC1122] states in Section 4.2.3.5 that a
TCP host must handle excessive retransmissions of data segments with TCP host must handle excessive retransmissions of data segments with
two thresholds R1 and R2 measuring the amount of retransmission that two thresholds R1 and R2 measuring the number of retransmissions that
has occurred for the same segment. Both thresholds might either be have occurred for the same segment. Both thresholds might either be
measured in time units or as a count of retransmissions. measured in time units or as a count of retransmissions.
Due to TCP-LCD's revert strategy of the retransmission timer, the Due to TCP-LCD's revert strategy of the retransmission timer, the
assumption that a certain number of retransmissions corresponds to a assumption that a certain number of retransmissions corresponds to a
specific time interval no longer holds true, as additional specific time interval no longer holds, as additional retransmissions
retransmissions may be performed during timeout-based-loss recovery may be performed during timeout-based-loss recovery to detect the end
to detect the end of the connectivity disruption. Therefore, a TCP of the connectivity disruption. Therefore, a TCP employing TCP-LCD
employing TCP-LCD either SHOULD measure the thresholds R1 and R2 in either SHOULD measure the thresholds R1 and R2 in time units or, in
time units or in case R1 and R2 are counters of retransmissions case R1 and R2 are counters of retransmissions, SHOULD convert them
SHOULD convert them into time intervals, which correspond to the time into time intervals, which correspond to the time an unmodified TCP
an unmodified TCP would need to reach the specified number of would need to reach the specified number of retransmissions.
retransmissions.
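One possible reading of this conversion is sketched below. It assumes that an unmodified TCP doubles its RTO on every timeout and caps it at "MAX_RTO", and it interprets the threshold as the time that elapses until the given number of retransmissions has been sent. The function name and the 60 second cap are hypothetical; an implementation may choose a different interpretation.

```python
def retransmissions_to_seconds(count: int, rto: float,
                               max_rto: float = 60.0) -> float:
    """Time an unmodified TCP needs to send `count` retransmissions.

    The i-th retransmission (counting from 0) is sent after waiting
    min(rto * 2^i, max_rto) seconds since the previous transmission.
    """
    total = 0.0
    for i in range(count):
        total += min(rto * 2 ** i, max_rto)
    return total
```

For example, with an RTO of 1 second a threshold of 3 retransmissions corresponds to 1 + 2 + 4 = 7 seconds under these assumptions.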
7.2. Explicit Congestion Notification 7.2. Explicit Congestion Notification
By the use of Explicit Congestion Notification (ECN) [RFC3168] ECN- By the use of Explicit Congestion Notification (ECN) [RFC3168] ECN-
capable routers are no longer limited to dropping packets as capable routers are no longer limited to dropping packets as
congestion indication. Instead they can set the Congestion congestion indication. Instead, they can set the Congestion
Experienced (CE) codepoint in the IP header of packets to indicate Experienced (CE) codepoint in the IP header to indicate congestion.
congestion. Concerning TCP-LCD there is the option that during a
connectivity disruption a received ICMP unreachable message has been With TCP-LCD it may happen that during a connectivity disruption a
elicited by a timeout-based retransmission that was marked with the received ICMP unreachable message has been elicited by a timeout-
CE codepoint before reaching the router experiencing the link outage. based retransmission that was marked with the CE codepoint before
In such a case, we suggest in the case the algorithm undoes a reaching the router experiencing the link outage. In such a case, we
retransmission timer backoff, the TCP sender SHOULD additionally suggest that the TCP sender SHOULD additionally reset the
reset the retransmission timer. retransmission timer in case the algorithm undoes a retransmission
timer backoff.
7.3. ICMP for IP version 6 7.3. ICMP for IP version 6
RFC 4443 [RFC4443] specifies the Internet Control Message Protocol RFC 4443 [RFC4443] specifies the Internet Control Message Protocol
(ICMPv6) to be used with the Internet Protocol version 6 (IPv6) (ICMPv6) to be used with the Internet Protocol version 6 (IPv6)
[RFC2460]. From TCP-LCD's point of view, it is important to notice [RFC2460]. From TCP-LCD's point of view, it is important to notice
that for IPv6, the payload of an ICMPv6 error message has to include that for IPv6, the payload of an ICMPv6 error message has to include
as many bytes from the IPv6 datagram that elicited the ICMPv6 error as many bytes as possible from the IPv6 datagram that elicited the
message as possible without making the error message exceed the ICMPv6 error message, without making the error message exceed the
minimum IPv6 MTU (1280 bytes) [RFC4443]. Thus, more information is minimum IPv6 MTU (1280 bytes) [RFC4443]. Thus, more information is
available for TCP-LCD than in the case of IPv4. available for TCP-LCD than in the case of IPv4.
The counterpart of the ICMPv4 destination unreachable message of code The counterpart of the ICMPv4 destination unreachable message of code
0 (net unreachable) and of code 1 (host unreachable) is the ICMPv6 0 (net unreachable) and of code 1 (host unreachable) is the ICMPv6
destination unreachable message of code 0 (no route to destination) destination unreachable message of code 0 (no route to destination)
[RFC4443]. Like the IPv4 case, a router should generate an ICMPv6 [RFC4443]. As with IPv4, a router should generate an ICMPv6
destination unreachable message of code 0 in response to a packet destination unreachable message of code 0 in response to a packet
that cannot be delivered to its destination address because it lacks that cannot be delivered to its destination address because it lacks
a matching entry in its routing table. As a result, TCP-LCD can a matching entry in its routing table. As a result, TCP-LCD can
employ this ICMPv6 error message as connectivity disruption employ this ICMPv6 error message as connectivity disruption
indication, too. indication, too.
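The mapping described above can be sketched as a small classification function. The type values are those of the destination unreachable message in RFC 792 (type 3) and RFC 4443 (type 1). The function name and its signature are hypothetical and only serve as an illustration.

```python
ICMPV4_DEST_UNREACHABLE = 3  # ICMPv4 destination unreachable (RFC 792)
ICMPV6_DEST_UNREACHABLE = 1  # ICMPv6 destination unreachable (RFC 4443)


def is_disruption_indication(ip_version: int, icmp_type: int,
                             icmp_code: int) -> bool:
    """Return True if the ICMP message may indicate a connectivity disruption."""
    if ip_version == 4:
        # Code 0 (net unreachable) or code 1 (host unreachable).
        return icmp_type == ICMPV4_DEST_UNREACHABLE and icmp_code in (0, 1)
    if ip_version == 6:
        # Code 0 (no route to destination).
        return icmp_type == ICMPV6_DEST_UNREACHABLE and icmp_code == 0
    return False
```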
7.4. TCP-LCD and IP Tunnels 7.4. TCP-LCD and IP Tunnels
It is worth noting that IP tunnels, including IPsec [RFC4301], IP in It is worth noting that IP tunnels, including IPsec [RFC4301], IP in
IP [RFC2003], Generic Routing Encapsulation (GRE) [RFC2784], and IP [RFC2003], Generic Routing Encapsulation (GRE) [RFC2784], and
others are compatible with TCP-LCD, as long as the received ICMP others are compatible with TCP-LCD, as long as the received ICMP
unreachable messages can be demultiplexed and extracted appropriately unreachable messages can be demultiplexed and extracted appropriately
by the TCP sender during timeout-based loss recovery. by the TCP sender during timeout-based loss recovery.
If for example end-to-end tunnels like IPSec in transport mode If, for example, end-to-end tunnels like IPSec in transport mode
[RFC4301] are employed, a TCP sender may receive ICMP unreachable [RFC4301] are employed, a TCP sender may receive ICMP unreachable
messages, where additional steps, e.g., decrypting in step (5) of the messages where additional steps, e.g., decrypting in step (5) of the
algorithm is needed to extract the TCP header from these ICMP algorithm, are needed to extract the TCP header from these ICMP
messages. Provided that the received ICMP unreachable message messages. Provided that the received ICMP unreachable message
contains enough information, i.e., SEG.SEQ is extractable, this contains enough information, i.e., SEG.SEQ is extractable, this
information MAY still be used as a valid input for the proposed information MAY still be used as a valid input for the proposed
algorithm. algorithm.
Likewise, if IP encapsulation like [RFC2003] is used in some part of Likewise, if IP encapsulation like [RFC2003] is used in some part of
the path between the communicating hosts, instead of the TCP sender, the path between the communicating hosts, the tunnel ingress node may
the tunnel ingress node may receive the ICMP unreachable messages receive the ICMP unreachable messages from an intermediate router
from an intermediate router experiencing the link outage. experiencing the link outage. Nevertheless, the tunnel ingress node
Nevertheless, the tunnel ingress node may replay the ICMP unreachable may replay the ICMP unreachable messages in order to inform the TCP
messages in order to inform the TCP sender. If enough information is sender. If enough information is preserved to extract SEG.SEQ, the
preserved to extract SEG.SEQ, the replayed ICMP unreachable messages replayed ICMP unreachable messages MAY still be used in TCP-LCD.
MAY still be used in TCP-LCD.
8. Related Work 8. Related Work
In literature there are several methods that address TCP's problems Several methods that address TCP's problems in the presence of
in the presence of connectivity disruptions. Some of them try to connectivity disruptions have been proposed in literature. Some of
improve TCP's performance by modifying lower layers. For example them try to improve TCP's performance by modifying lower layers. For
[SM03] introduces a "smart link layer", which buffers one segment for example [SM03] introduces a "smart link layer", which buffers one
each active connection and replaying these segments on connectivity segment for each active connection and replays these segments upon
re-establishment. This approach has a serious drawback: previously connectivity re-establishment. This approach has a serious drawback:
stateless intermediate routers have to be modified in order to previously stateless intermediate routers have to be modified in
inspect TCP headers, to track the end-to-end connection and to order to inspect TCP headers, to track the end-to-end connection, and
provide additional buffer space. These lead all in all to an to provide additional buffer space. This leads to an additional need
additional need of memory and processing power. of memory and processing power.
On the other hand stateless link layer schemes, like proposed in On the other hand, stateless link layer schemes, as proposed in
[RFC3819], which unconditionally buffer some small number of packets [RFC3819], which unconditionally buffer some small number of packets
may have another problem: if a packet is buffered longer than the may have another problem: if a packet is buffered longer than the
maximum segment lifetime (MSL) of 2 min [RFC0793], i.e., the maximum segment lifetime (MSL) of 2 min [RFC0793], i.e., the
disconnection lasts longer than MSL, TCP's assumption that such disconnection lasts longer than MSL, TCP's assumption that such
segments will never be received will no longer be true, violating segments will never be received will no longer be true, violating
TCP's semantics [I-D.eggert-tcpm-tcp-retransmit-now]. TCP's semantics [I-D.eggert-tcpm-tcp-retransmit-now].
Other approaches like TCP-F [CRVP01] or the Explicit Link Failure Other approaches, like TCP-F [CRVP01] or the Explicit Link Failure
Notification (ELFN) [HV02] inform a TCP sender about a disrupted path Notification (ELFN) [HV02] inform a TCP sender about a disrupted path
by special messages generated and sent from intermediate routers. In by special messages generated and sent from intermediate routers. In
case of a link failure the TCP sender stops sending segments and case of a link failure the TCP sender stops sending segments and
freezes its retransmission timers. TCP-F stays in this state and freezes its retransmission timers. TCP-F stays in this state and
remains silent until either a "route establishment notification" is remains silent until either a "route establishment notification" is
received or an internal timer expires. In contrast, ELFN received or an internal timer expires. In contrast, ELFN
periodically probes the network to detect connectivity re- periodically probes the network to detect connectivity re-
establishment. Both proposals rely on changes to intermediate establishment. Both proposals rely on changes to intermediate
routers, whereas the scheme proposed in this document is a sender- routers, whereas the scheme proposed in this document is a sender-
only modification. Moreover, ELFN does not consider congestion and only modification. Moreover, ELFN does not consider congestion and
may impose serious additional load on the network, depending on the may impose serious additional load on the network, depending on the
probe interval. probe interval.
The authors of ATCP [LS01] propose enhancements to identify different The authors of ATCP [LS01] propose enhancements to identify different
types of packet loss by introducing a layer between TCP and IP. They types of packet loss by introducing a layer between TCP and IP. They
utilize ICMP destination unreachable messages to set TCP's receiver utilize ICMP destination unreachable messages to set TCP's receiver
advertised window to zero and thus forcing the TCP sender to perform advertised window to zero, thus forcing the TCP sender to perform
zero window probing with an exponential backoff. ICMP destination zero window probing with an exponential backoff. ICMP destination
unreachable messages, which arrive during this probing period, are unreachable messages that arrive during this probing period are
ignored. This approach is nearly orthogonal to this document, which ignored. This approach is nearly orthogonal to this document, which
exploits ICMP messages to undo a retransmission timer backoff when exploits ICMP messages to undo a retransmission timer backoff when
TCP is already probing. In principle both mechanisms could be TCP is already probing. In principle, both mechanisms could be
combined, however, due to security considerations it does not seem combined. However, due to security considerations it does not seem
appropriate to adopt ATCP's reaction as discussed in Section 5.5. appropriate to adopt ATCP's reaction as discussed in Section 5.6.
Schuetz et al. describe in [I-D.schuetz-tcpm-tcp-rlci] a set of TCP Schuetz et al. describe, in [I-D.schuetz-tcpm-tcp-rlci], a set of TCP
extensions that improve TCP's behavior when transmitting over paths extensions that improve TCP's behavior when transmitting over paths
whose characteristics can change on short time-scales. Their whose characteristics can change rapidly. Their proposed extensions
proposed extensions modify the local behavior of TCP and introduce a modify the local behavior of TCP and introduce a new TCP option to
new TCP option to signal locally received connectivity-change signal locally received connectivity-change indications (CCIs) to
indications (CCIs) to remote peers. Upon reception of a CCI, they remote peers. Upon receipt of a CCI, they re-probe the path
re-probe the path characteristics either by performing a speculative characteristics either by performing a speculative retransmission or
retransmission or by sending a single segment of new data, depending by sending a single segment of new data, depending on whether the
on whether the connection is currently stalled in exponential backoff connection is currently stalled in exponential backoff or
or transmitting in steady-state, respectively. The authors focus on transmitting in steady-state, respectively. The authors focus on
specifying TCP response mechanisms, nevertheless underlying layers specifying TCP response mechanisms, nevertheless underlying layers
would have to be modified to explicitly send CCIs to make these would have to be modified to explicitly send CCIs to make these
immediate responses possible. immediate responses possible.
9. IANA Considerations 9. IANA Considerations
This memo includes no request to IANA. This memo includes no request to IANA.
10. Security Considerations 10. Security Considerations
The algorithm proposed in this document is considered to be secure. The algorithm proposed in this document is considered to be secure.
For example an attacker, who already guessed the correct four-tuple For example, an attacker who already guessed the correct four-tuple
(i.e., Source IP Address, Source TCP port, Destination IP Address, (i.e., Source IP Address, Source TCP port, Destination IP Address,
and Destination TCP port), still cannot make a TCP modified with and Destination TCP port), still cannot make a TCP modified with
TCP-LCD flood the network just by sending forged ICMP unreachable TCP-LCD flood the network just by sending forged ICMP unreachable
messages in an attempt to maliciously shorten the retransmission messages in an attempt to maliciously shorten the retransmission
timer. The attacker additionally would need to guess the correct timer. The attacker additionally would need to guess the correct
segment sequence number of the current timeout-based retransmission, segment sequence number of the current timeout-based retransmission,
with a probability of at most 1/2^32. Even in the case of man-in- with a probability of at most 1/2^32. Even in the case of man-in-
the-middle attacks, i.e., attacks performed in scenarios in which the the-middle attacks, i.e., attacks performed in scenarios in which the
attacker can sniff the retransmissions, the impact on network load is attacker can sniff the retransmissions, the impact on network load is
considered to be low, since the retransmission frequency is limited considered to be low, since the retransmission frequency is limited
by the RTO that was computed before TCP has entered the timeout-based by the RTO that was computed before TCP had entered the timeout-based
loss recovery. Hence, the highest probing frequency is expected to loss recovery. Hence, the highest probing frequency is expected to
be even lower than once per minimum RTO, i.e. 1s as specified by be even lower than once per minimum RTO, i.e. 1s as specified by
[RFC2988]. [RFC2988].
11. Acknowledgments 11. Acknowledgments
We would like to thank Ilpo Jarvinen, Pasi Sarolahti, Timothy We would like to thank Kai Jakobs, Ilpo Jarvinen, Pasi Sarolahti,
Shepard, Joe Touch and Carsten Wolff for feedback on earlier versions Timothy Shepard, Joe Touch and Carsten Wolff for feedback on earlier
of this document. We also thank Michael Faber, Daniel Schaffrath, versions of this document. We also thank Michael Faber, Daniel
and Damian Lukowski for implementing and testing the algorithm in Schaffrath, and Damian Lukowski for implementing and testing the
Linux. Special thanks go to Ilpo Jarvinen for giving valuable algorithm in Linux. Special thanks go to Ilpo Jarvinen for giving
feedback regarding the Linux implementation. valuable feedback regarding the Linux implementation.
This work has been supported by the German National Science This work has been supported by the German National Science
Foundation (DFG) within the research excellence cluster Ultra High- Foundation (DFG) within the research excellence cluster Ultra High-
Speed Mobile Information and Communication (UMIC), RWTH Aachen Speed Mobile Information and Communication (UMIC), RWTH Aachen
University. University.
12. References 12. References
12.1. Normative References 12.1. Normative References
[I-D.ietf-tcpm-1323bis]
Borman, D., Braden, R., and V. Jacobson, "TCP Extensions
for High Performance", draft-ietf-tcpm-1323bis-01 (work in
progress), March 2009.
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, September 1981. RFC 792, September 1981.
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, September 1981. RFC 793, September 1981.
[RFC1323] Jacobson, V., Braden, B., and D. Borman, "TCP Extensions
for High Performance", RFC 1323, May 1992.
[RFC1812] Baker, F., "Requirements for IP Version 4 Routers", [RFC1812] Baker, F., "Requirements for IP Version 4 Routers",
RFC 1812, June 1995. RFC 1812, June 1995.
[RFC2988] Paxson, V. and M. Allman, "Computing TCP's Retransmission [RFC2988] Paxson, V. and M. Allman, "Computing TCP's Retransmission
Timer", RFC 2988, November 2000. Timer", RFC 2988, November 2000.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009. Control", RFC 5681, September 2009.
12.2. Informative References 12.2. Informative References
skipping to change at page 21, line 47 skipping to change at page 22, line 12
over mobile ad hoc networks", Wireless Networks vol. 8, over mobile ad hoc networks", Wireless Networks vol. 8,
no. 2-3, pp. 275-288, March 2002. no. 2-3, pp. 275-288, March 2002.
[I-D.eggert-tcpm-tcp-retransmit-now] [I-D.eggert-tcpm-tcp-retransmit-now]
Eggert, L., "TCP Extensions for Immediate Eggert, L., "TCP Extensions for Immediate
Retransmissions", draft-eggert-tcpm-tcp-retransmit-now-02 Retransmissions", draft-eggert-tcpm-tcp-retransmit-now-02
(work in progress), June 2005. (work in progress), June 2005.
[I-D.ietf-tcpm-icmp-attacks] [I-D.ietf-tcpm-icmp-attacks]
Gont, F., "ICMP attacks against TCP", Gont, F., "ICMP attacks against TCP",
draft-ietf-tcpm-icmp-attacks-06 (work in progress), draft-ietf-tcpm-icmp-attacks-12 (work in progress),
August 2009. March 2010.
[I-D.schuetz-tcpm-tcp-rlci] [I-D.schuetz-tcpm-tcp-rlci]
Schuetz, S., Koutsianas, N., Eggert, L., Eddy, W., Swami, Schuetz, S., Koutsianas, N., Eggert, L., Eddy, W., Swami,
Y., and K. Le, "TCP Response to Lower-Layer Connectivity- Y., and K. Le, "TCP Response to Lower-Layer Connectivity-
Change Indications", draft-schuetz-tcpm-tcp-rlci-03 (work Change Indications", draft-schuetz-tcpm-tcp-rlci-03 (work
in progress), February 2008. in progress), February 2008.
[KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
Estimates in Reliable Transport Protocols", Proceedings of Estimates in Reliable Transport Protocols", Proceedings of
the Conference on Applications, Technologies, the Conference on Applications, Technologies,
skipping to change at page 23, line 20 skipping to change at page 23, line 32
[RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm [RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm
for TCP", RFC 4015, February 2005. for TCP", RFC 4015, February 2005.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, December 2005. Internet Protocol", RFC 4301, December 2005.
[RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control [RFC4443] Conta, A., Deering, S., and M. Gupta, "Internet Control
Message Protocol (ICMPv6) for the Internet Protocol Message Protocol (ICMPv6) for the Internet Protocol
Version 6 (IPv6) Specification", RFC 4443, March 2006. Version 6 (IPv6) Specification", RFC 4443, March 2006.
[RFC5461] Gont, F., "TCP's Reaction to Soft Errors", RFC 5461,
February 2009.
[RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
"Forward RTO-Recovery (F-RTO): An Algorithm for Detecting "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
Spurious Retransmission Timeouts with TCP", RFC 5682, Spurious Retransmission Timeouts with TCP", RFC 5682,
September 2009. September 2009.
[SESB05] Schuetz, S., Eggert, L., Schmid, S., and M. Brunner, [SESB05] Schuetz, S., Eggert, L., Schmid, S., and M. Brunner,
"Protocol enhancements for intermittently connected "Protocol enhancements for intermittently connected
hosts", SIGCOMM Computer Communication Review vol. 35, no. hosts", SIGCOMM Computer Communication Review vol. 35, no.
3, pp. 5-18, December 2005. 3, pp. 5-18, December 2005.
skipping to change at page 23, line 42 skipping to change at page 24, line 9
Communication Review vol. 33, no. 5, pp. 31-42, Communication Review vol. 33, no. 5, pp. 31-42,
October 2003. October 2003.
[Zh86] Zhang, L., "Why TCP Timers Don't Work Well", Proceedings [Zh86] Zhang, L., "Why TCP Timers Don't Work Well", Proceedings
of the Conference on Applications, Technologies, of the Conference on Applications, Technologies,
Architectures, and Protocols for Computer Communication Architectures, and Protocols for Computer Communication
(SIGCOMM'86) pp. 397-405, August 1986. (SIGCOMM'86) pp. 397-405, August 1986.
Appendix A. Changes from previous versions of the draft Appendix A. Changes from previous versions of the draft
A.1. Changes from draft-zimmermann-tcp-lcd-02 A.1. Changes from draft-ietf-tcpm-tcp-lcd-00
o Editorial changes.
o Clarified TCP-LCD's behaviour during connection establishment
(Thanks to Mark Handley).
A.2. Changes from draft-zimmermann-tcp-lcd-02
o Incorporated feedback submitted by Ilpo Jarvinen. o Incorporated feedback submitted by Ilpo Jarvinen.
<http://www.ietf.org/mail-archive/web/tcpm/current/msg04841.html> <http://www.ietf.org/mail-archive/web/tcpm/current/msg04841.html>
o Incorporated feedback submitted by Pasi Sarolahti. o Incorporated feedback submitted by Pasi Sarolahti.
<http://www.ietf.org/mail-archive/web/tcpm/current/msg04870.html> <http://www.ietf.org/mail-archive/web/tcpm/current/msg04870.html>
o Incorporated feedback submitted by Joe Touch. o Incorporated feedback submitted by Joe Touch.
<http://www.ietf.org/mail-archive/web/tcpm/current/msg04895.html> <http://www.ietf.org/mail-archive/web/tcpm/current/msg04895.html>
<http://www.ietf.org/mail-archive/web/tcpm/current/msg04900.html> <http://www.ietf.org/mail-archive/web/tcpm/current/msg04900.html>
skipping to change at page 24, line 31 skipping to change at page 25, line 8
subsection anymore. Moreover, the section was renamed to subsection anymore. Moreover, the section was renamed to
"Dissolving Ambiguity Issues" and has now real content. "Dissolving Ambiguity Issues" and has now real content.
o An interoperability issues section (Section 7) was added. In o An interoperability issues section (Section 7) was added. In
particular comments to ECN, ICMPv6, and to the two thresholds R1 particular comments to ECN, ICMPv6, and to the two thresholds R1
and R2 of [RFC1122] (Section 4.2.3.5) were added. and R2 of [RFC1122] (Section 4.2.3.5) were added.
o Miscellaneous editorial changes. In particular, the algorithm has o Miscellaneous editorial changes. In particular, the algorithm has
a name now: TCP-LCD. a name now: TCP-LCD.
A.2. Changes from draft-zimmermann-tcp-lcd-01 A.3. Changes from draft-zimmermann-tcp-lcd-01
o The algorithm in Section 4.2 was slightly changed. Instead of o The algorithm in Section 4.2 was slightly changed. Instead of
reverting the last retransmission timer backoff by halving the reverting the last retransmission timer backoff by halving the
RTO, the RTO is recalculated with help of the "BACKOFF_CNT" RTO, the RTO is recalculated with help of the "BACKOFF_CNT"
variable. This fixes an issue that occurred when the variable. This fixes an issue that occurred when the
retransmission timer was backed off but bounded by a maximum retransmission timer was backed off but bounded by a maximum
value. The algorithm in the previous version of the draft, would value. The algorithm in the previous version of the draft, would
have "reverted" to half of that maximum value, instead of using have "reverted" to half of that maximum value, instead of using
the value, before the RTO was doubled (and then bounded). the value, before the RTO was doubled (and then bounded).
o Miscellaneous editorial changes. o Miscellaneous editorial changes.
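The difference between the two revert strategies can be illustrated with a small sketch. The function names, the base RTO of 3 seconds and the 60 second maximum are assumptions chosen for the example only.

```python
def bounded_rto(rto_base: float, backoff_cnt: int, max_rto: float) -> float:
    """RTO := min(RTO_BASE * 2^(BACKOFF_CNT), MAX_RTO)."""
    return min(rto_base * 2 ** backoff_cnt, max_rto)


def undo_by_halving(current_rto: float) -> float:
    """Earlier approach: revert the last backoff by halving the RTO."""
    return current_rto / 2


def undo_by_recalculation(rto_base: float, backoff_cnt: int,
                          max_rto: float) -> float:
    """Current approach: recompute the RTO from the decremented counter."""
    return bounded_rto(rto_base, backoff_cnt - 1, max_rto)


# After six backoffs the unbounded value would be 3 * 64 = 192 s, so the
# RTO is bounded to 60 s. It was already bounded to 60 s after five.
current = bounded_rto(3.0, 6, 60.0)
halved = undo_by_halving(current)                  # half of the maximum
recalculated = undo_by_recalculation(3.0, 6, 60.0)  # value before doubling
```

In this example halving yields 30 seconds, whereas recalculation yields the 60 seconds that were in effect before the last backoff.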
A.3. Changes from draft-zimmermann-tcp-lcd-00 A.4. Changes from draft-zimmermann-tcp-lcd-00
o Miscellaneous editorial changes in Section 1, 2 and 3. o Miscellaneous editorial changes in Section 1, 2 and 3.
o The document was restructured in Section 1, 2 and 3 for easier o The document was restructured in Section 1, 2 and 3 for easier
reading. The motivation for the algorithm is changed according reading. The motivation for the algorithm is changed according
TCP's problem to disambiguate congestion from non-congestion loss. TCP's problem to disambiguate congestion from non-congestion loss.
o Added Section 4.1. o Added Section 4.1.
o The algorithm in Section 4.2 was restructured and simplified: o The algorithm in Section 4.2 was restructured and simplified:
 End of changes. 95 change blocks. 
352 lines changed or deleted 363 lines changed or added
