draft-ietf-tcpm-ecnsyn-03.txt   draft-ietf-tcpm-ecnsyn-04.txt 
Internet Engineering Task Force A. Kuzmanovic Internet Engineering Task Force A. Kuzmanovic
INTERNET-DRAFT A. Mondal INTERNET-DRAFT A. Mondal
Intended status: Proposed Standard Northwestern University Intended status: Proposed Standard Northwestern University
Expires: 18 May 2008 S. Floyd Expires: 8 July 2008 S. Floyd
ICIR ICIR
K.K. Ramakrishnan K.K. Ramakrishnan
AT&T AT&T
18 November 2007 8 January 2008
Adding Explicit Congestion Notification (ECN) Capability Adding Explicit Congestion Notification (ECN) Capability
to TCP's SYN/ACK Packets to TCP's SYN/ACK Packets
draft-ietf-tcpm-ecnsyn-03.txt draft-ietf-tcpm-ecnsyn-04.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 2, line 22 skipping to change at page 2, line 22
timeout, this document specifies the use of ECN for the SYN/ACK timeout, this document specifies the use of ECN for the SYN/ACK
packet itself, when sent in response to a SYN packet with the two ECN packet itself, when sent in response to a SYN packet with the two ECN
flags set in the TCP header, indicating a willingness to use ECN. flags set in the TCP header, indicating a willingness to use ECN.
Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to
the TCP connection, avoiding the severe penalty of a retransmit the TCP connection, avoiding the severe penalty of a retransmit
timeout for a connection that has not yet started placing a load on timeout for a connection that has not yet started placing a load on
the network. The sender of the SYN/ACK packet must respond to a the network. The sender of the SYN/ACK packet must respond to a
report of an ECN-marked SYN/ACK packet by reducing its initial report of an ECN-marked SYN/ACK packet by reducing its initial
congestion window from two, three, or four segments to one segment, congestion window from two, three, or four segments to one segment,
thereby reducing the subsequent load from that connection on the thereby reducing the subsequent load from that connection on the
network. network. This document is intended to update RFC 3168.
Table of Contents Table of Contents
1. Introduction ....................................................4 1. Introduction ....................................................4
2. Conventions .....................................................5 2. Conventions and Terminology .....................................6
3. Proposal ........................................................6 3. Proposal ........................................................6
4. Discussion ......................................................9 4. Discussion ......................................................9
5. Related Work ...................................................12 5. Related Work ...................................................12
6. Performance Evaluation .........................................13 6. Performance Evaluation .........................................13
6.1. The Costs and Benefit of Adding ECN-Capability ............13 6.1. The Costs and Benefit of Adding ECN-Capability ............13
6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK
Packets ........................................................14 Packets ........................................................14
7. Security Considerations ........................................15 7. Security Considerations ........................................15
8. Conclusions ....................................................16 8. Conclusions ....................................................16
9. Acknowledgements ...............................................17 9. Acknowledgements ...............................................17
A. Report on Simulations ..........................................17 A. Report on Simulations ..........................................17
A.1. Simulations with RED in Packet Mode .......................18 A.1. Simulations with RED in Packet Mode .......................17
A.2. Simulations with RED in Byte Mode .........................19 A.2. Simulations with RED in Byte Mode .........................19
Normative References ..............................................20 B. Issues of Incremental Deployment ...............................20
Informative References ............................................20 Normative References ..............................................23
IANA Considerations ...............................................22 Informative References ............................................23
Full Copyright Statement ..........................................22 IANA Considerations ...............................................24
Intellectual Property .............................................23 Full Copyright Statement ..........................................25
Intellectual Property .............................................25
NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION.
Changes from draft-ietf-tcpm-ecnsyn-03:
* General editing. This includes using the terms "initiator"
and "responder" for the two ends of the TCP connection.
Feedback from Alfred Hoenes.
* Added some text to the backwards compatibility discussion,
now in Appendix B, about the pros and cons of using a TCP
flag for the TCP initiator to signal that it understands
ECN-Capable SYN/ACK packets. The consensus at this time is
not to use such a flag. Also added a recommendation that
TCP implementations include a management interface to turn
off the use of ECN for SYN/ACK packets. From email from
Bob Briscoe.
Changes from draft-ietf-tcpm-ecnsyn-02: Changes from draft-ietf-tcpm-ecnsyn-02:
* Added to the discussion in the Security section of whether * Added to the discussion in the Security section of whether
ECN-Capable TCP SYN packets have problems with firewalls, ECN-Capable TCP SYN packets have problems with firewalls,
over and above the known problems of TCP data packets over and above the known problems of TCP data packets
(e.g., as in the Microsoft report). From a question raised (e.g., as in the Microsoft report). From a question raised
at the TCPM meeting at the July 2007 IETF. at the TCPM meeting at the July 2007 IETF.
* Added a sentence to the discussion of routers or middleboxes that * Added a sentence to the discussion of routers or middleboxes that
*might* drop TCP SYN packets on the basis of IP header fields. *might* drop TCP SYN packets on the basis of IP header fields.
Feedback from Remi Denis-Courmont. Feedback from Remi Denis-Courmont.
* General editing. Feedback from Alfred Henes. * General editing. Feedback from Alfred Hoenes.
Changes from draft-ietf-tcpm-ecnsyn-01: Changes from draft-ietf-tcpm-ecnsyn-01:
* Changes in response to feedback from Anil Agarwal. * Changes in response to feedback from Anil Agarwal.
* Added a look at the costs of adding ECN-Capability to * Added a look at the costs of adding ECN-Capability to
SYN/ACKs in a highly-congested scenario. SYN/ACKs in a highly-congested scenario.
From feedback from Mark Allman and Janardhan Iyengar. From feedback from Mark Allman and Janardhan Iyengar.
* Added a comparative evaluation of two possible responses * Added a comparative evaluation of two possible responses
skipping to change at page 5, line 8 skipping to change at page 5, line 24
congestion while avoiding unnecessary retransmissions and, in some congestion while avoiding unnecessary retransmissions and, in some
cases, unnecessary retransmit timeouts. Thus, using ECN has several cases, unnecessary retransmit timeouts. Thus, using ECN has several
benefits: benefits:
1) For short transfers, a TCP connection's congestion window may be 1) For short transfers, a TCP connection's congestion window may be
small. For example, if the current window contains only one packet, small. For example, if the current window contains only one packet,
and that packet is dropped, TCP will have to wait for a retransmit and that packet is dropped, TCP will have to wait for a retransmit
timeout to recover, reducing its overall throughput. Similarly, if timeout to recover, reducing its overall throughput. Similarly, if
the current window contains only a few packets and one of those the current window contains only a few packets and one of those
packets is dropped, there might not be enough duplicate packets is dropped, there might not be enough duplicate
acknowledgements for a fast retransmission, and the sender might have acknowledgements for a fast retransmission, and the sender of the
to wait for a delay of several round-trip times using Limited data packet might have to wait for a delay of several round-trip
Transmit [RFC3042]. With the use of ECN, short flows are less likely times using Limited Transmit [RFC3042]. With the use of ECN, short
to have packets dropped, sometimes avoiding unnecessary delays or flows are less likely to have packets dropped, sometimes avoiding
costly retransit timeouts. unnecessary delays or costly retransmit timeouts.
2) While longer flows may not see substantially improved throughput 2) While longer flows may not see substantially improved throughput
with the use of ECN, they experience lower loss. This may benefit TCP with the use of ECN, they experience lower loss. This may benefit TCP
applications that are latency- and loss-sensitive, because of the applications that are latency- and loss-sensitive, because of the
avoidance of retransmissions. avoidance of retransmissions.
RFC 3168 only specifies marking the Congestion Experienced codepoint RFC 3168 only specifies marking the Congestion Experienced codepoint
on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168 on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168
specifies the negotiation of the use of ECN between the two TCP end- specifies the negotiation of the use of ECN between the two TCP end-
points in the TCP SYN and SYN-ACK exchange, using flags in the TCP points in the TCP SYN and SYN-ACK exchange, using flags in the TCP
skipping to change at page 5, line 47 skipping to change at page 6, line 16
benefits: benefits:
1) Avoidance of a retransmit timeout; 1) Avoidance of a retransmit timeout;
2) Improvement in the throughput of short connections. 2) Improvement in the throughput of short connections.
This draft specifies ECN+, a modification to RFC 3168 to allow TCP This draft specifies ECN+, a modification to RFC 3168 to allow TCP
SYN/ACK packets to be ECN-Capable. Section 3 contains the SYN/ACK packets to be ECN-Capable. Section 3 contains the
specification of the change, while Section 4 discusses some of the specification of the change, while Section 4 discusses some of the
issues, and Section 5 discusses related work. Section 6 contains an issues, and Section 5 discusses related work. Section 6 contains an
evaluation of the proposed change. evaluation of the proposed change.
2. Conventions 2. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119]. document are to be interpreted as described in [RFC 2119].
3. Proposal We use the following terminology from RFC 3168:
This section specifies the modification to RFC 3168 to allow TCP
SYN/ACK packets to be ECN-Capable. We use the following terminology
from RFC 3168:
The ECN field in the IP header: The ECN field in the IP header:
o CE: the Congestion Experienced codepoint; and o CE: the Congestion Experienced codepoint; and
o ECT: either one of the two ECN-Capable Transport codepoints. o ECT: either one of the two ECN-Capable Transport codepoints.
The ECN flags in the TCP header: The ECN flags in the TCP header:
o CWR: the Congestion Window Reduced flag; and o CWR: the Congestion Window Reduced flag; and
o ECE: the ECN-Echo flag. o ECE: the ECN-Echo flag.
ECN-setup packets: ECN-setup packets:
o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags; o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags;
o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR. o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR.
In this document we use the terms "initiator" and "responder" to
refer to the sender of the SYN packet and of the SYN-ACK packet,
respectively.
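
As an illustration of the terminology above, the following is a
minimal sketch of how the ECN-setup packets and the ECT codepoints
could be recognized. The flag and codepoint values follow the TCP
and IP header layouts of RFC 3168; the function names are
hypothetical and are not part of this document or of any
particular TCP implementation.

```python
# TCP header flag bits (values per the TCP header layout in RFC 3168).
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80

# Two-bit ECN field in the IP header (RFC 3168).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

# Only these four flags matter for classifying ECN-setup packets.
_MASK = SYN | ACK | ECE | CWR


def is_ecn_setup_syn(tcp_flags: int) -> bool:
    """ECN-setup SYN: a SYN (no ACK) with both ECE and CWR set."""
    return (tcp_flags & _MASK) == (SYN | ECE | CWR)


def is_ecn_setup_synack(tcp_flags: int) -> bool:
    """ECN-setup SYN-ACK: a SYN-ACK with ECE set but not CWR."""
    return (tcp_flags & _MASK) == (SYN | ACK | ECE)


def is_ect(ecn_field: int) -> bool:
    """True for either of the two ECN-Capable Transport codepoints."""
    return ecn_field in (ECT_0, ECT_1)
```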
3. Proposal
This section specifies the modification to RFC 3168 to allow TCP
SYN/ACK packets to be ECN-Capable.
RFC 3168 in Section 6.1.1 states that "A host MUST NOT set ECT on RFC 3168 in Section 6.1.1 states that "A host MUST NOT set ECT on
SYN or SYN-ACK packets." In this section, we specify that a TCP node SYN or SYN-ACK packets." In this section, we specify that a TCP node
MAY respond to an ECN-setup SYN packet by setting ECT in the MAY respond to an ECN-setup SYN packet by setting ECT in the
responding ECN-setup SYN/ACK packet, indicating to routers that the responding ECN-setup SYN/ACK packet, indicating to routers that the
SYN/ACK packet is ECN-Capable. This allows a congested router along SYN/ACK packet is ECN-Capable. This allows a congested router along
the path to mark the packet instead of dropping the packet as an the path to mark the packet instead of dropping the packet as an
indication of congestion. indication of congestion.
Assume that TCP node A transmits to TCP node B an ECN-setup SYN Assume that TCP node A transmits to TCP node B an ECN-setup SYN
packet, indicating willingness to use ECN for this connection. As packet, indicating willingness to use ECN for this connection. As
skipping to change at page 7, line 27 skipping to change at page 7, line 36
3-second timer expires 3-second timer expires
<--- ECN-setup SYN/ACK, not ECT <--- ECN-setup SYN/ACK, not ECT
<--- ECN-setup SYN/ACK <--- ECN-setup SYN/ACK
Data/ACK ---> Data/ACK --->
Data/ACK ---> Data/ACK --->
<--- Data (one to four segments) <--- Data (one to four segments)
--------------------------------------------------------------- ---------------------------------------------------------------
Figure 1: SYN exchange with the SYN/ACK packet dropped. Figure 1: SYN exchange with the SYN/ACK packet dropped.
If the SYN/ACK packet is dropped in the network, the TCP host (node If the SYN/ACK packet is dropped in the network, the responder (node
B) responds by waiting three seconds for the retransmit timer to B) responds by waiting three seconds for the retransmit timer to
expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is expire [RFC2988]. If a SYN/ACK packet with the ECT codepoint is
dropped, the TCP node SHOULD resend the SYN/ACK packet without the dropped, the responder SHOULD resend the SYN/ACK packet without the
ECN-Capable codepoint. (Although we are not aware of any middleboxes ECN-Capable codepoint. (Although we are not aware of any middleboxes
that drop SYN/ACK packets that contain an ECN-Capable codepoint in that drop SYN/ACK packets that contain an ECN-Capable codepoint in
the IP header, we have learned to design our protocols defensively in the IP header, we have learned to design our protocols defensively in
this regard [RFC3360].) this regard [RFC3360].)
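
The choice of ECN field for a SYN/ACK packet, including the
defensive fallback after a drop, can be sketched as follows. This
is only an illustration under stated assumptions: the function
name and parameters are hypothetical, the use of ECT(0) rather
than ECT(1) is an arbitrary choice made for the example, and the
`ecn_synack_enabled` switch is an assumed local configuration
knob.

```python
# Two-bit ECN field values in the IP header (RFC 3168).
NOT_ECT, ECT_0 = 0b00, 0b10


def synack_ecn_field(ecn_setup_syn_received: bool,
                     is_retransmission: bool,
                     ecn_synack_enabled: bool = True) -> int:
    """Return the IP ECN field the responder puts on a SYN/ACK packet.

    All names here are hypothetical; ECT(0) is an assumption, since
    either ECT codepoint marks the packet as ECN-Capable.
    """
    if not ecn_synack_enabled:
        # Assumed local switch that disables ECN for SYN/ACK packets.
        return NOT_ECT
    if not ecn_setup_syn_received:
        # The initiator did not indicate willingness to use ECN.
        return NOT_ECT
    if is_retransmission:
        # A SYN/ACK sent with ECT was dropped: resend without the
        # ECN-Capable codepoint, as the SHOULD above describes.
        return NOT_ECT
    return ECT_0
```

With these assumptions, the first SYN/ACK packet sent in response
to an ECN-setup SYN packet is ECN-Capable, and any retransmitted
SYN/ACK packet is not.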
We note that if syn-cookies were used by Node B in the exchange in We note that if syn-cookies were used by the responder (node B) in
Figure 1, TCP Node B wouldn't set a timer upon transmission of the the exchange in Figure 1, the responder wouldn't set a timer upon
SYN/ACK packet [SYN-COOK]. In this case, if the SYN/ACK packet was transmission of the SYN/ACK packet [SYN-COOK]. In this case, if the
lost, the initiator (Node A) would have to time out and retransmit the SYN/ACK packet was lost, the initiator (Node A) would have to time out
SYN packet in order to trigger another SYN-ACK. and retransmit the SYN packet in order to trigger another SYN-ACK.
Figure 2 shows an interchange with the SYN/ACK packet sent as ECN- Figure 2 shows an interchange with the SYN/ACK packet sent as ECN-
Capable, and ECN-marked instead of dropped at the congested router. Capable, and ECN-marked instead of dropped at the congested router.
--------------------------------------------------------------- ---------------------------------------------------------------
TCP Node A Router TCP Node B TCP Node A Router TCP Node B
---------- ------ ---------- ---------- ------ ----------
ECN-setup SYN packet ---> ECN-setup SYN packet --->
ECN-setup SYN packet ---> ECN-setup SYN packet --->
skipping to change at page 8, line 24 skipping to change at page 8, line 27
<--- ECN-setup SYN/ACK, CE <--- ECN-setup SYN/ACK, CE
Data/ACK, ECN-Echo ---> Data/ACK, ECN-Echo --->
Data/ACK, ECN-Echo ---> Data/ACK, ECN-Echo --->
Window reduced to one segment. Window reduced to one segment.
<--- Data, CWR (one segment only) <--- Data, CWR (one segment only)
--------------------------------------------------------------- ---------------------------------------------------------------
Figure 2: SYN exchange with the SYN/ACK packet marked. Figure 2: SYN exchange with the SYN/ACK packet marked.
If the receiving node (node A) receives a SYN/ACK packet that has If the initiator (node A) receives a SYN/ACK packet that has been
been marked by the congested router, with the CE codepoint set, the marked by the congested router, with the CE codepoint set, the
receiving node MUST respond by setting the ECN-Echo flag in the TCP initiator MUST respond by setting the ECN-Echo flag in the TCP header
header of the responding ACK packet. As specified in RFC 3168, the of the responding ACK packet. As specified in RFC 3168, the
receiving node continues to set the ECN-Echo flag in packets until it initiator continues to set the ECN-Echo flag in packets until it
receives a packet with the CWR flag set. receives a packet with the CWR flag set.
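
The initiator's ECN-Echo behavior can be sketched as a small piece
of state. This is a minimal illustration, not an implementation:
the class and method names are hypothetical, and details such as
pure-ACK handling in RFC 3168 are left out.

```python
# TCP header flag bits and the CE codepoint (RFC 3168).
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80
ECT_0, CE = 0b10, 0b11


class InitiatorEcnEcho:
    """Tracks whether outgoing packets must carry the ECN-Echo flag."""

    def __init__(self) -> None:
        self.echo_pending = False

    def on_receive(self, ecn_field: int, tcp_flags: int) -> None:
        # A CWR flag acknowledges the earlier ECN-Echo: stop echoing.
        if tcp_flags & CWR:
            self.echo_pending = False
        # A CE mark (including one on the SYN/ACK) starts echoing.
        if ecn_field == CE:
            self.echo_pending = True

    def outgoing_flags(self, base_flags: int) -> int:
        # Set ECN-Echo on every packet while an echo is pending.
        return (base_flags | ECE) if self.echo_pending else base_flags
```

For example, after a CE-marked SYN/ACK packet arrives,
`outgoing_flags(ACK)` includes ECE, and it stops including ECE
once a packet with the CWR flag has been received.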
When the sending node (node B) receives the ECN-Echo packet reporting When the responder (node B) receives the ECN-Echo packet reporting
the Congestion Experienced indication in the SYN/ACK packet, the node the Congestion Experienced indication in the SYN/ACK packet, the
MUST set the initial congestion window to one segment, instead of two responder MUST set the initial congestion window to one segment,
segments as allowed by [RFC2581], or three or four segments allowed instead of two segments as allowed by [RFC2581], or three or four
by [RFC3390]. If the sending node (node B) was going to use an segments allowed by [RFC3390]. If the responder (node B) was going
initial window of one segment, and receives an ECN-Echo packet to use an initial window of one segment, and receives an ECN-Echo
informing it of a Congestion Experienced indication on its SYN/ACK packet informing it of a Congestion Experienced indication on its
packet, the sending node MAY continue to send with an initial window SYN/ACK packet, the responder MAY continue to send with an initial
of one segment, without waiting for a retransmit timeout. We note window of one segment, without waiting for a retransmit timeout. We
that this updates RFC 3168, which specifies that "the sending TCP note that this updates RFC 3168, which specifies that "the sending
MUST reset the retransmit timer on receiving the ECN-Echo packet when TCP MUST reset the retransmit timer on receiving the ECN-Echo packet
the congestion window is one." As specified by RFC 3168, the sending when the congestion window is one." As specified by RFC 3168, the
node (node B) also sets the CWR flag in the TCP header of the next responder (node B) also sets the CWR flag in the TCP header of the
data packet sent, to acknowledge its receipt of and reaction to the next data packet sent, to acknowledge its receipt of and reaction to
ECN-Echo flag. the ECN-Echo flag.
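
The responder's reaction described in this paragraph can be
sketched as follows. The function names and parameters are
hypothetical; the sketch only captures the choice of initial
window and the CWR flag on the next data packet, under the
assumption that the window is counted in segments.

```python
# TCP header flag bits (RFC 3168).
ACK, CWR = 0x10, 0x80


def responder_initial_window(default_iw_segments: int,
                             synack_ce_echoed: bool) -> int:
    """Initial congestion window, in segments, for the responder.

    default_iw_segments is the window the responder would otherwise
    use (for example two, three, or four segments).
    """
    if default_iw_segments < 1:
        raise ValueError("initial window must be at least one segment")
    # An ECN-Echo reporting a CE mark on the SYN/ACK forces one segment.
    return 1 if synack_ce_echoed else default_iw_segments


def first_data_flags(synack_ce_echoed: bool) -> int:
    """TCP flags for the responder's next data packet."""
    # CWR acknowledges receipt of, and reaction to, the ECN-Echo flag.
    return (ACK | CWR) if synack_ce_echoed else ACK
```

A responder that was already going to use an initial window of one
segment gets the same result, and in this sketch it simply
proceeds to send without waiting for a retransmit timeout.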
If the data transfer in Figure 2 is entirely from Node A to Node B, If the data transfer in Figure 2 is entirely from Node A to Node B,
then data packets from Node A continue to set the ECN-Echo flag in then data packets from Node A continue to set the ECN-Echo flag in
data packets, waiting for the CWR flag from Node B acknowledging a data packets, waiting for the CWR flag from Node B acknowledging a
response to the ECN-Echo flag. response to the ECN-Echo flag.
The TCP implementation using ECN-Capable SYN/ACK packets SHOULD
include a management interface to allow the use of ECN to be turned
off for SYN/ACK packets. This is to deal with possible backwards
compatibility problems such as those discussed in Appendix B.
4. Discussion 4. Discussion
Motivation: Motivation:
The rationale for the proposed change is the following. When node B The rationale for the proposed change is the following. When node B
receives a TCP SYN packet with the ECN-Echo bit set in the TCP header, receives a TCP SYN packet with the ECN-Echo bit set in the TCP header,
this indicates that node A is ECN-capable. If node B is also ECN- this indicates that node A is ECN-capable. If node B is also ECN-
capable, there are no obstacles to immediately setting one of the capable, there are no obstacles to immediately setting one of the
ECN-Capable codepoints in the IP header in the responding TCP SYN/ACK ECN-Capable codepoints in the IP header in the responding TCP SYN/ACK
packet. packet.
skipping to change at page 10, line 9 skipping to change at page 10, line 19
Second, the ECN-Capable codepoint in TCP SYN packets could be misused Second, the ECN-Capable codepoint in TCP SYN packets could be misused
by malicious clients to `improve' the well-known TCP SYN attack. By by malicious clients to `improve' the well-known TCP SYN attack. By
setting an ECN-Capable codepoint in TCP SYN packets, a malicious host setting an ECN-Capable codepoint in TCP SYN packets, a malicious host
might be able to inject a large number of TCP SYN packets through a might be able to inject a large number of TCP SYN packets through a
potentially congested ECN-enabled router, congesting it even further. potentially congested ECN-enabled router, congesting it even further.
For both these reasons, we continue the restriction that the TCP SYN For both these reasons, we continue the restriction that the TCP SYN
packet MUST NOT have the ECN-Capable codepoint in the IP header set. packet MUST NOT have the ECN-Capable codepoint in the IP header set.
Backwards compatibility:
In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node
B must have received an ECN-setup SYN packet from node A. However,
it is possible that node A supports ECN, but either ignores the CE
codepoint on received SYN/ACK packets, or ignores SYN/ACK packets
with the ECT or CE codepoint set. If the TCP sender ignores the CE
codepoint on received SYN/ACK packets, this would mean that the TCP
connection would not respond to this congestion indication. However,
this seems to us an acceptable cost to pay in the incremental
deployment of ECN-Capability for TCP's SYN/ACK packets. It would
mean that the sender of the SYN/ACK packet would not reduce the
initial congestion window from two, three, or four segments down to
one segment, as it should. However, the TCP sender would still
respond correctly to any subsequent CE indications on data packets
later on in the connection. Thus, to be explicit, when a TCP
connection includes a sender that supports ECN but *does not* support
ECN-Capability for SYN/ACK packets, in combination with a receiver
that *does* support ECN-Capability for SYN/ACK packets, it is quite
possible that the ECN-Capable SYN/ACK packets will be marked rather
than dropped in the network, and that the sender will not respond to
the ECN mark on the SYN/ACK packet.
It is also possible that in some older TCP implementation, the TCP
sender would ignore arriving SYN/ACK packets that had the ECT or CE
codepoint set. This would result in a delay in connection set-up for
that TCP connection, with the TCP sender re-sending the SYN packet
after a retransmit timeout. We are not aware of any TCP
implementations with this behavior.
SYN/ACK packets and packet size: SYN/ACK packets and packet size:
There are a number of router buffer architectures that have smaller There are a number of router buffer architectures that have smaller
dropping rates for small (SYN) packets than for large (data) packets. dropping rates for small (SYN) packets than for large (data) packets.
For example, for a Drop Tail queue in units of packets, where each For example, for a Drop Tail queue in units of packets, where each
packet takes a single slot in the buffer regardless of packet size, packet takes a single slot in the buffer regardless of packet size,
small and large packets are equally likely to be dropped. However, small and large packets are equally likely to be dropped. However,
for a Drop Tail queue in units of bytes, small packets are less for a Drop Tail queue in units of bytes, small packets are less
likely to be dropped than are large ones. Similarly, for RED in likely to be dropped than are large ones. Similarly, for RED in
packet mode, small and large packets are equally likely to be dropped packet mode, small and large packets are equally likely to be dropped
or marked, while for RED in byte mode, a packet's chance of being or marked, while for RED in byte mode, a packet's chance of being
skipping to change at page 11, line 22 skipping to change at page 10, line 51
We believe that there is a wide range of behaviors in the real world We believe that there is a wide range of behaviors in the real world
in terms of the drop or mark behavior at routers as a function of in terms of the drop or mark behavior at routers as a function of
packet size [Tools] (Section 10). We note that all of these packet size [Tools] (Section 10). We note that all of these
alternatives listed above are available in the NS simulator (Drop alternatives listed above are available in the NS simulator (Drop
Tail queues are by default in units of packets, while the default for Tail queues are by default in units of packets, while the default for
RED queue management has been changed from packet mode to byte mode). RED queue management has been changed from packet mode to byte mode).
Response to ECN-marking of SYN/ACK packets: Response to ECN-marking of SYN/ACK packets:
One question is why TCP SYN/ACK packets should be treated differently One question is why TCP SYN/ACK packets should be treated differently
from other packets in terms of the packet sender's response to an from other packets in terms of the end node's response to an ECN-
ECN-marked packet. Section 5 of RFC 3168 specifies the following: marked packet. Section 5 of RFC 3168 specifies the following:
"Upon the receipt by an ECN-Capable transport of a single CE packet, "Upon the receipt by an ECN-Capable transport of a single CE packet,
the congestion control algorithms followed at the end-systems MUST be the congestion control algorithms followed at the end-systems MUST be
essentially the same as the congestion control response to a *single* essentially the same as the congestion control response to a *single*
dropped packet. For example, for ECN-Capable TCP the source TCP is dropped packet. For example, for ECN-Capable TCP the source TCP is
required to halve its congestion window for any window of data required to halve its congestion window for any window of data
containing either a packet drop or an ECN indication." containing either a packet drop or an ECN indication."
In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP
congestion window consists of a single packet and that packet is ECN- congestion window consists of a single packet and that packet is ECN-
marked in the network, then the sender must reduce the sending rate marked in the network, then the data sender must reduce the sending
below one packet per round-trip time, by waiting for one RTO before rate below one packet per round-trip time, by waiting for one RTO
sending another packet. If the RTO was set to the average round-trip before sending another packet. If the RTO was set to the average
time, this would result in halving the sending rate; because the RTO round-trip time, this would result in halving the sending rate;
is in fact larger than the average round-trip time, the sending rate because the RTO is in fact larger than the average round-trip time,
is reduced to less than half of its previous value. the sending rate is reduced to less than half of its previous value.
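
The arithmetic behind this paragraph can be illustrated with a
small sketch. It assumes one reading of the rule: with a window of
one packet the sender normally sends one packet per round-trip
time, and after an ECN mark it waits one additional RTO, so the
interval between packets becomes RTT + RTO. The function name is
hypothetical.

```python
def rate_ratio_after_mark(rtt: float, rto: float) -> float:
    """Sending rate after the mark, as a fraction of the rate before.

    Assumes one packet per RTT before the mark and one packet per
    (RTT + RTO) after it; both arguments are in seconds.
    """
    if rtt <= 0 or rto <= 0:
        raise ValueError("rtt and rto must be positive")
    rate_before = 1.0 / rtt
    rate_after = 1.0 / (rtt + rto)
    return rate_after / rate_before
```

Under this assumption, an RTO equal to the round-trip time gives a
ratio of one half, and any larger RTO gives a ratio below one
half, which matches the statement above.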
TCP's congestion control response to the *dropping* of a SYN/ACK TCP's congestion control response to the *dropping* of a SYN/ACK
packet is to wait a default time before sending another packet. This packet is to wait a default time before sending another packet. This
document argues that ECN gives end-systems a wider range of possible document argues that ECN gives end-systems a wider range of possible
responses to the *marking* of a SYN/ACK packet, and that waiting a responses to the *marking* of a SYN/ACK packet, and that waiting a
default time before sending a data packet is not the desired default time before sending a data packet is not the desired
response. response.
On the conservative end, one could assume an effective congestion On the conservative end, one could assume an effective congestion
window of one packet for the SYN/ACK packet, and respond to an ECN- window of one packet for the SYN/ACK packet, and respond to an ECN-
skipping to change at page 12, line 17 skipping to change at page 11, line 46
seconds before sending a data packet. seconds before sending a data packet.
However, we note that for an ECN-marked SYN/ACK packet, halving the However, we note that for an ECN-marked SYN/ACK packet, halving the
*congestion window* is not the same as halving the *sending rate*; *congestion window* is not the same as halving the *sending rate*;
there is no `sending rate' associated with an ECN-Capable SYN/ACK there is no `sending rate' associated with an ECN-Capable SYN/ACK
packet, as such packets are only sent as the first packet in a packet, as such packets are only sent as the first packet in a
connection from that host. Further, a router's marking of a SYN/ACK connection from that host. Further, a router's marking of a SYN/ACK
packet is not affected by any past history of that connection. packet is not affected by any past history of that connection.
Adding ECN-Capability to SYN/ACK packets allows the simple response Adding ECN-Capability to SYN/ACK packets allows the simple response
of setting the initial congestion window to one packet, instead of of the responder setting the initial congestion window to one packet,
its allowed default value of two, three, or four packets, with the instead of its allowed default value of two, three, or four packets,
host proceeding with a cautious sending rate of one packet per round- with the responder proceeding with a cautious sending rate of one
trip time. If that packet is ECN-marked or dropped, then the sender packet per round-trip time. If that data packet is ECN-marked or
will wait an RTO before sending another packet. This document argues dropped, then the responder will wait an RTO before sending another
that this approach is useful to users, with no dangers of congestion packet. This document argues that this approach is useful to users,
collapse or of starvation of competing traffic. This is discussed in with no dangers of congestion collapse or of starvation of competing
more detail below in Section 6.2. traffic. This is discussed in more detail below in Section 6.2.
We note that if the data transfer is entirely from Node A to Node B, We note that if the data transfer is entirely from Node A to Node B,
then there is no effective difference between the two possible then there is no effective difference between the two possible
responses to an ECN-marked SYN/ACK packet outlined above. In either responses to an ECN-marked SYN/ACK packet outlined above. In either
case, Node B sends no data packets, only sending acknowledgement case, Node B sends no data packets, only sending acknowledgement
packets in response to received data packets. packets in response to received data packets.
5. Related Work 5. Related Work
The addition of ECN-capability to TCP's SYN/ACK packets was proposed The addition of ECN-capability to TCP's SYN/ACK packets was proposed
skipping to change at page 14, line 49 skipping to change at page 14, line 34
Thus, the degree of benefit of adding ECN-Capability to SYN/ACK Thus, the degree of benefit of adding ECN-Capability to SYN/ACK
packets depends not only on the overall packet drop rate in the packets depends not only on the overall packet drop rate in the
network, but also on the queue management architecture at the network, but also on the queue management architecture at the
congested link. congested link.
6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets
This document specifies that the end-node responds to the report of This document specifies that the end-node responds to the report of
an ECN-marked SYN/ACK packet by setting the initial congestion window an ECN-marked SYN/ACK packet by setting the initial congestion window
to one segment, instead of its possible default value of two to four to one segment, instead of its possible default value of two to four
segments. We call this ECN+ with NoWaiting. However, in Section 4 segments. We call this ECN+ with NoWaiting. However, Section 4
discussed another possible response to an ECN-marked SYN/ACK packet, discussed another possible response to an ECN-marked SYN/ACK packet,
of the end-node waiting an RTT before sending a data packet. We call of the end-node waiting an RTT before sending a data packet. We call
this approach ECN+ with Waiting. this approach ECN+ with Waiting.
Simulations comparing the performance with Standard ECN (without ECN- Simulations comparing the performance with Standard ECN (without ECN-
marked SYN/ACK packets), ECN+ with NoWaiting, and ECN+ with Waiting marked SYN/ACK packets), ECN+ with NoWaiting, and ECN+ with Waiting
show little difference, in terms of aggregate congestion, between show little difference, in terms of aggregate congestion, between
ECN+ with NoWaiting and ECN+ with Waiting. The details are given in ECN+ with NoWaiting and ECN+ with Waiting. The details are given in
Appendix A below. Our conclusions are that ECN+ with NoWaiting is Appendix A below. Our conclusions are that ECN+ with NoWaiting is
perfectly safe, and there are no congestion-related reasons for perfectly safe, and there are no congestion-related reasons for
skipping to change at page 16, line 10 skipping to change at page 15, line 42
Capable or CE codepoint in the IP header (over and above the routers Capable or CE codepoint in the IP header (over and above the routers
already known to crash when a data packet arrives with either ECT(0) already known to crash when a data packet arrives with either ECT(0)
or ECT(1)), but we have not conducted any measurement studies of this or ECT(1)), but we have not conducted any measurement studies of this
[F07]. [F07].
Congestion collapse: Congestion collapse:
Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN- Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN-
marked instead of dropped at an ECN-capable router, the concern is marked instead of dropped at an ECN-capable router, the concern is
whether this can either invoke congestion, or worsen performance in whether this can either invoke congestion, or worsen performance in
highly congested scenarios. However, after learning that a SYN/ACK highly congested scenarios. However, after learning that a SYN/ACK
packet was ECN-marked, the sender of that packet will only send one packet was ECN-marked, the responder will only send one data packet;
data packet; if this data packet is ECN-marked, the sender will then if this data packet is ECN-marked, the responder will then wait for a
wait for a retransmission timeout. In addition, routers are free to retransmission timeout. In addition, routers are free to drop rather
drop rather than mark arriving packets in times of high congestion, than mark arriving packets in times of high congestion, regardless of
regardless of whether the packets are ECN-capable. When congestion whether the packets are ECN-capable. When congestion is very high
is very high and a router's buffer is full, the router has no choice and a router's buffer is full, the router has no choice but to drop
but to drop rather than to mark an arriving packet. rather than to mark an arriving packet.
The simulations reported in Appendix A show that even with demanding The simulations reported in Appendix A show that even with demanding
traffic mixes dominated by short flows and high levels of congestion, traffic mixes dominated by short flows and high levels of congestion,
the aggregate packet dropping rates are not significantly different the aggregate packet dropping rates are not significantly different
with Standard ECN, ECN+ with NoWaiting, or ECN+ with Waiting. In with Standard ECN, ECN+ with NoWaiting, or ECN+ with Waiting. In
particular, the simulations show that in periods of very high particular, the simulations show that in periods of very high
congestion the packet-marking rate is low with or without ECN+, and congestion the packet-marking rate is low with or without ECN+, and
the use of ECN+ does not significantly increase the number of dropped the use of ECN+ does not significantly increase the number of dropped
or marked packets. or marked packets.
skipping to change at page 17, line 20 skipping to change at page 17, line 7
the server to more appropriately adjust the initial load it places on the server to more appropriately adjust the initial load it places on
the network. the network.
Future work will address the more general question of adding ECN- Future work will address the more general question of adding ECN-
Capability to relevant handshake packets in other protocols that use Capability to relevant handshake packets in other protocols that use
retransmission-based reliability in their setup phase (e.g., SCTP, retransmission-based reliability in their setup phase (e.g., SCTP,
DCCP, HIP, and the like). DCCP, HIP, and the like).
9. Acknowledgements 9. Acknowledgements
We thank Anil Agarwal, Mark Allman, Wesley Eddy, Janardhan Iyengar, We thank Anil Agarwal, Mark Allman, Remi Denis-Courmont, Wesley Eddy,
and Pasi Sarolahti for feedback on earlier versions of this draft. Alfred Hoenes, Janardhan Iyengar, and Pasi Sarolahti for feedback on
earlier versions of this draft.
A. Report on Simulations A. Report on Simulations
This section reports on simulations showing the costs of adding ECN+ This section reports on simulations showing the costs of adding ECN+
in highly-congested scenarios. This section also reports on in highly-congested scenarios. This section also reports on
simulations for a comparative evaluation between ECN+ with NoWaiting simulations for a comparative evaluation between ECN+ with NoWaiting
and ECN+ with Waiting. and ECN+ with Waiting.
The simulations are run with a range of file-size distributions. As The simulations are run with a range of file-size distributions. As
a baseline, they use the empirical heavy-tailed distribution reported a baseline, they use the empirical heavy-tailed distribution reported
skipping to change at page 17, line 44 skipping to change at page 17, line 32
lower and higher values to get distributions with mean file sizes of lower and higher values to get distributions with mean file sizes of
3 KBytes, 5 KBytes, 14 KBytes and 17 KBytes. The congested link is 3 KBytes, 5 KBytes, 14 KBytes and 17 KBytes. The congested link is
100 Mbps. RED is run in gentle mode, and arriving ECN-Capable 100 Mbps. RED is run in gentle mode, and arriving ECN-Capable
packets are only dropped instead of marked if the buffer is full (and packets are only dropped instead of marked if the buffer is full (and
the router has no choice). the router has no choice).
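The mark-or-drop rule used in these simulations can be sketched as follows. This is only an illustration of the rule stated above, not the simulation code itself; the function name and the boolean inputs (in particular `aqm_signals`, standing for RED's decision to signal congestion on this packet) are assumptions introduced here.

```python
def router_action(ecn_capable: bool, buffer_full: bool, aqm_signals: bool) -> str:
    """Return 'drop', 'mark', or 'forward' for one arriving packet.

    Hypothetical model of the simulated router: an ECN-Capable packet is
    dropped instead of marked only when the buffer is full.
    """
    if buffer_full:
        # No room in the queue: the router has no choice but to drop.
        return "drop"
    if not aqm_signals:
        # The queue management mechanism does not signal congestion.
        return "forward"
    # Congestion is signalled: mark ECN-Capable packets, drop the rest.
    return "mark" if ecn_capable else "drop"
```

Under this model, an ECN-Capable SYN/ACK packet arriving at a full buffer is dropped exactly as a non-ECN-Capable one would be.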
We explore two alternatives for a TCP node's response to a report of We explore two alternatives for a TCP node's response to a report of
an ECN-marked SYN/ACK packet. With ECN+ with NoWaiting, the TCP node an ECN-marked SYN/ACK packet. With ECN+ with NoWaiting, the TCP node
sends a data packet immediately (with an initial congestion window of sends a data packet immediately (with an initial congestion window of
one segment). With the alternative ECN+ with Waiting, the TCP node one segment). With the alternative ECN+ with Waiting, the TCP node
waits a round-trip time before sending a data packet; the sender waits a round-trip time before sending a data packet; the responder
already has one measurement of the round-trip time when the already has one measurement of the round-trip time when the
acknowledgement for the SYN/ACK packet is received. acknowledgement for the SYN/ACK packet is received.
In the tables below, ECN+ refers to ECN+ with NoWaiting, where the In the tables below, ECN+ refers to ECN+ with NoWaiting, where the
sender starts transmitting immediately, and ECN+/wait refers to ECN+ responder starts transmitting immediately, and ECN+/wait refers to
with Waiting, where the sender waits a round-trip time before sending ECN+ with Waiting, where the responder waits a round-trip time before
a data packet into the network. sending a data packet into the network.
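The two alternatives described above could be sketched as follows. The names, the default window of three segments (any allowed default of two to four would do), and the choice of a one-segment window after the wait in ECN+/wait are assumptions made for illustration; the text above only states that the node waits a round-trip time.

```python
from dataclasses import dataclass

DEFAULT_INITIAL_WINDOW = 3  # assumption: an allowed default of 2 to 4 segments


@dataclass
class FirstSend:
    initial_cwnd: int  # initial congestion window, in segments
    delay: float       # seconds to wait before the first data packet


def responder_first_send(synack_marked: bool, policy: str, rtt: float) -> FirstSend:
    """Decide how the responder starts sending after the SYN/ACK exchange.

    policy is "nowaiting" (ECN+) or "waiting" (ECN+/wait); rtt is the one
    round-trip time measurement taken from the SYN/ACK exchange.
    """
    if policy not in ("nowaiting", "waiting"):
        raise ValueError(f"unknown policy: {policy!r}")
    if not synack_marked:
        # No congestion reported: use the default initial window at once.
        return FirstSend(DEFAULT_INITIAL_WINDOW, 0.0)
    if policy == "nowaiting":
        # ECN+ with NoWaiting: send immediately with a one-segment window.
        return FirstSend(1, 0.0)
    # ECN+ with Waiting: wait one round-trip time first (window of one
    # segment afterwards is an assumption of this sketch).
    return FirstSend(1, rtt)
```

For example, `responder_first_send(True, "waiting", 0.1)` yields a one-segment window after a 0.1 second delay, while the same call with `"nowaiting"` yields the same window with no delay.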
The simulation scripts are available on [ECN-SYN], along with graphs The simulation scripts are available on [ECN-SYN], along with graphs
showing the distribution of response times for the TCP connections. showing the distribution of response times for the TCP connections.
A.1. Simulations with RED in Packet Mode A.1. Simulations with RED in Packet Mode
The simulations with RED in packet mode and with the queue in packets The simulations with RED in packet mode and with the queue in packets
show that ECN+ is useful in times of moderate congestion, though it show that ECN+ is useful in times of moderate congestion, though it
adds little benefit in times of high congestion. The simulations adds little benefit in times of high congestion. The simulations
show a minimal increase in levels of congestion with either ECN+ with show a minimal increase in levels of congestion with either ECN+ with
skipping to change at page 19, line 41 skipping to change at page 19, line 41
Traffic Load = 200%: Traffic Load = 200%:
ECN ECN+ ECN+/wait ECN ECN+ ECN+/wait
------- ------- ------- ------- ------- -------
Loss rate 29.99% 30.22% 30.23% Loss rate 29.99% 30.22% 30.23%
Table 1: Simulations with an average flow size of 3 Kbytes, RED in Table 1: Simulations with an average flow size of 3 Kbytes, RED in
packet mode, queue in packets. packet mode, queue in packets.
A.2. Simulations with RED in Byte Mode A.2. Simulations with RED in Byte Mode
Table 3 below shows simulations with RED in byte mode and the queue Table 2 below shows simulations with RED in byte mode and the queue
in bytes. Like the simulations with RED in packet mode, there is no in bytes. Like the simulations with RED in packet mode, there is no
significant increase in aggregate congestion with the use of ECN+ or significant increase in aggregate congestion with the use of ECN+ or
ECN+/wait, and no congestion-related reason to prefer ECN+/wait over ECN+/wait, and no congestion-related reason to prefer ECN+/wait over
ECN+. ECN+.
However, unlike the simulations with RED in packet mode, the However, unlike the simulations with RED in packet mode, the
simulations with RED in byte mode show little benefit from the use of simulations with RED in byte mode show little benefit from the use of
ECN+ or ECN+/wait, in that the packet marking rate with ECN+ or ECN+ or ECN+/wait, in that the packet marking rate with ECN+ or
ECN+/wait is not much different than the packet marking rate with ECN+/wait is not much different than the packet marking rate with
Standard ECN. This is because with RED in byte mode, small packets Standard ECN. This is because with RED in byte mode, small packets
skipping to change at page 20, line 33 skipping to change at page 20, line 33
Marked 4,086 4,644 4,826 Marked 4,086 4,644 4,826
Loss rate 5.90% 5.78% 5.81% Loss rate 5.90% 5.78% 5.81%
Traffic Load = 125%: Traffic Load = 125%:
ECN ECN+ ECN+/wait ECN ECN+ ECN+/wait
------- ------- ------- ------- ------- -------
Dropped 157,305 157,435 158,368 Dropped 157,305 157,435 158,368
Marked 2,183 2,363 2,663 Marked 2,183 2,363 2,663
Loss rate 9.89% 9.87% 9.93% Loss rate 9.89% 9.87% 9.93%
Table 3: Simulations with an average flow size of 3 Kbytes, RED in Table 2: Simulations with an average flow size of 3 Kbytes, RED in
byte mode, queue in bytes. byte mode, queue in bytes.
B. Issues of Incremental Deployment
In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node
B must have received an ECN-setup SYN packet from node A. However,
it is possible that node A supports ECN, but either ignores the CE
codepoint on received SYN/ACK packets, or ignores SYN/ACK packets
with the ECT or CE codepoint set. If the TCP initiator ignores the
CE codepoint on received SYN/ACK packets, this would mean that the
TCP responder would not respond to this congestion indication.
However, this seems to us an acceptable cost to pay in the
incremental deployment of ECN-Capability for TCP's SYN/ACK packets.
It would mean that the responder would not reduce the initial
congestion window from two, three, or four segments down to one
segment, as it should. However, the TCP end nodes would still
respond correctly to any subsequent CE indications on data packets
later on in the connection.
Figure 3 shows an interchange with the SYN/ACK packet ECN-marked, but
with the ECN mark ignored by the TCP initiator.
---------------------------------------------------------------
   TCP Node A             Router                  TCP Node B
   ----------             ------                  ----------

   ECN-setup SYN packet --->
                                   ECN-setup SYN packet --->

                                <--- ECN-setup SYN/ACK, ECT
                          <--- Sets CE on SYN/ACK
   <--- ECN-setup SYN/ACK, CE

   Data/ACK, No ECN-Echo --->
                                   Data/ACK --->
                            <--- Data (up to four packets)
---------------------------------------------------------------
Figure 3: SYN exchange with the SYN/ACK packet marked,
but with the ECN mark ignored by the TCP initiator.
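The outcome of the exchange in Figure 3 can be illustrated with a small sketch. The function and constant names are hypothetical; the default window of four segments is taken from the "up to four packets" in the figure.

```python
FIGURE3_DEFAULT_WINDOW = 4  # segments; Figure 3 shows "up to four packets"


def responder_initial_window(synack_ce_marked: bool, initiator_echoes_ce: bool) -> int:
    """Initial window chosen by the responder (Node B), in segments.

    The responder only learns of the mark if the initiator reports it with
    ECN-Echo; an initiator that ignores CE on SYN/ACK packets never does.
    """
    ece_seen = synack_ce_marked and initiator_echoes_ce
    return 1 if ece_seen else FIGURE3_DEFAULT_WINDOW
```

With `synack_ce_marked=True` and `initiator_echoes_ce=False`, the case shown in Figure 3, the responder proceeds with its default window, exactly as if the SYN/ACK packet had not been marked.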
Thus, to be explicit, when a TCP connection includes an initiator
that supports ECN but *does not* support ECN-Capability for SYN/ACK
packets, in combination with a responder that *does* support ECN-
Capability for SYN/ACK packets, it is possible that the ECN-Capable
SYN/ACK packets will be marked rather than dropped in the network,
and that the responder will not learn about the ECN mark on the
SYN/ACK packet. This would not be a problem if most packets from the
responder supporting ECN for SYN/ACK packets were in long-lived TCP
connections, but it would be more problematic if most of the packets
were from TCP connections consisting of four data packets, and the
TCP responder for these connections was ready to send its data
packets immediately after the SYN/ACK exchange. Of course, with
*severe* congestion, the SYN/ACK packets would likely be dropped
rather than ECN-marked at the congested router, preventing the TCP
responder from adding to the congestion by sending its initial window
of four data packets.
It is also possible that in some older TCP implementations, the
initiator would ignore arriving SYN/ACK packets that had the ECT or
CE codepoint set. This would result in a delay in connection set-up
for that TCP connection, with the initiator re-sending the SYN packet
after a retransmit timeout. We are not aware of any TCP
implementations with this behavior.
One possibility for coping with problems of backwards compatibility
would be for TCP initiators to use a TCP flag that means "I
understand ECN-Capable SYN/ACK packets". If this document were to
standardize the use of such an "ECN-SYN" flag, then the TCP responder
would only send a SYN/ACK packet as ECN-capable if the incoming SYN
packet had the "ECN-SYN" flag set. An ECN-SYN flag would prevent the
backwards compatibility problems described in the paragraphs above.
One drawback to the use of an ECN-SYN flag is that it would use one
of the four remaining reserved bits in the TCP header, for a
transient backwards compatibility problem. This drawback is limited
by the fact that the "ECN-SYN" flag would be defined only for use
with ECN-setup SYN packets; that bit in the TCP header could be
defined to have other uses for other kinds of TCP packets.
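As an illustration only, the responder-side rule that such a flag would imply could look like the sketch below. The "ECN-SYN" flag is not defined by this document, and the function and parameter names are hypothetical.

```python
def send_synack_as_ect(syn_is_ecn_setup: bool,
                       syn_has_ecn_syn_flag: bool,
                       require_ecn_syn_flag: bool) -> bool:
    """Decide whether the responder sends its SYN/ACK as ECN-Capable.

    require_ecn_syn_flag=True models the hypothetical "ECN-SYN" variant;
    False models sending ECT on any SYN/ACK that answers an ECN-setup SYN.
    """
    if not syn_is_ecn_setup:
        # Without an ECN-setup SYN, the SYN/ACK is never ECN-Capable.
        return False
    if require_ecn_syn_flag:
        # Hypothetical variant: the initiator must also declare that it
        # understands ECN-Capable SYN/ACK packets.
        return syn_has_ecn_syn_flag
    return True
```

The two variants differ only for an ECN-setup SYN without the flag: the flag-based variant would send that SYN/ACK as not ECN-Capable, avoiding the backwards compatibility case described above at the cost of a header bit.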
Factors in deciding not to use an ECN-SYN flag include the following:
(1) The limited installed base: At the time that this document was
written, the TCP implementations in Microsoft Vista and Mac OS X
included ECN, but ECN was not enabled by default [SBT07]. Thus,
there was not a large deployed base of ECN-Capable TCP
implementations. This limits the scope of any backwards
compatibility problems.
(2) Limits to the scope of the problem: The backwards compatibility
problem would not be serious enough to cause congestion collapse;
with severe congestion, the buffer at the congested router will
overflow, and the congested router will drop rather than ECN-mark
arriving SYN/ACK packets. Some active queue management mechanisms might
switch from packet-marking to packet-dropping in times of high
congestion before buffer overflow, as recommended in Section 19.1 of
RFC 3168. This helps to prevent congestion collapse problems with
the use of ECN.
(3) Detection of and response to backwards-compatibility problems: A
TCP responder such as a web server can't differentiate between a
SYN/ACK packet that is not ECN-marked in the network, and a SYN/ACK
packet that is ECN-marked, but where the ECN mark is ignored by the
TCP initiator. However, a TCP responder *can* detect if a SYN/ACK
packet is sent as ECN-capable and not reported as ECN-marked, but
data packets are dropped or marked from the initial window of data.
We will call this scenario "initial-window-congestion". If a web
server frequently experienced initial-window congestion (without
SYN/ACK congestion), then the web server *might* be experiencing
backwards compatibility problems with ECN-Capable SYN/ACK packets,
and could respond by not sending SYN/ACK packets as ECN-Capable.
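One way a server might track "initial-window-congestion" is sketched below. The class name, the threshold of 20%, and the minimum sample count are assumptions chosen for illustration; this document does not specify any particular detection rule.

```python
class SynAckEcnMonitor:
    """Hypothetical per-server counter for initial-window-congestion."""

    def __init__(self, threshold: float = 0.2, min_samples: int = 100):
        self.threshold = threshold      # assumed fraction that triggers fallback
        self.min_samples = min_samples  # assumed minimum before judging
        self.samples = 0
        self.suspect = 0

    def record(self, synack_sent_ect: bool, synack_reported_marked: bool,
               initial_window_congested: bool) -> None:
        """Record one finished connection."""
        if not synack_sent_ect:
            return  # only ECN-Capable SYN/ACK packets are of interest
        self.samples += 1
        # Suspect case: no mark reported on the SYN/ACK, yet data packets
        # from the initial window were dropped or marked.
        if initial_window_congested and not synack_reported_marked:
            self.suspect += 1

    def should_send_ect(self) -> bool:
        """Keep sending SYN/ACK packets as ECN-Capable unless suspects are frequent."""
        if self.samples < self.min_samples:
            return True
        return self.suspect / self.samples < self.threshold
```

A frequent suspect case only suggests a backwards compatibility problem, as noted above, since the same pattern arises when congestion simply begins after the SYN/ACK packet has passed.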
Normative References Normative References
[RFC 2119] S. Bradner, Key words for use in RFCs to Indicate [RFC 2119] S. Bradner, Key words for use in RFCs to Indicate
Requirement Levels, RFC 2119, March 1997. Requirement Levels, RFC 2119, March 1997.
[RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of [RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
Standard, September 2001. Standard, September 2001.
Informative References Informative References
[ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification, [ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
SIGCOMM 2005. SIGCOMM 2005.
[ECN-SYN] ECN-SYN web page with simulation scripts, URL to be added. [ECN-SYN] ECN-SYN web page with simulation scripts, URL to be added.
[F07] S. Floyd, "[BEHAVE] Response of firewalls and middleboxes to [F07] S. Floyd, "[BEHAVE] Response of firewalls and middleboxes to
TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent to TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent to
the BEHAVE mailing list, URL "http://www1.ietf.org/mail- the BEHAVE mailing list, URL "http://www1.ietf.org/mail-
archive/web/behave/current/msg02644.html".` archive/web/behave/current/msg02644.html".
[Kelson00] Dax Kelson, note sent to the Linux kernel mailing list, [Kelson00] Dax Kelson, note sent to the Linux kernel mailing list,
September 10, 2000. September 10, 2000.
[MAF05] A. Medina, M. Allman, and S. Floyd. Measuring the Evolution [MAF05] A. Medina, M. Allman, and S. Floyd. Measuring the Evolution
of Transport Protocols in the Internet, ACM CCR, April 2005. of Transport Protocols in the Internet, ACM CCR, April 2005.
[PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing [PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
Improved Controllers for AQM Routers Supporting TCP Flows, April Improved Controllers for AQM Routers Supporting TCP Flows, April
1998. 1998.
 End of changes. 32 change blocks. 
112 lines changed or deleted 220 lines changed or added
