draft-ietf-tsvwg-byte-pkt-congest-11.txt   draft-ietf-tsvwg-byte-pkt-congest-12.txt 
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT Internet-Draft BT
Updates: 2309 (if approved) J. Manner Updates: 2309 (if approved) J. Manner
Intended status: BCP Aalto University Intended status: BCP Aalto University
Expires: February 2, 2014 August 1, 2013 Expires: May 11, 2014 November 07, 2013
Byte and Packet Congestion Notification Byte and Packet Congestion Notification
draft-ietf-tsvwg-byte-pkt-congest-11 draft-ietf-tsvwg-byte-pkt-congest-12
Abstract Abstract
This document provides recommendations of best current practice for This document provides recommendations of best current practice for
dropping or marking packets using any active queue management (AQM) dropping or marking packets using any active queue management (AQM)
algorithm, including random early detection (RED), BLUE, pre- algorithm, including random early detection (RED), BLUE, pre-
congestion notification (PCN) and newer schemes such as CoDel and congestion notification (PCN) and newer schemes such as CoDel
PIE. We give three strong recommendations: (1) packet size should be (Controlled Delay) and PIE (Proportional Integral controller
taken into account when transports detect and respond to congestion Enhanced). We give three strong recommendations: (1) packet size
indications, (2) packet size should not be taken into account when should be taken into account when transports detect and respond to
network equipment creates congestion signals (marking, dropping), and congestion indications, (2) packet size should not be taken into
therefore (3) in the specific case of RED, the byte-mode packet drop account when network equipment creates congestion signals (marking,
variant that drops fewer small packets should not be used. This memo dropping), and therefore (3) in the specific case of RED, the byte-
updates RFC 2309 to deprecate deliberate preferential treatment of mode packet drop variant that drops fewer small packets should not be
small packets in AQM algorithms. used. This memo updates RFC 2309 to deprecate deliberate
preferential treatment of small packets in AQM algorithms.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 2, 2014. This Internet-Draft will expire on May 11, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 23 skipping to change at page 3, line 23
2.3. Recommendation on Responding to Congestion . . . . . . . . 11 2.3. Recommendation on Responding to Congestion . . . . . . . . 11
2.4. Recommendation on Handling Congestion Indications when 2.4. Recommendation on Handling Congestion Indications when
Splitting or Merging Packets . . . . . . . . . . . . . . . 12 Splitting or Merging Packets . . . . . . . . . . . . . . . 12
3. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 12 3. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 12
3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets . 12 3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets . 12
3.2. Small != Control . . . . . . . . . . . . . . . . . . . . . 14 3.2. Small != Control . . . . . . . . . . . . . . . . . . . . . 14
3.3. Transport-Independent Network . . . . . . . . . . . . . . 14 3.3. Transport-Independent Network . . . . . . . . . . . . . . 14
3.4. Partial Deployment of AQM . . . . . . . . . . . . . . . . 15 3.4. Partial Deployment of AQM . . . . . . . . . . . . . . . . 15
3.5. Implementation Efficiency . . . . . . . . . . . . . . . . 17 3.5. Implementation Efficiency . . . . . . . . . . . . . . . . 17
4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 17 4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 17
4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 17 4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 18
4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 18 4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 18
4.1.2. Congestion Measurement without a Queue . . . . . . . . 19 4.1.2. Congestion Measurement without a Queue . . . . . . . . 19
4.2. Congestion Notification Advice . . . . . . . . . . . . . . 20 4.2. Congestion Notification Advice . . . . . . . . . . . . . . 20
4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 20 4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 20
4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 21 4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 22
4.2.3. Making Transports Robust against Control Packet 4.2.3. Making Transports Robust against Control Packet
Losses . . . . . . . . . . . . . . . . . . . . . . . . 23 Losses . . . . . . . . . . . . . . . . . . . . . . . . 23
4.2.4. Congestion Notification: Summary of Conflicting 4.2.4. Congestion Notification: Summary of Conflicting
Advice . . . . . . . . . . . . . . . . . . . . . . . . 23 Advice . . . . . . . . . . . . . . . . . . . . . . . . 24
5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 24 5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 25
5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 24 5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 25
5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 25 5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 25
6. Security Considerations . . . . . . . . . . . . . . . . . . . 25 6. Security Considerations . . . . . . . . . . . . . . . . . . . 26
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26
8. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 26 8. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 26
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28
10. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 28 10. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 28
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 28 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 28
11.1. Normative References . . . . . . . . . . . . . . . . . . . 28 11.1. Normative References . . . . . . . . . . . . . . . . . . . 28
11.2. Informative References . . . . . . . . . . . . . . . . . . 28 11.2. Informative References . . . . . . . . . . . . . . . . . . 28
Appendix A. Survey of RED Implementation Status . . . . . . . . . 32 Appendix A. Survey of RED Implementation Status . . . . . . . . . 32
Appendix B. Sufficiency of Packet-Mode Drop . . . . . . . . . . . 33 Appendix B. Sufficiency of Packet-Mode Drop . . . . . . . . . . . 34
B.1. Packet-Size (In)Dependence in Transports . . . . . . . . . 34 B.1. Packet-Size (In)Dependence in Transports . . . . . . . . . 35
B.2. Bit-Congestible and Packet-Congestible Indications . . . . 37 B.2. Bit-Congestible and Packet-Congestible Indications . . . . 38
Appendix C. Byte-mode Drop Complicates Policing Congestion Appendix C. Byte-mode Drop Complicates Policing Congestion
Response . . . . . . . . . . . . . . . . . . . . . . 38 Response . . . . . . . . . . . . . . . . . . . . . . 39
Appendix D. Changes from Previous Versions . . . . . . . . . . . 39 Appendix D. Changes from Previous Versions . . . . . . . . . . . 40
1. Introduction 1. Introduction
This document provides recommendations of best current practice for This document provides recommendations of best current practice for
how we should correctly scale congestion control functions with how we should correctly scale congestion control functions with
respect to packet size for the long term. It also recognises that respect to packet size for the long term. It also recognises that
expediency may be necessary to deal with existing widely deployed expediency may be necessary to deal with existing widely deployed
protocols that don't live up to the long term goal. protocols that don't live up to the long term goal.
When signalling congestion, the problem of how (and whether) to take When signalling congestion, the problem of how (and whether) to take
skipping to change at page 5, line 29 skipping to change at page 5, line 29
In the particular case of Random Early Detection (RED), this means In the particular case of Random Early Detection (RED), this means
that the byte-mode packet drop variant should not be used to drop that the byte-mode packet drop variant should not be used to drop
fewer small packets, because that creates a perverse incentive for fewer small packets, because that creates a perverse incentive for
transports to use tiny segments, consequently also opening up a DoS transports to use tiny segments, consequently also opening up a DoS
vulnerability. Fortunately all the RED implementers who responded to vulnerability. Fortunately all the RED implementers who responded to
our admittedly limited survey (Section 4.2.4) have not followed the our admittedly limited survey (Section 4.2.4) have not followed the
earlier advice to use byte-mode drop, so the position this memo earlier advice to use byte-mode drop, so the position this memo
argues for seems to already exist in implementations. argues for seems to already exist in implementations.
However, at the transport layer, TCP congestion control is a widely However, at the transport layer, TCP congestion control is a widely
deployed protocol that doesn't scale with packet size. To date this deployed protocol that doesn't scale with packet size (i.e. its
hasn't been a significant problem because most TCP implementations reduction in rate does not take into account the size of a lost
have been used with similar packet sizes. But, as we design new packet). To date this hasn't been a significant problem because most
congestion control mechanisms, this memo recommends that we should TCP implementations have been used with similar packet sizes. But,
build in scaling with packet size rather than assuming we should as we design new congestion control mechanisms, this memo recommends
follow TCP's example. that we should build in scaling with packet size rather than assuming
we should follow TCP's example.
This memo continues as follows. First it discusses terminology and This memo continues as follows. First it discusses terminology and
scoping. Section 2 gives the concrete formal recommendations, scoping. Section 2 gives the concrete formal recommendations,
followed by motivating arguments in Section 3. We then critically followed by motivating arguments in Section 3. We then critically
survey the advice given previously in the RFC series and the research survey the advice given previously in the RFC series and the research
literature (Section 4), referring to an assessment of whether or not literature (Section 4), referring to an assessment of whether or not
this advice has been followed in production networks (Appendix A). this advice has been followed in production networks (Appendix A).
To wrap up, outstanding issues are discussed that will need To wrap up, outstanding issues are discussed that will need
resolution both to inform future protocol designs and to handle resolution both to inform future protocol designs and to handle
legacy (Section 5). Then security issues are collected together in legacy (Section 5). Then security issues are collected together in
skipping to change at page 6, line 37 skipping to change at page 6, line 38
virtual limit smaller than the actual limit to the resource, then virtual limit smaller than the actual limit to the resource, then
notify when this virtual limit is exceeded in order to avoid notify when this virtual limit is exceeded in order to avoid
uncontrolled congestion of the actual capacity. uncontrolled congestion of the actual capacity.
Congestion notification communicates a real number bounded by the Congestion notification communicates a real number bounded by the
range [ 0 , 1 ]. This ties in with the most well-understood range [ 0 , 1 ]. This ties in with the most well-understood
measure of congestion notification: drop probability. measure of congestion notification: drop probability.
Explicit and Implicit Notification: The byte vs. packet dilemma Explicit and Implicit Notification: The byte vs. packet dilemma
concerns congestion notification irrespective of whether it is concerns congestion notification irrespective of whether it is
signalled implicitly by drop or using explicit congestion signalled implicitly by drop or using Explicit Congestion
notification (ECN [RFC3168] or PCN [RFC5670]). Throughout this Notification (ECN [RFC3168] or PCN [RFC5670]). Throughout this
document, unless clear from the context, the term marking will be document, unless clear from the context, the term marking will be
used to mean notifying congestion explicitly, while congestion used to mean notifying congestion explicitly, while congestion
notification will be used to mean notifying congestion either notification will be used to mean notifying congestion either
implicitly by drop or explicitly by marking. implicitly by drop or explicitly by marking.
Bit-congestible vs. Packet-congestible: If the load on a resource Bit-congestible vs. Packet-congestible: If the load on a resource
depends on the rate at which packets arrive, it is called packet- depends on the rate at which packets arrive, it is called packet-
congestible. If the load depends on the rate at which bits arrive congestible. If the load depends on the rate at which bits arrive
it is called bit-congestible. it is called bit-congestible.
skipping to change at page 8, line 41 skipping to change at page 8, line 43
size. Because there are 25 times more small packets in one second, size. Because there are 25 times more small packets in one second,
it naturally drops 25 times more small packets, that is 100 small it naturally drops 25 times more small packets, that is 100 small
packets but only 4 large packets. But if we count how many bits it packets but only 4 large packets. But if we count how many bits it
drops, there are 48,000 bits in 100 small packets and 48,000 bits in drops, there are 48,000 bits in 100 small packets and 48,000 bits in
4 large packets--the same number of bits of small packets as large. 4 large packets--the same number of bits of small packets as large.
The packet-mode drop algorithm drops any bit with the same The packet-mode drop algorithm drops any bit with the same
probability whether the bit is in a small or a large packet. probability whether the bit is in a small or a large packet.
For byte-mode drop, again we use an example drop probability of 0.1%, For byte-mode drop, again we use an example drop probability of 0.1%,
but only for maximum size packets (assuming the link MTU is 1,500B or but only for maximum size packets (assuming the link maximum
12,000b). The byte-mode algorithm reduces the drop probability of transmission unit (MTU) is 1,500B or 12,000b). The byte-mode
smaller packets proportional to their size, making the probability algorithm reduces the drop probability of smaller packets
that it drops a small packet 25 times smaller at 0.004%. But there proportional to their size, making the probability that it drops a
are 25 times more small packets, so dropping them with 25 times lower small packet 25 times smaller at 0.004%. But there are 25 times more
probability results in dropping the same number of packets: 4 drops small packets, so dropping them with 25 times lower probability
in both cases. The 4 small dropped packets contain 25 times fewer results in dropping the same number of packets: 4 drops in both
bits than the 4 large dropped packets: 1,920 compared to 48,000. cases. The 4 small dropped packets contain 25 times fewer bits than
the 4 large dropped packets: 1,920 compared to 48,000.
The byte-mode drop algorithm drops any bit with a probability The byte-mode drop algorithm drops any bit with a probability
proportionate to the size of the packet it is in. proportionate to the size of the packet it is in.
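The arithmetic of the two examples above can be checked with a short sketch. The packet sizes and arrival rates are not stated in this excerpt; they are assumptions inferred from the quoted figures (480 bits per small packet from 48,000 bits in 100 packets, and 4,000 large packets per second from 4 drops at 0.1%). The function names are illustrative only.

```python
from fractions import Fraction

# Assumed values, inferred from the figures quoted in the text.
SMALL_BITS = 480                      # 48,000 bits / 100 small packets
LARGE_BITS = 12_000                   # 1,500B MTU
LARGE_PER_SEC = 4_000                 # 4 drops at 0.1% implies 4,000 packets
SMALL_PER_SEC = 25 * LARGE_PER_SEC    # "25 times more small packets"
P = Fraction(1, 1000)                 # example drop probability of 0.1%


def packet_mode(pkts, bits):
    """Drop every packet with the same probability, whatever its size."""
    drops = pkts * P
    return drops, drops * bits        # (packets dropped, bits dropped)


def byte_mode(pkts, bits):
    """Scale the drop probability down in proportion to packet size."""
    p = P * Fraction(bits, LARGE_BITS)
    drops = pkts * p
    return drops, drops * bits


for name, mode in (("packet-mode", packet_mode), ("byte-mode", byte_mode)):
    small = mode(SMALL_PER_SEC, SMALL_BITS)
    large = mode(LARGE_PER_SEC, LARGE_BITS)
    print(name, "small:", small, "large:", large)
```

Under these assumptions, packet-mode drop removes the same number of bits from each flow, while byte-mode drop removes the same number of packets and therefore 25 times fewer bits from the small-packet flow.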
2. Recommendations 2. Recommendations
This section gives recommendations related to network equipment in This section gives recommendations related to network equipment in
Sections 2.1 and 2.2, and in Sections 2.3 and 2.4 we discuss the Sections 2.1 and 2.2, and in Sections 2.3 and 2.4 we discuss the
implications on the transport protocols. implications on the transport protocols.
2.1. Recommendation on Queue Measurement 2.1. Recommendation on Queue Measurement
Ideally, an AQM would measure the service time of the queue to Ideally, an AQM would measure the service time of the queue to
measure congestion of a resource. However service time can only be measure congestion of a resource. However service time can only be
measured as packets leave the queue, where it is not always feasible measured as packets leave the queue, where it is not always expedient
to implement a full AQM algorithm. To predict the service time as to implement a full AQM algorithm. To predict the service time as
packets join the queue, an AQM algorithm needs to measure the length packets join the queue, an AQM algorithm needs to measure the length
of the queue. of the queue.
In this case, if the resource is bit-congestible, the AQM In this case, if the resource is bit-congestible, the AQM
implementation SHOULD measure the length of the queue in bytes and, implementation SHOULD measure the length of the queue in bytes and,
if the resource is packet-congestible, the implementation SHOULD if the resource is packet-congestible, the implementation SHOULD
measure the length of the queue in packets. No other choice makes measure the length of the queue in packets. Subject to the
sense, because the number of packets waiting in the queue isn't exceptions below, no other choice makes sense, because the number of
relevant if the resource gets congested by bytes and vice versa. For packets waiting in the queue isn't relevant if the resource gets
example, the length of the queue into a transmission line would be congested by bytes and vice versa. For example, the length of the
measured in bytes, while the length of the queue into a firewall queue into a transmission line would be measured in bytes, while the
would be measured in packets. length of the queue into a firewall would be measured in packets.
To avoid the pathological effects of drop tail, the AQM can then To avoid the pathological effects of drop tail, the AQM can then
transform this service time or queue length into the probability of transform this service time or queue length into the probability of
dropping or marking a packet (e.g. RED's piecewise linear function dropping or marking a packet (e.g. RED's piecewise linear function
between thresholds). between thresholds).
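As a minimal sketch of the two steps described above (measure the queue in the unit that matches the resource, then map the length to a probability), the following assumes a simplified RED-like ramp. The names Packet, queue_length, drop_probability, min_th, max_th and max_p are illustrative choices, and the averaging of the queue length that RED normally applies is deliberately omitted.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size_bytes: int


def queue_length(queue, bit_congestible):
    """Measure in bytes for a bit-congestible resource, else in packets."""
    if bit_congestible:
        return sum(p.size_bytes for p in queue)
    return len(queue)


def drop_probability(qlen, min_th, max_th, max_p):
    """Piecewise linear ramp between two thresholds (simplified RED).

    Thresholds must be in the same unit as qlen (bytes or packets).
    The size of the arriving packet is not an input.
    """
    if qlen < min_th:
        return 0.0
    if qlen >= max_th:
        return 1.0
    return max_p * (qlen - min_th) / (max_th - min_th)


queue = [Packet(1500), Packet(60)]
line_qlen = queue_length(queue, bit_congestible=True)       # transmission line
firewall_qlen = queue_length(queue, bit_congestible=False)  # firewall
print(line_qlen, drop_probability(line_qlen, 1000, 3000, 0.1))
print(firewall_qlen, drop_probability(firewall_qlen, 1, 5, 0.1))
```

Note that only the measurement unit differs between the two resource types; the probability applied to an arriving packet is the same whatever its size.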
What this advice means for RED as a specific example: What this advice means for RED as a specific example:
1. A RED implementation SHOULD use byte mode queue measurement for 1. A RED implementation SHOULD use byte mode queue measurement for
measuring the congestion of bit-congestible resources and packet measuring the congestion of bit-congestible resources and packet
mode queue measurement for packet-congestible resources. mode queue measurement for packet-congestible resources.
2. An implementation SHOULD NOT make it possible to configure the 2. An implementation SHOULD NOT make it possible to configure the
way a queue measures itself, because whether a queue is bit- way a queue measures itself, because whether a queue is bit-
congestible or packet-congestible is an inherent property of the congestible or packet-congestible is an inherent property of the
queue. queue.
Exceptions to these recommendations MAY be necessary, for instance Exceptions to these recommendations might be necessary, for instance
where a packet-congestible resource has to be configured as a proxy where a packet-congestible resource has to be configured as a proxy
bottleneck for a bit-congestible resource in an adjacent box that bottleneck for a bit-congestible resource in an adjacent box that
does not support AQM. does not support AQM.
The recommended approach in less straightforward scenarios, such as The recommended approach in less straightforward scenarios, such as
fixed size packet buffers, resources without a queue and buffers fixed size packet buffers, resources without a queue and buffers
comprising a mix of packet and bit-congestible resources, is comprising a mix of packet and bit-congestible resources, is
discussed in Section 4.1. For instance, Section 4.1.1 explains that discussed in Section 4.1. For instance, Section 4.1.1 explains that
the queue into a line should be measured in bytes even if the queue the queue into a line should be measured in bytes even if the queue
consists of fixed-size packet-buffers, because the root-cause of any consists of fixed-size packet-buffers, because the root-cause of any
skipping to change at page 11, line 23 skipping to change at page 11, line 24
marked, it SHOULD consider the strength of the congestion indication marked, it SHOULD consider the strength of the congestion indication
as proportionate to the size in octets (bytes) of the missing or as proportionate to the size in octets (bytes) of the missing or
marked packet. marked packet.
In other words, when a packet indicates congestion (by being lost or In other words, when a packet indicates congestion (by being lost or
marked) it can be considered conceptually as if there is a congestion marked) it can be considered conceptually as if there is a congestion
indication on every octet of the packet, not just one indication per indication on every octet of the packet, not just one indication per
packet. packet.
To be clear, the above recommendation solely describes how a To be clear, the above recommendation solely describes how a
transport should interpret the meaning of a congestion indication. transport should interpret the meaning of a congestion indication, as
It makes no recommendation on whether a transport should act a long term goal. It makes no recommendation on whether a transport
differently based on this interpretation. It merely aids should act differently based on this interpretation. It merely aids
interoperability between transports, if they choose to make their interoperability between transports, if they choose to make their
actions depend on the strength of congestion indications. actions depend on the strength of congestion indications.
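The difference between counting one indication per packet and one indication per octet can be illustrated as follows. This is a sketch of the interpretation only, not of any transport's response; the function names and the sample packet sizes are hypothetical.

```python
from fractions import Fraction


def congestion_per_packet(packets):
    """Fraction of packets that were lost or marked."""
    marked = sum(1 for _, is_marked in packets if is_marked)
    return Fraction(marked, len(packets))


def congestion_per_octet(packets):
    """Fraction of octets carried in lost or marked packets."""
    total = sum(size for size, _ in packets)
    marked = sum(size for size, is_marked in packets if is_marked)
    return Fraction(marked, total)


# (size in octets, lost-or-marked) for four hypothetical packets
sample = [(1500, True), (60, False), (60, False), (60, False)]
print(congestion_per_packet(sample))  # 1/4
print(congestion_per_octet(sample))   # 25/28
```

In this sample a single marked 1,500-octet packet counts as a quarter of the packets but as most of the octets, which is the sense in which the strength of an indication is proportionate to packet size.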
This definition will be useful as the IETF transport area continues This definition will be useful as the IETF transport area continues
its programme of: its programme of:
o updating host-based congestion control protocols to take account o updating host-based congestion control protocols to take account
of packet size of packet size
o making transports less sensitive to losing control packets like o making transports less sensitive to losing control packets like
skipping to change at page 12, line 6 skipping to change at page 12, line 8
2. If it is desired to improve TCP performance by reducing the 2. If it is desired to improve TCP performance by reducing the
chance that a SYN or a pure ACK will be dropped, this SHOULD be chance that a SYN or a pure ACK will be dropped, this SHOULD be
done by modifying TCP (Section 4.2.3), not network equipment. done by modifying TCP (Section 4.2.3), not network equipment.
To be clear, we are not recommending at all that TCPs under To be clear, we are not recommending at all that TCPs under
equivalent conditions should aim for equal bit-rates. We are merely equivalent conditions should aim for equal bit-rates. We are merely
saying that anyone trying to do such a thing should modify their TCP saying that anyone trying to do such a thing should modify their TCP
algorithm, not the network. algorithm, not the network.
These recommendations are phrased as 'SHOULD' rather than 'MUST', These recommendations are phrased as 'SHOULD' rather than 'MUST',
because there may be cases where compatibility with pre-existing because there may be cases where expediency dictates that
versions of a transport protocol make the recommendations compatibility with pre-existing versions of a transport protocol make
impractical. the recommendations impractical.
2.4. Recommendation on Handling Congestion Indications when Splitting 2.4. Recommendation on Handling Congestion Indications when Splitting
or Merging Packets or Merging Packets
Packets carrying congestion indications may be split or merged in Packets carrying congestion indications may be split or merged in
some circumstances (e.g. at an RTP/RTCP transcoder or during IP some circumstances (e.g. at an RTP/RTCP transcoder or during IP
fragment reassembly). Splitting and merging only make sense in the fragment reassembly). Splitting and merging only make sense in the
context of ECN, not loss. context of ECN, not loss.
The general rule to follow is that the number of octets in packets The general rule to follow is that the number of octets in packets
skipping to change at page 23, line 22 skipping to change at page 23, line 27
Recently, two RFCs have defined changes to TCP that make it more Recently, two RFCs have defined changes to TCP that make it more
robust against losing small control packets [RFC5562] [RFC5690]. In robust against losing small control packets [RFC5562] [RFC5690]. In
both cases they note that the case for these two TCP changes would be both cases they note that the case for these two TCP changes would be
weaker if RED were biased against dropping small packets. We argue weaker if RED were biased against dropping small packets. We argue
here that these two proposals are a safer and more principled way to here that these two proposals are a safer and more principled way to
achieve TCP performance improvements than reverse engineering RED to achieve TCP performance improvements than reverse engineering RED to
benefit TCP. benefit TCP.
Although there are no known proposals, it would also be possible and Although there are no known proposals, it would also be possible and
perfectly valid to make control packets robust against drop by perfectly valid to make control packets robust against drop by
explicitly requesting a lower drop probability using their Diffserv requesting a scheduling class with lower drop probability, by re-
code point [RFC2474] to request a scheduling class with lower drop. marking to a Diffserv code point [RFC2474] within the same behaviour
aggregate.
Although not brought to the IETF, a simple proposal from Wischik Although not brought to the IETF, a simple proposal from Wischik
[DupTCP] suggests that the first three packets of every TCP flow [DupTCP] suggests that the first three packets of every TCP flow
should be routinely duplicated after a short delay. It shows that should be routinely duplicated after a short delay. It shows that
this would greatly improve the chances of short flows completing this would greatly improve the chances of short flows completing
quickly, but it would hardly increase traffic levels on the Internet, quickly, but it would hardly increase traffic levels on the Internet,
because Internet bytes have always been concentrated in the large because Internet bytes have always been concentrated in the large
flows. It further shows that the performance of many typical flows. It further shows that the performance of many typical
applications depends on completion of long serial chains of short applications depends on completion of long serial chains of short
messages. It argues that, given most of the value people get from messages. It argues that, given most of the value people get from
the Internet is concentrated within short flows, this simple the Internet is concentrated within short flows, this simple
expedient would greatly increase the value of the best efforts expedient would greatly increase the value of the best efforts
Internet at minimal cost. Internet at minimal cost. A similar but more extensive approach has
been evaluated on Google servers [GentleAggro].
The proposals discussed in this sub-section are experimental
approaches that are not yet in wide operational use, but they are
existence proofs that transports can make themselves robust against
loss of control packets. The examples are all TCP-based, but
applications over non-TCP transports could mitigate loss of control
packets by making similar use of Diffserv, data duplication, FEC etc.
4.2.4. Congestion Notification: Summary of Conflicting Advice 4.2.4. Congestion Notification: Summary of Conflicting Advice
+-----------+----------------+-----------------+--------------------+ +-----------+----------------+-----------------+--------------------+
| transport | RED_1 (packet | RED_4 (linear | RED_5 (square byte | | transport | RED_1 (packet | RED_4 (linear | RED_5 (square byte |
| cc | mode drop) | byte mode drop) | mode drop) | | cc | mode drop) | byte mode drop) | mode drop) |
+-----------+----------------+-----------------+--------------------+ +-----------+----------------+-----------------+--------------------+
| TCP or | s/sqrt(p) | sqrt(s/p) | 1/sqrt(p) | | TCP or | s/sqrt(p) | sqrt(s/p) | 1/sqrt(p) |
| TFRC | | | | | TFRC | | | |
| TFRC-SP | 1/sqrt(p) | 1/sqrt(sp) | 1/(s.sqrt(p)) | | TFRC-SP | 1/sqrt(p) | 1/sqrt(sp) | 1/(s.sqrt(p)) |
skipping to change at page 27, line 4 skipping to change at page 27, line 19
o When network equipment decides whether to drop (or mark) a packet, o When network equipment decides whether to drop (or mark) a packet,
it is recommended that the size of the particular packet should it is recommended that the size of the particular packet should
not be taken into account not be taken into account
o However, when a transport algorithm responds to a dropped or o However, when a transport algorithm responds to a dropped or
marked packet, the size of the rate reduction should be marked packet, the size of the rate reduction should be
proportionate to the size of the packet. proportionate to the size of the packet.
In summary, the answers are 'it depends', 'no' and 'yes' respectively. In summary, the answers are 'it depends', 'no' and 'yes' respectively.
For the specific case of RED, this means that byte-mode queue For the specific case of RED, this means that byte-mode queue
measurement will often be appropriate although byte-mode drop is measurement will often be appropriate but the use of byte-mode drop
strongly deprecated. is very strongly discouraged.
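The distinction between the packet-mode and byte-mode drop variants named in the table of Section 4.2.4 can be illustrated with a minimal sketch. It is not taken from the draft or from any RED implementation. The function names, the thresholds, the 1500-byte maximum packet size and the hard drop above max_th are all assumptions made for illustration. Only the scaling of drop probability with packet size (none, linear, square) follows the RED_1, RED_4 and RED_5 columns.

```python
def red_base_probability(avg_queue_bytes, min_th, max_th, max_p):
    """Drop probability from a byte-mode queue measurement.

    Assumed classic RED ramp: 0 below min_th, linear up to max_p
    at max_th, and 1 at or above max_th (thresholds in bytes).
    """
    if avg_queue_bytes < min_th:
        return 0.0
    if avg_queue_bytes >= max_th:
        return 1.0
    return max_p * (avg_queue_bytes - min_th) / (max_th - min_th)


def drop_probability(base_p, pkt_size, max_pkt_size=1500, mode="packet"):
    """Per-packet drop probability for the three variants (hypothetical helper).

    "packet"      : independent of packet size (the recommended behaviour)
    "linear_byte" : scaled by pkt_size / max_pkt_size (byte-mode drop)
    "square_byte" : scaled by (pkt_size / max_pkt_size) ** 2
    """
    if mode == "packet":
        return base_p
    if mode == "linear_byte":
        return base_p * pkt_size / max_pkt_size
    if mode == "square_byte":
        return base_p * (pkt_size / max_pkt_size) ** 2
    raise ValueError("unknown mode: " + mode)


if __name__ == "__main__":
    # Queue measured in bytes, which is compatible with packet-mode drop.
    p = red_base_probability(30000, min_th=15000, max_th=45000, max_p=0.1)
    for size in (60, 1500):
        for mode in ("packet", "linear_byte", "square_byte"):
            print(size, mode, drop_probability(p, size, mode=mode))
```

In this sketch a 60-byte packet and a 1500-byte packet see the same drop probability in packet mode, whereas both byte-mode variants drop the small packet far less often. That preferential treatment of small packets is what the recommendation advises against.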
At the transport layer the IETF should continue updating congestion At the transport layer the IETF should continue updating congestion
control protocols to take account of the size of each packet that control protocols to take account of the size of each packet that
indicates congestion. Also the IETF should continue to make indicates congestion. Also the IETF should continue to make
protocols less sensitive to losing control packets like SYNs, pure protocols less sensitive to losing control packets like SYNs, pure
ACKs and DNS exchanges. Although many control packets happen to be ACKs and DNS exchanges. Although many control packets happen to be
small, the alternative of network equipment favouring all small small, the alternative of network equipment favouring all small
packets would be dangerous. That would create perverse incentives to packets would be dangerous. That would create perverse incentives to
split data transfers into smaller packets. split data transfers into smaller packets.
skipping to change at page 29, line 5 skipping to change at page 29, line 24
www.stanford.edu/~balaji/papers/ www.stanford.edu/~balaji/papers/
01approximatefair.pdf}>. 01approximatefair.pdf}>.
[DRQ] Shin, M., Chong, S., and I. Rhee, "Dual- [DRQ] Shin, M., Chong, S., and I. Rhee, "Dual-
Resource TCP/AQM for Processing- Resource TCP/AQM for Processing-
Constrained Networks", IEEE/ACM Constrained Networks", IEEE/ACM
Transactions on Networking Vol 16, issue Transactions on Networking Vol 16, issue
2, April 2008, <http://dx.doi.org/10.1109/ 2, April 2008, <http://dx.doi.org/10.1109/
TNET.2007.900415>. TNET.2007.900415>.
[DupTCP] Wischik, D., "Short messages", Royal [DupTCP] Wischik, D., "Short messages",
Society workshop on networks: modelling Philosophical Transactions of the Royal
and control , September 2007, <http:// Society A 366(1872):1941-1953, June 2008,
www.cs.ucl.ac.uk/staff/ucacdjw/Research/ <http://rsta.royalsocietypublishing.org/
shortmsg.html>. content/366/1872/1941.full.pdf+html>.
[ECNFixedWireless] Siris, V., "Resource Control for Elastic [ECNFixedWireless] Siris, V., "Resource Control for Elastic
Traffic in CDMA Networks", Proc. ACM Traffic in CDMA Networks", Proc. ACM
MOBICOM'02 , September 2002, <http:// MOBICOM'02 , September 2002, <http://
www.ics.forth.gr/netlab/publications/ www.ics.forth.gr/netlab/publications/
resource_control_elastic_cdma.html>. resource_control_elastic_cdma.html>.
[Evol_cc] Gibbens, R. and F. Kelly, "Resource [Evol_cc] Gibbens, R. and F. Kelly, "Resource
pricing and the evolution of congestion pricing and the evolution of congestion
control", Automatica 35(12)1969--1985, control", Automatica 35(12)1969--1985,
December 1999, <http:// December 1999, <http://
www.statslab.cam.ac.uk/~frank/evol.html>. www.statslab.cam.ac.uk/~frank/evol.html>.
[GentleAggro] Flach, T., Dukkipati, N., Terzis, A.,
Raghavan, B., Cardwell, N., Cheng, Y.,
Jain, A., Hao, S., Katz-Bassett, E., and
R. Govindan, "Reducing Web Latency: the
Virtue of Gentle Aggression", ACM SIGCOMM
CCR 43(4)159--170, August 2013, <http://
doi.acm.org/10.1145/2486001.2486014>.
[I-D.nichols-tsvwg-codel] Nichols, K. and V. Jacobson, "Controlled [I-D.nichols-tsvwg-codel] Nichols, K. and V. Jacobson, "Controlled
Delay Active Queue Management", Delay Active Queue Management",
draft-nichols-tsvwg-codel-01 (work in draft-nichols-tsvwg-codel-01 (work in
progress), February 2013. progress), February 2013.
[I-D.pan-tsvwg-pie] Pan, R., Natarajan, P., Piglione, C., and [I-D.pan-tsvwg-pie] Pan, R., Natarajan, P., Piglione, C., and
M. Prabhu, "PIE: A Lightweight Control M. Prabhu, "PIE: A Lightweight Control
Scheme To Address the Bufferbloat Scheme To Address the Bufferbloat
Problem", draft-pan-tsvwg-pie-00 (work in Problem", draft-pan-tsvwg-pie-00 (work in
progress), December 2012. progress), December 2012.
skipping to change at page 40, line 5 skipping to change at page 40, line 35
across different size flows [Rate_fair_Dis]. across different size flows [Rate_fair_Dis].
Appendix D. Changes from Previous Versions Appendix D. Changes from Previous Versions
To be removed by the RFC Editor on publication. To be removed by the RFC Editor on publication.
Full incremental diffs between each version are available at Full incremental diffs between each version are available at
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-byte-pkt-congest/> <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-byte-pkt-congest/>
(courtesy of the rfcdiff tool): (courtesy of the rfcdiff tool):
From -11 to -12: Following the second pass through the IESG:
* Section 2.1 [Barry Leiba]:
+ s/No other choice makes sense,/Subject to the exceptions
below, no other choice makes sense,/
+ s/Exceptions to these recommendations MAY be necessary
/Exceptions to these recommendations may be necessary /
* Sections 3.2 and 4.2.3 [Joel Jaeggli]:
+ Added comment to section 4.2.3 that the examples given are
not in widespread production use, but they give evidence
that it is possible to follow the advice given.
+ Section 4.2.3:
- OLD: Although there are no known proposals, it would also
be possible and perfectly valid to make control packets
robust against drop by explicitly requesting a lower drop
probability using their Diffserv code point [RFC2474] to
request a scheduling class with lower drop.
NEW: Although there are no known proposals, it would also
be possible and perfectly valid to make control packets
robust against drop by requesting a scheduling class with
lower drop probability, by re-marking to a Diffserv code
point [RFC2474] within the same behaviour aggregate.
- appended "Similarly, applications over non-TCP transports
could make any packets that are effectively control
packets more robust by using Diffserv, data duplication,
FEC etc."
+ Updated Wischik ref and added "Reducing Web Latency: the
Virtue of Gentle Aggression" ref.
* Expanded more abbreviations (CoDel, PIE, MTU).
* Section 1. Intro [Stephen Farrell]:
+ In the places where the doc describes the dichotomy between
'long-term goal' and 'expediency' the words long term goal
and expedient have been introduced, to more explicitly refer
back to this introductory para (S.2.1 & S.2.3).
+ Added explanation of what scaling with packet size means.
* Conclusions [Benoit Claise]:
+ OLD: For the specific case of RED, this means that byte-mode
queue measurement will often be appropriate although byte-
mode drop is strongly deprecated.
NEW: For the specific case of RED, this means that byte-mode
queue measurement will often be appropriate but the use of
byte-mode drop is very strongly discouraged.
From -10 to -11: Following a further WGLC: From -10 to -11: Following a further WGLC:
* Abstract: clarified that advice applies to all AQMs including * Abstract: clarified that advice applies to all AQMs including
newer ones newer ones
* Abstract & Intro: changed 'read' to 'detect', because you don't * Abstract & Intro: changed 'read' to 'detect', because you don't
read losses, you detect them. read losses, you detect them.
* S.1. Introduction: Disambiguated summary of advice on queue * S.1. Introduction: Disambiguated summary of advice on queue
measurement. measurement.
 End of changes. 26 change blocks. 
64 lines changed or deleted 142 lines changed or added
