draft-ietf-tsvwg-byte-pkt-congest-05.txt   draft-ietf-tsvwg-byte-pkt-congest-06.txt 
Transport Area Working Group B. Briscoe Transport Area Working Group B. Briscoe
Internet-Draft BT Internet-Draft BT
Updates: 2309 (if approved) J. Manner Updates: 2309 (if approved) J. Manner
Intended status: BCP Aalto University Intended status: BCP Aalto University
Expires: May 3, 2012 October 31, 2011 Expires: August 23, 2012 February 20, 2012
Byte and Packet Congestion Notification Byte and Packet Congestion Notification
draft-ietf-tsvwg-byte-pkt-congest-05 draft-ietf-tsvwg-byte-pkt-congest-06
Abstract Abstract
This memo concerns dropping or marking packets using active queue This memo concerns dropping or marking packets using active queue
management (AQM) such as random early detection (RED) or pre- management (AQM) such as random early detection (RED) or pre-
congestion notification (PCN). We give three strong recommendations: congestion notification (PCN). We give three strong recommendations:
(1) packet size should be taken into account when transports read and (1) packet size should be taken into account when transports read and
respond to congestion indications, (2) packet size should not be respond to congestion indications, (2) packet size should not be
taken into account when network equipment creates congestion signals taken into account when network equipment creates congestion signals
(marking, dropping), and therefore (3) the byte-mode packet drop (marking, dropping), and therefore (3) the byte-mode packet drop
skipping to change at page 1, line 39 skipping to change at page 1, line 39
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 3, 2012. This Internet-Draft will expire on August 23, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 37 skipping to change at page 2, line 37
4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 16 4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 16
4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 16 4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 16
4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 17 4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 17
4.1.2. Congestion Measurement without a Queue . . . . . . . . 18 4.1.2. Congestion Measurement without a Queue . . . . . . . . 18
4.2. Congestion Notification Advice . . . . . . . . . . . . . . 19 4.2. Congestion Notification Advice . . . . . . . . . . . . . . 19
4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 19 4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 19
4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 21 4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 21
4.2.3. Making Transports Robust against Control Packet 4.2.3. Making Transports Robust against Control Packet
Losses . . . . . . . . . . . . . . . . . . . . . . . . 22 Losses . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.4. Congestion Notification: Summary of Conflicting 4.2.4. Congestion Notification: Summary of Conflicting
Advice . . . . . . . . . . . . . . . . . . . . . . . . 23 Advice . . . . . . . . . . . . . . . . . . . . . . . . 22
5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 24 5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 24
5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 24 5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 24
5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 24 5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 24
6. Security Considerations . . . . . . . . . . . . . . . . . . . 24 6. Security Considerations . . . . . . . . . . . . . . . . . . . 24
7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 25 7. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 25
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26
9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27 9. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27
10.1. Normative References . . . . . . . . . . . . . . . . . . . 27 10.1. Normative References . . . . . . . . . . . . . . . . . . . 27
10.2. Informative References . . . . . . . . . . . . . . . . . . 27 10.2. Informative References . . . . . . . . . . . . . . . . . . 27
skipping to change at page 7, line 34 skipping to change at page 7, line 34
measurement" or "byte-mode queue measurement". And whether the measurement" or "byte-mode queue measurement". And whether the
probability of dropping a particular packet is independent or probability of dropping a particular packet is independent or
dependent on its byte-size is called respectively "packet-mode dependent on its byte-size is called respectively "packet-mode
drop" or "byte-mode drop". The terms byte-mode and packet-mode drop" or "byte-mode drop". The terms byte-mode and packet-mode
should not be used without specifying whether they apply to queue should not be used without specifying whether they apply to queue
measurement or to drop. measurement or to drop.
1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop 1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop
A central question addressed by this document is whether to recommend A central question addressed by this document is whether to recommend
RED's packet-mode drop and to deprecate byte-mode drop. Table 1 that AQM uses RED's packet-mode drop and to deprecate byte-mode drop.
compares how packet-mode and byte-mode drop affect two flows of Table 1 compares how packet-mode and byte-mode drop affect two flows
different size packets. For each it gives the expected number of of different size packets. For each it gives the expected number of
packets and of bits dropped in one second. Each example flow runs at packets and of bits dropped in one second. Each example flow runs at
the same bit-rate of 48Mb/s, but one is broken up into small 60 byte the same bit-rate of 48Mb/s, but one is broken up into small 60 byte
packets and the other into large 1500 byte packets. packets and the other into large 1500 byte packets.
To keep up the same bit-rate, in one second there are about 25 times To keep up the same bit-rate, in one second there are about 25 times
more small packets because they are 25 times smaller. As can be seen more small packets because they are 25 times smaller. As can be seen
from the table, the packet rate is 100,000 small packets versus 4,000 from the table, the packet rate is 100,000 small packets versus 4,000
large packets per second (pps). large packets per second (pps).
Parameter Formula Small packets Large packets Parameter Formula Small packets Large packets
skipping to change at page 9, line 4 skipping to change at page 8, line 49
that it drops a small packet 25 times smaller at 0.004%. But there that it drops a small packet 25 times smaller at 0.004%. But there
are 25 times more small packets, so dropping them with 25 times lower are 25 times more small packets, so dropping them with 25 times lower
probability results in dropping the same number of packets: 4 drops probability results in dropping the same number of packets: 4 drops
in both cases. The 4 small dropped packets contain 25 times less in both cases. The 4 small dropped packets contain 25 times less
bits than the 4 large dropped packets: 1,920 compared to 48,000. bits than the 4 large dropped packets: 1,920 compared to 48,000.
The byte-mode drop algorithm drops any bit with a probability The byte-mode drop algorithm drops any bit with a probability
proportionate to the size of the packet it is in. proportionate to the size of the packet it is in.
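The arithmetic of this example can be reproduced with a short sketch. The 48Mb/s rate, the 60 and 1500 byte packet sizes and the 0.1% drop probability come from the example above. The function name and the use of 1500 bytes as the reference size for scaling the byte-mode probability are assumptions made for illustration only.

```python
from fractions import Fraction

BIT_RATE = 48_000_000               # bits per second, the same for both flows
BASE_DROP_PROB = Fraction(1, 1000)  # 0.1%
REF_SIZE = 1500                     # bytes; assumed reference size for byte-mode scaling


def expected_drops(pkt_bytes, byte_mode):
    """Return (packets/s, packets dropped/s, bits dropped/s) for one flow."""
    pkt_bits = pkt_bytes * 8
    pps = Fraction(BIT_RATE, pkt_bits)
    prob = BASE_DROP_PROB
    if byte_mode:
        # Byte-mode drop scales the probability with the packet's size.
        prob = prob * Fraction(pkt_bytes, REF_SIZE)
    pkts_dropped = pps * prob
    return pps, pkts_dropped, pkts_dropped * pkt_bits


for size in (60, 1500):
    for byte_mode in (False, True):
        pps, pkts, bits = expected_drops(size, byte_mode)
        mode = "byte-mode" if byte_mode else "packet-mode"
        print(f"{size:>4}B {mode:<11} pps={int(pps)} drops={int(pkts)} bits={int(bits)}")
```

Under these assumptions, packet-mode drop sheds the same number of bits from each flow, while byte-mode drop sheds the same number of packets and therefore 25 times fewer bits from the small-packet flow.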
2. Recommendations 2. Recommendations
This section gives recommendations related to network equipment in
Sections 2.1 and 2.2, and in Sections 2.3 and 2.4 we discuss the
implications on the transport protocols.
2.1. Recommendation on Queue Measurement 2.1. Recommendation on Queue Measurement
Queue length is usually the most correct and simplest way to measure Queue length is usually the most correct and simplest way to measure
congestion of a resource. To avoid the pathological effects of drop congestion of a resource. To avoid the pathological effects of drop
tail, an AQM function can then be used to transform queue length into tail, an AQM function can then be used to transform queue length into
the probability of dropping or marking a packet (e.g. RED's the probability of dropping or marking a packet (e.g. RED's
piecewise linear function between thresholds). piecewise linear function between thresholds).
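As a rough illustration of such a transform, the sketch below follows the commonly described shape of classic RED: zero below a minimum threshold, a linear ramp up to a maximum probability, and certain drop at or above the maximum threshold. The function and parameter names are assumptions, not taken from this memo, and the averaging of the queue length is omitted. The queue length and both thresholds are assumed to share one unit, either bytes or packets.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Piecewise linear drop/mark probability in the style of classic RED.

    avg_queue, min_th and max_th must all be in the same unit
    (bytes or packets); max_p is the probability reached at max_th.
    """
    if avg_queue < min_th:
        return 0.0          # no congestion signal below the lower threshold
    if avg_queue >= max_th:
        return 1.0          # every arriving packet is dropped or marked
    # Linear ramp between the two thresholds.
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

Note that the size of the arriving packet does not appear in this function; only the choice of unit for the queue measurement differs between bit-congestible and packet-congestible resources.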
If the resource is bit-congestible, the implementation SHOULD measure If the resource is bit-congestible, the implementation SHOULD measure
the length of the queue in bytes. If the resource is packet- the length of the queue in bytes. If the resource is packet-
congestible, the implementation SHOULD measure the length of the congestible, the implementation SHOULD measure the length of the
queue in packets. No other choice makes sense, because the number of queue in packets. No other choice makes sense, because the number of
packets waiting in the queue isn't relevant if the resource gets packets waiting in the queue isn't relevant if the resource gets
congested by bytes and vice versa. congested by bytes and vice versa.
Corollaries: What this advice means for the case of RED:
1. A RED implementation SHOULD use byte mode queue measurement for 1. A RED implementation SHOULD use byte mode queue measurement for
measuring the congestion of bit-congestible resources and packet measuring the congestion of bit-congestible resources and packet
mode queue measurement for packet-congestible resources. mode queue measurement for packet-congestible resources.
2. An implementation SHOULD NOT make it possible to configure the 2. An implementation SHOULD NOT make it possible to configure the
way a queue measures itself, because whether a queue is bit- way a queue measures itself, because whether a queue is bit-
congestible or packet-congestible is an inherent property of the congestible or packet-congestible is an inherent property of the
queue. queue.
skipping to change at page 9, line 50 skipping to change at page 9, line 51
on the size of the packet in question. As the example in Section 1.2 on the size of the packet in question. As the example in Section 1.2
illustrates, to drop any bit with probability 0.1% it is only illustrates, to drop any bit with probability 0.1% it is only
necessary to drop every packet with probability 0.1% without regard necessary to drop every packet with probability 0.1% without regard
to the size of each packet. to the size of each packet.
This approach ensures the network layer offers sufficient congestion This approach ensures the network layer offers sufficient congestion
information for all known and future transport protocols and also information for all known and future transport protocols and also
ensures no perverse incentives are created that would encourage ensures no perverse incentives are created that would encourage
transports to use inappropriately small packet sizes. transports to use inappropriately small packet sizes.
Corollaries: What this advice means for the case of TCP:
1. AQM algorithms such as RED SHOULD NOT use byte-mode drop, which 1. AQM algorithms such as RED SHOULD NOT use byte-mode drop, which
deflates RED's drop probability for smaller packet sizes. RED's deflates RED's drop probability for smaller packet sizes. RED's
byte-mode drop has no enduring advantages. It is more complex, byte-mode drop has no enduring advantages. It is more complex,
it creates the perverse incentive to fragment segments into tiny it creates the perverse incentive to fragment segments into tiny
pieces and it reopens the vulnerability to floods of small- pieces and it reopens the vulnerability to floods of small-
packets that drop-tail queues suffered from and AQM was designed packets that drop-tail queues suffered from and AQM was designed
to remove. to remove.
2. If a vendor has implemented byte-mode drop, and an operator has 2. If a vendor has implemented byte-mode drop, and an operator has
turned it on, it is RECOMMENDED to turn it off. Note that RED as turned it on, it is RECOMMENDED to turn it off. Note that RED as
a whole SHOULD NOT be turned off, as without it, a drop tail a whole SHOULD NOT be turned off, as without it, a drop tail
queue also biases against large packets. But note also that queue also biases against large packets. But note also that
turning off byte-mode drop may alter the relative performance of turning off byte-mode drop may alter the relative performance of
applications using different packet sizes, so it would be applications using different packet sizes, so it would be
advisable to establish the implications before turning it off. advisable to establish the implications before turning it off.
NOTE WELL that RED's byte-mode queue drop is completely Note well that RED's byte-mode queue drop is completely
orthogonal to byte-mode queue measurement and should not be orthogonal to byte-mode queue measurement and should not be
confused with it. If a RED implementation has a byte-mode but confused with it. If a RED implementation has a byte-mode but
does not specify what sort of byte-mode, it is most probably does not specify what sort of byte-mode, it is most probably
byte-mode queue measurement, which is fine. However, if in byte-mode queue measurement, which is fine. However, if in
doubt, the vendor should be consulted. doubt, the vendor should be consulted.
A survey (Appendix A) showed that there appears to be little, if any, A survey (Appendix A) showed that there appears to be little, if any,
installed base of the byte-mode drop variant of RED. This suggests installed base of the byte-mode drop variant of RED. This suggests
that deprecating byte-mode drop will have little, if any, incremental that deprecating byte-mode drop will have little, if any, incremental
deployment impact. deployment impact.
skipping to change at page 11, line 36 skipping to change at page 11, line 36
merging or splitting. This is based on the principle used above; merging or splitting. This is based on the principle used above;
that an indication of congestion on a packet can be considered as an that an indication of congestion on a packet can be considered as an
indication of congestion on each octet of the packet. indication of congestion on each octet of the packet.
The above rule is not phrased with the word "MUST" to allow the The above rule is not phrased with the word "MUST" to allow the
following exception. There are cases where pre-existing protocols following exception. There are cases where pre-existing protocols
were not designed to conserve congestion marked octets (e.g. IP were not designed to conserve congestion marked octets (e.g. IP
fragment reassembly [RFC3168] or loss statistics in RTCP receiver fragment reassembly [RFC3168] or loss statistics in RTCP receiver
reports [RFC3550] before ECN was added reports [RFC3550] before ECN was added
[I-D.ietf-avtcore-ecn-for-rtp]). When any such protocol is updated, [I-D.ietf-avtcore-ecn-for-rtp]). When any such protocol is updated,
it SHOULD comply with the above rule to conserved marked octets. it SHOULD comply with the above rule to conserve marked octets.
However, the rule may be relaxed if it would otherwise become too However, the rule may be relaxed if it would otherwise become too
complex to interoperate with pre-existing implementations of the complex to interoperate with pre-existing implementations of the
protocol. protocol.
One can think of a splitting or merging process as if all the One can think of a splitting or merging process as if all the
incoming congestion-marked octets increment a counter and all the incoming congestion-marked octets increment a counter and all the
outgoing marked octets decrement the same counter. In order to outgoing marked octets decrement the same counter. In order to
ensure that congestion indications remain timely, even the smallest ensure that congestion indications remain timely, even the smallest
positive remainder in the conceptual counter should trigger the next positive remainder in the conceptual counter should trigger the next
outgoing packet to be marked (causing the counter to go negative). outgoing packet to be marked (causing the counter to go negative).
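The conceptual counter described above could be sketched as follows. The class and method names are hypothetical, and a real splitting or merging function would need to fit this into its own packet handling.

```python
class MarkedOctetCounter:
    """Conceptual balance of congestion-marked octets across a split/merge."""

    def __init__(self):
        self.balance = 0  # marked octets received minus marked octets sent

    def on_incoming(self, size_octets, marked):
        """Account for an incoming packet; only marked packets add octets."""
        if marked:
            self.balance += size_octets

    def on_outgoing(self, size_octets):
        """Return True if the next outgoing packet should be marked."""
        if self.balance > 0:
            # Even the smallest positive remainder triggers a mark, which
            # may drive the balance negative until more marks arrive.
            self.balance -= size_octets
            return True
        return False
```

For example, if one marked 500 octet packet is merged into a 1500 octet outgoing packet, the outgoing packet is marked and the balance becomes -1000, so the next two marked 500 octet arrivals do not cause further marks.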
3. Motivating Arguments 3. Motivating Arguments
In this section, we justify the recommendations given in the previous This section is informative. It justifies the recommendations given
section. in the previous section.
3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets 3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets
Increasingly, it is being recognised that a protocol design must take Increasingly, it is being recognised that a protocol design must take
care not to cause unintended consequences by giving the parties in care not to cause unintended consequences by giving the parties in
the protocol exchange perverse incentives [Evol_cc][RFC3426]. Given the protocol exchange perverse incentives [Evol_cc][RFC3426]. Given
there are many good reasons why larger path max transmission units there are many good reasons why larger path maximum transmission
(PMTUs) would help solve a number of scaling issues, we do not want units (PMTUs) would help solve a number of scaling issues, we do not
to create any bias against large packets that is greater than their want to create any bias against large packets that is greater than
true cost. their true cost.
Imagine a scenario where the same bit rate of packets will contribute Imagine a scenario where the same bit rate of packets will contribute
the same to bit-congestion of a link irrespective of whether it is the same to bit-congestion of a link irrespective of whether it is
sent as fewer larger packets or more smaller packets. A protocol sent as fewer larger packets or more smaller packets. A protocol
design that caused larger packets to be more likely to be dropped design that caused larger packets to be more likely to be dropped
than smaller ones would be dangerous in this case: than smaller ones would be dangerous in both the following cases:
Malicious transports: A queue that gives an advantage to small Malicious transports: A queue that gives an advantage to small
packets can be used to amplify the force of a flooding attack. By packets can be used to amplify the force of a flooding attack. By
sending a flood of small packets, the attacker can get the queue sending a flood of small packets, the attacker can get the queue
to discard more traffic in large packets, allowing more attack to discard more traffic in large packets, allowing more attack
traffic to get through to cause further damage. Such a queue traffic to get through to cause further damage. Such a queue
allows attack traffic to have a disproportionately large effect on allows attack traffic to have a disproportionately large effect on
regular traffic without the attacker having to do much work. regular traffic without the attacker having to do much work.
Non-malicious transports: Even if a transport designer is not Non-malicious transports: Even if a transport designer is not
skipping to change at page 15, line 27 skipping to change at page 15, line 27
Imagine a bit-congestible link shared by many flows, so that each Imagine a bit-congestible link shared by many flows, so that each
busy period tends to cause packets to be lost from different flows. busy period tends to cause packets to be lost from different flows.
Consider further two sources that have the same data rate but break Consider further two sources that have the same data rate but break
the load into large packets in one application (A) and small packets the load into large packets in one application (A) and small packets
in the other (B). Of course, because the load is the same, there in the other (B). Of course, because the load is the same, there
will be proportionately more packets in the small packet flow (B). will be proportionately more packets in the small packet flow (B).
If a congestion control scales with packet size it should respond in If a congestion control scales with packet size it should respond in
the same way to the same congestion notification, irrespective of the the same way to the same congestion notification, irrespective of the
size of the packets that the bytes causing congestion happen to be size of the packets containing the bytes that contribute to
broken down into. congestion.
A bit-congestible queue suffering congestion has to drop or mark the A bit-congestible queue suffering congestion has to drop or mark the
same excess bytes whether they are in a few large packets (A) or many same excess bytes whether they are in a few large packets (A) or many
small packets (B). So for the same amount of congestion overload, small packets (B). So for the same amount of congestion overload,
the same amount of bytes has to be shed to get the load back to its the same amount of bytes has to be shed to get the load back to its
operating point. But, of course, for smaller packets (B) more operating point. For smaller packets (B) more packets will have to
packets will have to be discarded to shed the same bytes. be discarded to shed the same bytes.
If both the transports interpret each drop/mark as a single loss If both the transports interpret each drop/mark as a single loss
event irrespective of the size of the packet dropped, the flow of event irrespective of the size of the packet dropped, the flow of
smaller packets (B) will respond more times to the same congestion. smaller packets (B) will respond more times to the same congestion.
On the other hand, if a transport responds proportionately less when On the other hand, if a transport responds proportionately less when
smaller packets are dropped/marked, overall it will be able to smaller packets are dropped/marked, overall it will be able to
respond the same to the same amount of congestion. respond the same to the same amount of congestion.
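A small numerical sketch of this comparison follows. The 6000 byte overload is a hypothetical figure, and the 1500 and 60 byte packet sizes for flows A and B are borrowed from the example in Section 1.2 as an assumption; this section itself only says "large" and "small".

```python
def packets_to_shed(excess_bytes, pkt_bytes):
    """Whole packets that must be dropped to shed at least excess_bytes."""
    return -(-excess_bytes // pkt_bytes)  # ceiling division


EXCESS = 6000  # bytes of overload; hypothetical value

for name, size in (("A", 1500), ("B", 60)):
    drops = packets_to_shed(EXCESS, size)
    # A transport counting drops sees `drops` loss events;
    # a transport counting dropped bytes sees `drops * size`.
    print(name, "loss events:", drops, "bytes lost:", drops * size)
```

With these figures, flow A sees 4 drops and flow B sees 100 drops, yet both lose 6000 bytes. A transport that responds per drop reacts 25 times more often in flow B, whereas a transport that responds per dropped byte reacts equally in both.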
Therefore, for a congestion control to scale with packet size it Therefore, for a congestion control to scale with packet size it
should respond to dropped or marked bytes (as TFRC-SP [RFC4828] should respond to dropped or marked bytes (as TFRC-SP [RFC4828]
skipping to change at page 17, line 5 skipping to change at page 17, line 5
The rest of this section is structured accordingly. The rest of this section is structured accordingly.
4.1. Congestion Measurement Advice 4.1. Congestion Measurement Advice
The choice of which metric to use to measure queue length was left The choice of which metric to use to measure queue length was left
open in RFC2309. It is now well understood that queues for bit- open in RFC2309. It is now well understood that queues for bit-
congestible resources should be measured in bytes, and queues for congestible resources should be measured in bytes, and queues for
packet-congestible resources should be measured in packets packet-congestible resources should be measured in packets
[pktByteEmail]. [pktByteEmail].
Some modern queue implementations give a choice for setting RED's Congestion in some legacy bit-congestible buffers is only measured in
thresholds in byte-mode or packet-mode. This may merely be an packets not bytes. In such cases, the operator has to set the
administrator-interface preference, not altering how the queue itself thresholds mindful of a typical mix of packets sizes. Any AQM
is measured but on some hardware it does actually change the way it algorithm on such a buffer will be oversensitive to high proportions
measures its queue. Whether a resource is bit-congestible or packet- of small packets, e.g. a DoS attack, and undersensitive to high
congestible is a property of the resource, so an admin should not proportions of large packets. However, there is no need to make
ever need to, or be able to, configure the way a queue measures allowances for the possibility of such legacy in future protocol
itself.
NOTE: Congestion in some legacy bit-congestible buffers is only
measured in packets not bytes. In such cases, the operator has to
set the thresholds mindful of a typical mix of packets sizes. Any
AQM algorithm on such a buffer will be oversensitive to high
proportions of small packets, e.g. a DoS attack, and undersensitive
to high proportions of large packets. However, there is no need to
make allowances for the possibility of such legacy in future protocol
design. This is safe because any undersensitivity during unusual design. This is safe because any undersensitivity during unusual
traffic mixes cannot lead to congestion collapse given the buffer traffic mixes cannot lead to congestion collapse given the buffer
will eventually revert to tail drop, discarding proportionately more will eventually revert to tail drop, discarding proportionately more
large packets. large packets.
4.1.1. Fixed Size Packet Buffers 4.1.1. Fixed Size Packet Buffers
The question of whether to measure queues in bytes or packets seems The question of whether to measure queues in bytes or packets seems
to be well understood. However, measuring congestion is not to be well understood. However, measuring congestion is not
straightforward when the resource is bit congestible but the queue is straightforward when the resource is bit congestible but the queue is
skipping to change at page 24, line 44 skipping to change at page 24, line 42
The position is much less clear-cut if the Internet becomes populated The position is much less clear-cut if the Internet becomes populated
by a more even mix of both packet-congestible and bit-congestible by a more even mix of both packet-congestible and bit-congestible
resources (see Appendix B.2). This problem is not pressing, because resources (see Appendix B.2). This problem is not pressing, because
most Internet resources are designed to be bit-congestible before most Internet resources are designed to be bit-congestible before
packet processing starts to congest (see Section 1.1). packet processing starts to congest (see Section 1.1).
The IRTF Internet congestion control research group (ICCRG) has set The IRTF Internet congestion control research group (ICCRG) has set
itself the task of reaching consensus on generic forwarding itself the task of reaching consensus on generic forwarding
mechanisms that are necessary and sufficient to support the mechanisms that are necessary and sufficient to support the
Internet's future congestion control requirements (the first Internet's future congestion control requirements (the first
challenge in [RFC6077]). Therefore, we defer the question of whether challenge in [RFC6077]). The research question of whether packet
packet congestion might become common and what to do if it does to congestion might become common and what to do if it does may in the
the IRTF (the 'Small Packets' challenge in [RFC6077]). future be explored in the IRTF (the "Challenge 3: Packet Size" in
[RFC6077]).
6. Security Considerations 6. Security Considerations
This memo recommends that queues do not bias drop probability towards This memo recommends that queues do not bias drop probability towards
small packets as this creates a perverse incentive for transports to small packets as this creates a perverse incentive for transports to
break down their flows into tiny segments. One of the benefits of break down their flows into tiny segments. One of the benefits of
implementing AQM was meant to be to remove this perverse incentive implementing AQM was meant to be to remove this perverse incentive
that drop-tail queues gave to small packets. that drop-tail queues gave to small packets.
In practice, transports cannot all be trusted to respond to In practice, transports cannot all be trusted to respond to
skipping to change at page 25, line 38 skipping to change at page 25, line 37
summary, it says that making drop probability depend on the size of summary, it says that making drop probability depend on the size of
the packets that bits happen to be divided into simply encourages the the packets that bits happen to be divided into simply encourages the
bits to be divided into smaller packets. Byte-mode drop would bits to be divided into smaller packets. Byte-mode drop would
therefore irreversibly complicate any attempt to fix the Internet's therefore irreversibly complicate any attempt to fix the Internet's
incentive structures. incentive structures.
7. Conclusions 7. Conclusions
This memo identifies the three distinct stages of the congestion This memo identifies the three distinct stages of the congestion
notification process where implementations need to decide whether to notification process where implementations need to decide whether to
take packet size into account. The recommendation of this memo is take packet size into account. The recommendations provided in
different in each case: Section 2 of this memo are different in each case:
o When network equipment measures the length of a queue, whether it o When network equipment measures the length of a queue, whether it
counts in bytes or packets depends on whether the network resource counts in bytes or packets depends on whether the network resource
is congested respectively by bytes or by packets. is congested respectively by bytes or by packets.
o When network equipment decides whether to drop (or mark) a packet, o When network equipment decides whether to drop (or mark) a packet,
it is recommended that the size of the particular packet should it is recommended that the size of the particular packet should
not be taken into account. not be taken into account.
o However, when a transport algorithm responds to a dropped or o However, when a transport algorithm responds to a dropped or
marked packet, the size of the rate reduction should be marked packet, the size of the rate reduction should be
proportionate to the size of the packet. proportionate to the size of the packet.
In summary, the answers are 'it depends', 'no' and 'yes' respectively. In summary, the answers are 'it depends', 'no' and 'yes' respectively.
This means that RED's byte-mode queue measurement will often be For the specific case of RED, this means that byte-mode queue
appropriate although byte-mode drop is strongly deprecated. measurement will often be appropriate although byte-mode drop is
strongly deprecated.
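The three stages summarised above can be illustrated with a minimal sketch. It is not taken from the draft: the function names, the simplified RED-like linear ramp, and the parameters (min_th, max_th, max_p, reduction_per_byte) are assumptions chosen only to show where packet size does and does not enter.

```python
def queue_length(packet_sizes, byte_mode=True):
    """Stage 1: measure the queue in bytes or in packets.

    Which mode is right depends on whether the resource is
    bit-congestible (byte_mode=True) or packet-congestible.
    """
    return sum(packet_sizes) if byte_mode else len(packet_sizes)


def drop_probability(queue_len, min_th, max_th, max_p):
    """Stage 2: simplified RED-like ramp (an assumption, not full RED).

    The size of the arriving packet is deliberately NOT a parameter:
    this is packet-mode drop, as the memo recommends.
    """
    if queue_len <= min_th:
        return 0.0
    if queue_len >= max_th:
        return 1.0
    return max_p * (queue_len - min_th) / (max_th - min_th)


def rate_reduction(marked_packet_size, reduction_per_byte):
    """Stage 3: the transport's response scales with the size of the
    dropped or marked packet (hypothetical linear scaling)."""
    return marked_packet_size * reduction_per_byte


# Usage: a queue holding one large and two small packets.
queue = [1500, 60, 60]
q_bytes = queue_length(queue, byte_mode=True)     # byte-mode measurement
q_packets = queue_length(queue, byte_mode=False)  # packet-mode measurement
p = drop_probability(q_bytes, min_th=1000, max_th=3000, max_p=0.1)
```

In this sketch every arriving packet faces the same probability p whatever its size, while a transport that loses a 1500-byte packet reduces its rate 25 times more than one that loses a 60-byte packet.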
At the transport layer the IETF should continue updating congestion At the transport layer the IETF should continue updating congestion
control protocols to take account of the size of each packet that control protocols to take account of the size of each packet that
indicates congestion. Also the IETF should continue to make indicates congestion. Also the IETF should continue to make
protocols less sensitive to losing control packets like SYNs, pure protocols less sensitive to losing control packets like SYNs, pure
ACKs and DNS exchanges. Although many control packets happen to be ACKs and DNS exchanges. Although many control packets happen to be
small, the alternative of network equipment favouring all small small, the alternative of network equipment favouring all small
packets would be dangerous. That would create perverse incentives to packets would be dangerous. That would create perverse incentives to
split data transfers into smaller packets. split data transfers into smaller packets.
skipping to change at page 26, line 32 skipping to change at page 26, line 33
such as specific buffer architectures and incremental deployment. such as specific buffer architectures and incremental deployment.
Indeed a limited survey of RED implementations is discussed, which Indeed a limited survey of RED implementations is discussed, which
shows there appears to be little, if any, installed base of RED's shows there appears to be little, if any, installed base of RED's
byte-mode drop. Therefore it can be deprecated with little, if any, byte-mode drop. Therefore it can be deprecated with little, if any,
incremental deployment complications. incremental deployment complications.
The recommendations have been developed on the well-founded basis The recommendations have been developed on the well-founded basis
that most Internet resources are bit-congestible not packet- that most Internet resources are bit-congestible not packet-
congestible. We need to know the likelihood that this assumption congestible. We need to know the likelihood that this assumption
will prevail longer term and, if it might not, what protocol changes will prevail longer term and, if it might not, what protocol changes
will be needed to cater for a mix of the two. This problem is will be needed to cater for a mix of the two. The IRTF Internet
deferred to the IRTF Internet Congestion Control Research Group Congestion Control Research Group (ICCRG) is currently working on
(ICCRG). these problems [RFC6077].
8. Acknowledgements 8. Acknowledgements
Thank you to Sally Floyd, who gave extensive and useful review Thank you to Sally Floyd, who gave extensive and useful review
comments. Also thanks for the reviews from Philip Eardley, David comments. Also thanks for the reviews from Philip Eardley, David
Black, Fred Baker, Toby Moncaster, Arnaud Jacquet and Mirja Black, Fred Baker, Toby Moncaster, Arnaud Jacquet and Mirja
Kuehlewind as well as helpful explanations of different hardware Kuehlewind as well as helpful explanations of different hardware
approaches from Larry Dunn and Fred Baker. We are grateful to Bruce approaches from Larry Dunn and Fred Baker. We are grateful to Bruce
Davie and his colleagues for providing a timely and efficient survey Davie and his colleagues for providing a timely and efficient survey
of RED implementation in Cisco's product range. Also grateful thanks of RED implementation in Cisco's product range. Also grateful thanks
to Toby Moncaster, Will Dormann, John Regnault, Simon Carter and to Toby Moncaster, Will Dormann, John Regnault, Simon Carter and
Stefaan De Cnodder who further helped survey the current status of Stefaan De Cnodder who further helped survey the current status of
RED implementation and deployment and, finally, thanks to the RED implementation and deployment and, finally, thanks to the
anonymous individuals who responded. anonymous individuals who responded.
Bob Briscoe and Jukka Manner are partly funded by Trilogy, a research Bob Briscoe and Jukka Manner were partly funded by Trilogy, a
project (ICT-216372) supported by the European Community under its research project (ICT-216372) supported by the European Community
Seventh Framework Programme. The views expressed here are those of under its Seventh Framework Programme. The views expressed here are
the authors only. those of the authors only.
9. Comments Solicited 9. Comments Solicited
Comments and questions are encouraged and very welcome. They can be Comments and questions are encouraged and very welcome. They can be
addressed to the IETF Transport Area working group mailing list addressed to the IETF Transport Area working group mailing list
<tsvwg@ietf.org>, and/or to the authors. <tsvwg@ietf.org>, and/or to the authors.
10. References 10. References
10.1. Normative References 10.1. Normative References
skipping to change at page 28, line 41 skipping to change at page 28, line 42
congestion control", congestion control",
Automatica 35(12)1969--1985, Automatica 35(12)1969--1985,
December 1999, <http:// December 1999, <http://
www.statslab.cam.ac.uk/~frank/ www.statslab.cam.ac.uk/~frank/
evol.html>. evol.html>.
[I-D.ietf-avtcore-ecn-for-rtp] Westerlund, M., Johansson, I., [I-D.ietf-avtcore-ecn-for-rtp] Westerlund, M., Johansson, I.,
Perkins, C., O'Hanlon, P., and K. Perkins, C., O'Hanlon, P., and K.
Carlberg, "Explicit Congestion Carlberg, "Explicit Congestion
Notification (ECN) for RTP over UDP", Notification (ECN) for RTP over UDP",
draft-ietf-avtcore-ecn-for-rtp-04 draft-ietf-avtcore-ecn-for-rtp-06
(work in progress), July 2011. (work in progress), February 2012.
[I-D.ietf-conex-concepts-uses] Briscoe, B., Woundy, R., and A. [I-D.ietf-conex-concepts-uses] Briscoe, B., Woundy, R., Moncaster,
Cooper, "ConEx Concepts and Use T., and J. Leslie, "ConEx Concepts
Cases", and Use Cases",
draft-ietf-conex-concepts-uses-03 draft-ietf-conex-concepts-uses-00
(work in progress), October 2011. (work in progress), November 2010.
[IOSArch] Bollapragada, V., White, R., and C. [IOSArch] Bollapragada, V., White, R., and C.
Murphy, "Inside Cisco IOS Software Murphy, "Inside Cisco IOS Software
Architecture", Cisco Press: CCIE Architecture", Cisco Press: CCIE
Professional Development ISBN13: 978- Professional Development ISBN13: 978-
1-57870-181-0, July 2000. 1-57870-181-0, July 2000.
[PktSizeEquCC] Vasallo, P., "Variable Packet Size [PktSizeEquCC] Vasallo, P., "Variable Packet Size
Equation-Based Congestion Control", Equation-Based Congestion Control",
ICSI Technical Report tr-00-008, ICSI Technical Report tr-00-008,
2000, <http://http.icsi.berkeley.edu/ 2000, <http://http.icsi.berkeley.edu/
ftp/global/pub/techreports/2000/ ftp/global/pub/techreports/2000/
skipping to change at page 31, line 14 skipping to change at page 31, line 14
[pBox] Floyd, S. and K. Fall, "Promoting the [pBox] Floyd, S. and K. Fall, "Promoting the
Use of End-to-End Congestion Control Use of End-to-End Congestion Control
in the Internet", IEEE/ACM in the Internet", IEEE/ACM
Transactions on Networking 7(4) 458-- Transactions on Networking 7(4) 458--
472, August 1999, <http:// 472, August 1999, <http://
www.aciri.org/floyd/ www.aciri.org/floyd/
end2end-paper.html>. end2end-paper.html>.
[pktByteEmail] Floyd, S., "RED: Discussions of Byte [pktByteEmail] Floyd, S., "RED: Discussions of Byte
and Packet Modes", email , and Packet Modes", Web page Red Queue
March 1997, <http:// Management, March 1997, <Available
www-nrg.ee.lbl.gov/floyd/ at: http://ee.lbl.gov/floyd/
REDaveraging.txt>. REDaveraging.txt>.
Appendix A. Survey of RED Implementation Status Appendix A. Survey of RED Implementation Status
This Appendix is informative, not normative. This Appendix is informative, not normative.
In May 2007 a survey was conducted of 84 vendors to assess how widely In May 2007 a survey was conducted of 84 vendors to assess how widely
drop probability based on packet size has been implemented in RED drop probability based on packet size has been implemented in RED
Table 3. About 19% of those surveyed replied, giving a sample size Table 3. About 19% of those surveyed replied, giving a sample size
of 16. Although in most cases we do not have permission to identify of 16. Although in most cases we do not have permission to identify
skipping to change at page 37, line 50 skipping to change at page 37, line 50
Finally, we note one further complication. Strictly, packet- Finally, we note one further complication. Strictly, packet-
congestible resources are often cycle-congestible. For instance, for congestible resources are often cycle-congestible. For instance, for
routing look-ups load depends on the complexity of each look-up and routing look-ups load depends on the complexity of each look-up and
whether the pattern of arrivals is amenable to caching or not. This whether the pattern of arrivals is amenable to caching or not. This
also reminds us that any solution must not require a forwarding also reminds us that any solution must not require a forwarding
engine to use excessive processor cycles in order to decide how to engine to use excessive processor cycles in order to decide how to
say it has no spare processor cycles. say it has no spare processor cycles.
Appendix C. Byte-mode Drop Complicates Policing Congestion Response Appendix C. Byte-mode Drop Complicates Policing Congestion Response
This section is informative, not normative.
There are two main classes of approach to policing congestion There are two main classes of approach to policing congestion
response: i) policing at each bottleneck link or ii) policing at the response: i) policing at each bottleneck link or ii) policing at the
edges of networks. Packet-mode drop in RED is compatible with edges of networks. Packet-mode drop in RED is compatible with
either, while byte-mode drop precludes edge policing. either, while byte-mode drop precludes edge policing.
The simplicity of an edge policer relies on one dropped or marked The simplicity of an edge policer relies on one dropped or marked
packet being equivalent to another of the same size without having to packet being equivalent to another of the same size without having to
know which link the drop or mark occurred at. However, the byte-mode know which link the drop or mark occurred at. However, the byte-mode
drop algorithm has to depend on the local MTU of the line--it needs drop algorithm has to depend on the local MTU of the line--it needs
to use some concept of a 'normal' packet size. Therefore, one to use some concept of a 'normal' packet size. Therefore, one
skipping to change at page 39, line 5 skipping to change at page 39, line 5
across different size flows [Rate_fair_Dis]. across different size flows [Rate_fair_Dis].
Appendix D. Changes from Previous Versions Appendix D. Changes from Previous Versions
To be removed by the RFC Editor on publication. To be removed by the RFC Editor on publication.
Full incremental diffs between each version are available at Full incremental diffs between each version are available at
<http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-byte-pkt-congest/> <http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-byte-pkt-congest/>
(courtesy of the rfcdiff tool): (courtesy of the rfcdiff tool):
From -05 to -06:
* Primarily editorial fixes.
From -04 to -05: From -04 to -05:
* Changed from Informational to BCP and highlighted non-normative * Changed from Informational to BCP and highlighted non-normative
sections and appendices sections and appendices
* Removed language about consensus * Removed language about consensus
* Added "Example Comparing Packet-Mode Drop and Byte-Mode Drop" * Added "Example Comparing Packet-Mode Drop and Byte-Mode Drop"
* Arranged "Motivating Arguments" into a more logical order and * Arranged "Motivating Arguments" into a more logical order and
skipping to change at page 43, line 12 skipping to change at page 43, line 12
EMail: bob.briscoe@bt.com EMail: bob.briscoe@bt.com
URI: http://bobbriscoe.net/ URI: http://bobbriscoe.net/
Jukka Manner Jukka Manner
Aalto University Aalto University
Department of Communications and Networking (Comnet) Department of Communications and Networking (Comnet)
P.O. Box 13000 P.O. Box 13000
FIN-00076 Aalto FIN-00076 Aalto
Finland Finland
Phone: +358 9 470 22481 Phone: +358 9 470 22481
EMail: jukka.manner@tkk.fi EMail: jukka.manner@aalto.fi
URI: http://www.netlab.tkk.fi/~jmanner/ URI: http://www.netlab.tkk.fi/~jmanner/
 End of changes. 29 change blocks. 
64 lines changed or deleted 69 lines changed or added
