draft-ietf-quic-recovery-28.txt   draft-ietf-quic-recovery-29.txt 
QUIC J. Iyengar, Ed. QUIC J. Iyengar, Ed.
Internet-Draft Fastly Internet-Draft Fastly
Intended status: Standards Track I. Swett, Ed. Intended status: Standards Track I. Swett, Ed.
Expires: 21 November 2020 Google Expires: 12 December 2020 Google
20 May 2020 10 June 2020
QUIC Loss Detection and Congestion Control QUIC Loss Detection and Congestion Control
draft-ietf-quic-recovery-28 draft-ietf-quic-recovery-29
Abstract Abstract
This document describes loss detection and congestion control This document describes loss detection and congestion control
mechanisms for QUIC. mechanisms for QUIC.
Note to Readers Note to Readers
Discussion of this draft takes place on the QUIC working group Discussion of this draft takes place on the QUIC working group
mailing list (quic@ietf.org (mailto:quic@ietf.org)), which is mailing list (quic@ietf.org (mailto:quic@ietf.org)), which is
skipping to change at page 1, line 43 skipping to change at page 1, line 43
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 21 November 2020. This Internet-Draft will expire on 12 December 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Conventions and Definitions . . . . . . . . . . . . . . . . . 4 2. Conventions and Definitions . . . . . . . . . . . . . . . . . 4
3. Design of the QUIC Transmission Machinery . . . . . . . . . . 5 3. Design of the QUIC Transmission Machinery . . . . . . . . . . 5
3.1. Relevant Differences Between QUIC and TCP . . . . . . . . 5 4. Relevant Differences Between QUIC and TCP . . . . . . . . . . 5
3.1.1. Separate Packet Number Spaces . . . . . . . . . . . . 6 4.1. Separate Packet Number Spaces . . . . . . . . . . . . . . 6
3.1.2. Monotonically Increasing Packet Numbers . . . . . . . 6 4.2. Monotonically Increasing Packet Numbers . . . . . . . . . 6
3.1.3. Clearer Loss Epoch . . . . . . . . . . . . . . . . . 6 4.3. Clearer Loss Epoch . . . . . . . . . . . . . . . . . . . 6
3.1.4. No Reneging . . . . . . . . . . . . . . . . . . . . . 7 4.4. No Reneging . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.5. More ACK Ranges . . . . . . . . . . . . . . . . . . . 7 4.5. More ACK Ranges . . . . . . . . . . . . . . . . . . . . . 7
3.1.6. Explicit Correction For Delayed Acknowledgements . . 7 4.6. Explicit Correction For Delayed Acknowledgements . . . . 7
3.1.7. Probe Timeout Replaces RTO and TLP . . . . . . . . . 7 4.7. Probe Timeout Replaces RTO and TLP . . . . . . . . . . . 7
3.1.8. The Minimum Congestion Window is Two Packets . . . . 8 4.8. The Minimum Congestion Window is Two Packets . . . . . . 8
4. Estimating the Round-Trip Time . . . . . . . . . . . . . . . 8 5. Estimating the Round-Trip Time . . . . . . . . . . . . . . . 8
4.1. Generating RTT samples . . . . . . . . . . . . . . . . . 8 5.1. Generating RTT samples . . . . . . . . . . . . . . . . . 8
4.2. Estimating min_rtt . . . . . . . . . . . . . . . . . . . 9 5.2. Estimating min_rtt . . . . . . . . . . . . . . . . . . . 9
4.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 9 5.3. Estimating smoothed_rtt and rttvar . . . . . . . . . . . 9
5. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 11 6. Loss Detection . . . . . . . . . . . . . . . . . . . . . . . 11
5.1. Acknowledgement-based Detection . . . . . . . . . . . . . 11 6.1. Acknowledgement-based Detection . . . . . . . . . . . . . 11
5.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 11 6.1.1. Packet Threshold . . . . . . . . . . . . . . . . . . 11
5.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 12 6.1.2. Time Threshold . . . . . . . . . . . . . . . . . . . 12
5.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 13 6.2. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 13
5.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 13 6.2.1. Computing PTO . . . . . . . . . . . . . . . . . . . . 13
5.2.2. Handshakes and New Paths . . . . . . . . . . . . . . 14 6.2.2. Handshakes and New Paths . . . . . . . . . . . . . . 14
5.2.3. Speeding Up Handshake Completion . . . . . . . . . . 15 6.2.3. Speeding Up Handshake Completion . . . . . . . . . . 15
5.2.4. Sending Probe Packets . . . . . . . . . . . . . . . . 16 6.2.4. Sending Probe Packets . . . . . . . . . . . . . . . . 16
5.3. Handling Retry Packets . . . . . . . . . . . . . . . . . 17 6.3. Handling Retry Packets . . . . . . . . . . . . . . . . . 17
5.4. Discarding Keys and Packet State . . . . . . . . . . . . 17 6.4. Discarding Keys and Packet State . . . . . . . . . . . . 17
6. Congestion Control . . . . . . . . . . . . . . . . . . . . . 18 7. Congestion Control . . . . . . . . . . . . . . . . . . . . . 18
6.1. Explicit Congestion Notification . . . . . . . . . . . . 19 7.1. Explicit Congestion Notification . . . . . . . . . . . . 19
6.2. Initial and Minimum Congestion Window . . . . . . . . . . 19 7.2. Initial and Minimum Congestion Window . . . . . . . . . . 19
6.3. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 19 7.3. Slow Start . . . . . . . . . . . . . . . . . . . . . . . 19
6.4. Congestion Avoidance . . . . . . . . . . . . . . . . . . 20 7.4. Congestion Avoidance . . . . . . . . . . . . . . . . . . 20
6.5. Recovery Period . . . . . . . . . . . . . . . . . . . . . 20 7.5. Recovery Period . . . . . . . . . . . . . . . . . . . . . 20
6.6. Ignoring Loss of Undecryptable Packets . . . . . . . . . 20 7.6. Ignoring Loss of Undecryptable Packets . . . . . . . . . 20
6.7. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 21 7.7. Probe Timeout . . . . . . . . . . . . . . . . . . . . . . 21
6.8. Persistent Congestion . . . . . . . . . . . . . . . . . . 21 7.8. Persistent Congestion . . . . . . . . . . . . . . . . . . 21
6.9. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 22 7.9. Pacing . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.10. Under-utilizing the Congestion Window . . . . . . . . . . 23 7.10. Under-utilizing the Congestion Window . . . . . . . . . . 23
7. Security Considerations . . . . . . . . . . . . . . . . . . . 24 8. Security Considerations . . . . . . . . . . . . . . . . . . . 24
7.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 24 8.1. Congestion Signals . . . . . . . . . . . . . . . . . . . 24
7.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 24 8.2. Traffic Analysis . . . . . . . . . . . . . . . . . . . . 24
7.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 24 8.3. Misreporting ECN Markings . . . . . . . . . . . . . . . . 24
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 25 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 25
9.1. Normative References . . . . . . . . . . . . . . . . . . 25 10.1. Normative References . . . . . . . . . . . . . . . . . . 25
9.2. Informative References . . . . . . . . . . . . . . . . . 25 10.2. Informative References . . . . . . . . . . . . . . . . . 25
Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 27 Appendix A. Loss Recovery Pseudocode . . . . . . . . . . . . . . 27
A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 27 A.1. Tracking Sent Packets . . . . . . . . . . . . . . . . . . 27
A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 27 A.1.1. Sent Packet Fields . . . . . . . . . . . . . . . . . 27
A.2. Constants of Interest . . . . . . . . . . . . . . . . . . 28 A.2. Constants of Interest . . . . . . . . . . . . . . . . . . 28
A.3. Variables of interest . . . . . . . . . . . . . . . . . . 28 A.3. Variables of interest . . . . . . . . . . . . . . . . . . 28
A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 29 A.4. Initialization . . . . . . . . . . . . . . . . . . . . . 29
A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 30 A.5. On Sending a Packet . . . . . . . . . . . . . . . . . . . 29
A.6. On Receiving a Datagram . . . . . . . . . . . . . . . . . 30 A.6. On Receiving a Datagram . . . . . . . . . . . . . . . . . 30
A.7. On Receiving an Acknowledgment . . . . . . . . . . . . . 31 A.7. On Receiving an Acknowledgment . . . . . . . . . . . . . 30
A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 32 A.8. Setting the Loss Detection Timer . . . . . . . . . . . . 32
A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 34 A.9. On Timeout . . . . . . . . . . . . . . . . . . . . . . . 33
A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 35 A.10. Detecting Lost Packets . . . . . . . . . . . . . . . . . 34
Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 35 Appendix B. Congestion Control Pseudocode . . . . . . . . . . . 35
B.1. Constants of interest . . . . . . . . . . . . . . . . . . 36 B.1. Constants of interest . . . . . . . . . . . . . . . . . . 35
B.2. Variables of interest . . . . . . . . . . . . . . . . . . 36 B.2. Variables of interest . . . . . . . . . . . . . . . . . . 36
B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 37 B.3. Initialization . . . . . . . . . . . . . . . . . . . . . 36
B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 37 B.4. On Packet Sent . . . . . . . . . . . . . . . . . . . . . 37
B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 37 B.5. On Packet Acknowledgement . . . . . . . . . . . . . . . . 37
B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 38 B.6. On New Congestion Event . . . . . . . . . . . . . . . . . 37
B.7. Process ECN Information . . . . . . . . . . . . . . . . . 38 B.7. Process ECN Information . . . . . . . . . . . . . . . . . 38
B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 39 B.8. On Packets Lost . . . . . . . . . . . . . . . . . . . . . 38
B.9. Upon dropping Initial or Handshake keys . . . . . . . . . 39 B.9. Upon dropping Initial or Handshake keys . . . . . . . . . 39
Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 40 Appendix C. Change Log . . . . . . . . . . . . . . . . . . . . . 39
C.1. Since draft-ietf-quic-recovery-27 . . . . . . . . . . . . 40 C.1. Since draft-ietf-quic-recovery-28 . . . . . . . . . . . . 39
C.2. Since draft-ietf-quic-recovery-26 . . . . . . . . . . . . 40 C.2. Since draft-ietf-quic-recovery-27 . . . . . . . . . . . . 39
C.3. Since draft-ietf-quic-recovery-25 . . . . . . . . . . . . 41 C.3. Since draft-ietf-quic-recovery-26 . . . . . . . . . . . . 40
C.4. Since draft-ietf-quic-recovery-24 . . . . . . . . . . . . 41 C.4. Since draft-ietf-quic-recovery-25 . . . . . . . . . . . . 40
C.5. Since draft-ietf-quic-recovery-23 . . . . . . . . . . . . 41 C.5. Since draft-ietf-quic-recovery-24 . . . . . . . . . . . . 40
C.6. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 41 C.6. Since draft-ietf-quic-recovery-23 . . . . . . . . . . . . 40
C.7. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 41 C.7. Since draft-ietf-quic-recovery-22 . . . . . . . . . . . . 40
C.8. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 41 C.8. Since draft-ietf-quic-recovery-21 . . . . . . . . . . . . 40
C.9. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 41 C.9. Since draft-ietf-quic-recovery-20 . . . . . . . . . . . . 40
C.10. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 42 C.10. Since draft-ietf-quic-recovery-19 . . . . . . . . . . . . 41
C.11. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 42 C.11. Since draft-ietf-quic-recovery-18 . . . . . . . . . . . . 41
C.12. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 43 C.12. Since draft-ietf-quic-recovery-17 . . . . . . . . . . . . 42
C.13. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 44 C.13. Since draft-ietf-quic-recovery-16 . . . . . . . . . . . . 42
C.14. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 44 C.14. Since draft-ietf-quic-recovery-14 . . . . . . . . . . . . 43
C.15. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 44 C.15. Since draft-ietf-quic-recovery-13 . . . . . . . . . . . . 43
C.16. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 44 C.16. Since draft-ietf-quic-recovery-12 . . . . . . . . . . . . 43
C.17. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 44 C.17. Since draft-ietf-quic-recovery-11 . . . . . . . . . . . . 43
C.18. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 45 C.18. Since draft-ietf-quic-recovery-10 . . . . . . . . . . . . 43
C.19. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 45 C.19. Since draft-ietf-quic-recovery-09 . . . . . . . . . . . . 44
C.20. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 45 C.20. Since draft-ietf-quic-recovery-08 . . . . . . . . . . . . 44
C.21. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 45 C.21. Since draft-ietf-quic-recovery-07 . . . . . . . . . . . . 44
C.22. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 45 C.22. Since draft-ietf-quic-recovery-06 . . . . . . . . . . . . 44
C.23. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 45 C.23. Since draft-ietf-quic-recovery-05 . . . . . . . . . . . . 44
C.24. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 45 C.24. Since draft-ietf-quic-recovery-04 . . . . . . . . . . . . 44
C.25. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 45 C.25. Since draft-ietf-quic-recovery-03 . . . . . . . . . . . . 44
C.26. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 46 C.26. Since draft-ietf-quic-recovery-02 . . . . . . . . . . . . 44
C.27. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 46 C.27. Since draft-ietf-quic-recovery-01 . . . . . . . . . . . . 45
C.28. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 46 C.28. Since draft-ietf-quic-recovery-00 . . . . . . . . . . . . 45
Appendix D. Contributors . . . . . . . . . . . . . . . . . . . . 46 C.29. Since draft-iyengar-quic-loss-recovery-01 . . . . . . . . 45
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 46 Appendix D. Contributors . . . . . . . . . . . . . . . . . . . . 45
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 46 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 45
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45
1. Introduction 1. Introduction
QUIC is a new multiplexed and secure transport protocol atop UDP, QUIC is a new multiplexed and secure transport protocol atop UDP,
specified in [QUIC-TRANSPORT]. This document describes congestion specified in [QUIC-TRANSPORT]. This document describes congestion
control and loss recovery for QUIC. Mechanisms described in this control and loss recovery for QUIC. Mechanisms described in this
document follow the spirit of existing TCP congestion control and document follow the spirit of existing TCP congestion control and
loss recovery mechanisms, described in RFCs, various Internet-drafts, loss recovery mechanisms, described in RFCs, various Internet-drafts,
or academic papers, and also those prevalent in TCP implementations. or academic papers, and also those prevalent in TCP implementations.
skipping to change at page 5, line 41 skipping to change at page 5, line 41
performance of the QUIC handshake and use shorter timers for performance of the QUIC handshake and use shorter timers for
acknowledgement. acknowledgement.
* Packets containing frames besides ACK or CONNECTION_CLOSE frames * Packets containing frames besides ACK or CONNECTION_CLOSE frames
count toward congestion control limits and are considered in- count toward congestion control limits and are considered in-
flight. flight.
* PADDING frames cause packets to contribute toward bytes in flight * PADDING frames cause packets to contribute toward bytes in flight
without directly causing an acknowledgment to be sent. without directly causing an acknowledgment to be sent.
3.1. Relevant Differences Between QUIC and TCP 4. Relevant Differences Between QUIC and TCP
Readers familiar with TCP's loss detection and congestion control Readers familiar with TCP's loss detection and congestion control
will find algorithms here that parallel well-known TCP ones. will find algorithms here that parallel well-known TCP ones.
Protocol differences between QUIC and TCP however contribute to Protocol differences between QUIC and TCP however contribute to
algorithmic differences. We briefly describe these protocol algorithmic differences. We briefly describe these protocol
differences below. differences below.
3.1.1. Separate Packet Number Spaces 4.1. Separate Packet Number Spaces
QUIC uses separate packet number spaces for each encryption level, QUIC uses separate packet number spaces for each encryption level,
except 0-RTT and all generations of 1-RTT keys use the same packet except 0-RTT and all generations of 1-RTT keys use the same packet
number space. Separate packet number spaces ensure acknowledgement number space. Separate packet number spaces ensure acknowledgement
of packets sent with one level of encryption will not cause spurious of packets sent with one level of encryption will not cause spurious
retransmission of packets sent with a different encryption level. retransmission of packets sent with a different encryption level.
Congestion control and round-trip time (RTT) measurement are unified Congestion control and round-trip time (RTT) measurement are unified
across packet number spaces. across packet number spaces.
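
The mapping from encryption level to packet number space described above can be sketched as follows. This is a minimal illustration, not the draft's own pseudocode: the enum and function names (EncryptionLevel, PacketNumberSpace, space_for) are assumptions introduced here.

```python
from enum import Enum


class EncryptionLevel(Enum):
    INITIAL = "initial"
    ZERO_RTT = "0-rtt"
    HANDSHAKE = "handshake"
    ONE_RTT = "1-rtt"


class PacketNumberSpace(Enum):
    INITIAL = "initial"
    HANDSHAKE = "handshake"
    APPLICATION_DATA = "application_data"


def space_for(level: EncryptionLevel) -> PacketNumberSpace:
    """Return the packet number space used by packets at this encryption level."""
    if level is EncryptionLevel.INITIAL:
        return PacketNumberSpace.INITIAL
    if level is EncryptionLevel.HANDSHAKE:
        return PacketNumberSpace.HANDSHAKE
    # 0-RTT and every generation of 1-RTT keys share one space.
    return PacketNumberSpace.APPLICATION_DATA
```

Loss detection state would be kept per PacketNumberSpace in such a design, while the congestion controller and RTT estimator would be single shared instances.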
3.1.2. Monotonically Increasing Packet Numbers 4.2. Monotonically Increasing Packet Numbers
TCP conflates transmission order at the sender with delivery order at TCP conflates transmission order at the sender with delivery order at
the receiver, which results in retransmissions of the same data the receiver, which results in retransmissions of the same data
carrying the same sequence number, and consequently leads to carrying the same sequence number, and consequently leads to
"retransmission ambiguity". QUIC separates the two. QUIC uses a "retransmission ambiguity". QUIC separates the two. QUIC uses a
packet number to indicate transmission order. Application data is packet number to indicate transmission order. Application data is
sent in one or more streams and delivery order is determined by sent in one or more streams and delivery order is determined by
stream offsets encoded within STREAM frames. stream offsets encoded within STREAM frames.
QUIC's packet number is strictly increasing within a packet number QUIC's packet number is strictly increasing within a packet number
skipping to change at page 6, line 41 skipping to change at page 6, line 41
ambiguity about which packet is acknowledged when an ACK is received. ambiguity about which packet is acknowledged when an ACK is received.
Consequently, more accurate RTT measurements can be made, spurious Consequently, more accurate RTT measurements can be made, spurious
retransmissions are trivially detected, and mechanisms such as Fast retransmissions are trivially detected, and mechanisms such as Fast
Retransmit can be applied universally, based only on packet number. Retransmit can be applied universally, based only on packet number.
This design point significantly simplifies loss detection mechanisms This design point significantly simplifies loss detection mechanisms
for QUIC. Most TCP mechanisms implicitly attempt to infer for QUIC. Most TCP mechanisms implicitly attempt to infer
transmission ordering based on TCP sequence numbers - a non-trivial transmission ordering based on TCP sequence numbers - a non-trivial
task, especially when TCP timestamps are not available. task, especially when TCP timestamps are not available.
3.1.3. Clearer Loss Epoch 4.3. Clearer Loss Epoch
QUIC starts a loss epoch when a packet is lost and ends one when any QUIC starts a loss epoch when a packet is lost and ends one when any
packet sent after the epoch starts is acknowledged. TCP waits for packet sent after the epoch starts is acknowledged. TCP waits for
the gap in the sequence number space to be filled, and so if a the gap in the sequence number space to be filled, and so if a
segment is lost multiple times in a row, the loss epoch may not end segment is lost multiple times in a row, the loss epoch may not end
for several round trips. Because both should reduce their congestion for several round trips. Because both should reduce their congestion
windows only once per epoch, QUIC will do it once for every round windows only once per epoch, QUIC will do it once for every round
trip that experiences loss, while TCP may only do it once across trip that experiences loss, while TCP may only do it once across
multiple round trips. multiple round trips.
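
One way to picture the loss epoch is the small state machine below. It is a sketch under stated assumptions: the class name, method names, and the use of plain floating-point timestamps are all hypothetical, and a real implementation would tie this to its congestion controller.

```python
class LossEpoch:
    """Tracks a single loss epoch keyed on packet send times (in seconds)."""

    def __init__(self):
        self.epoch_start_time = None  # start of the most recent epoch
        self.in_epoch = False

    def on_packet_lost(self, lost_sent_time, now):
        """Return True if this loss starts a new epoch.

        A True result means the congestion window should be reduced once.
        """
        if self.epoch_start_time is not None and lost_sent_time <= self.epoch_start_time:
            # The packet was sent before the current epoch began, so its
            # loss is already accounted for by that epoch's reduction.
            return False
        self.epoch_start_time = now
        self.in_epoch = True
        return True

    def on_packet_acked(self, acked_sent_time):
        """End the epoch once a packet sent after its start is acknowledged."""
        if self.in_epoch and acked_sent_time > self.epoch_start_time:
            self.in_epoch = False
```

Because the test is based on send time rather than on a sequence-number gap being filled, repeated loss of the same data in later round trips starts a fresh epoch each time.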
3.1.4. No Reneging 4.4. No Reneging
QUIC ACKs contain information that is similar to TCP SACK, but QUIC QUIC ACKs contain information that is similar to TCP SACK, but QUIC
does not allow any acked packet to be reneged, greatly simplifying does not allow any acked packet to be reneged, greatly simplifying
implementations on both sides and reducing memory pressure on the implementations on both sides and reducing memory pressure on the
sender. sender.
3.1.5. More ACK Ranges 4.5. More ACK Ranges
QUIC supports many ACK ranges, as opposed to TCP's 3 SACK ranges. In QUIC supports many ACK ranges, as opposed to TCP's 3 SACK ranges. In
high loss environments, this speeds recovery, reduces spurious high loss environments, this speeds recovery, reduces spurious
retransmits, and ensures forward progress without relying on retransmits, and ensures forward progress without relying on
timeouts. timeouts.
3.1.6. Explicit Correction For Delayed Acknowledgements 4.6. Explicit Correction For Delayed Acknowledgements
QUIC endpoints measure the delay incurred between when a packet is QUIC endpoints measure the delay incurred between when a packet is
received and when the corresponding acknowledgment is sent, allowing received and when the corresponding acknowledgment is sent, allowing
a peer to maintain a more accurate round-trip time estimate; see a peer to maintain a more accurate round-trip time estimate; see
Section 13.2 of [QUIC-TRANSPORT]. Section 13.2 of [QUIC-TRANSPORT].
3.1.7. Probe Timeout Replaces RTO and TLP 4.7. Probe Timeout Replaces RTO and TLP
QUIC uses a probe timeout (see Section 5.2), with a timer based on QUIC uses a probe timeout (see Section 6.2), with a timer based on
TCP's RTO computation. QUIC's PTO includes the peer's maximum TCP's RTO computation. QUIC's PTO includes the peer's maximum
expected acknowledgement delay instead of using a fixed minimum expected acknowledgement delay instead of using a fixed minimum
timeout. QUIC does not collapse the congestion window until timeout. QUIC does not collapse the congestion window until
persistent congestion (Section 6.8) is declared, unlike TCP, which persistent congestion (Section 7.8) is declared, unlike TCP, which
collapses the congestion window upon expiry of an RTO. Instead of collapses the congestion window upon expiry of an RTO. Instead of
collapsing the congestion window and declaring everything in-flight collapsing the congestion window and declaring everything in-flight
lost, QUIC allows probe packets to temporarily exceed the congestion lost, QUIC allows probe packets to temporarily exceed the congestion
window whenever the timer expires. window whenever the timer expires.
In doing this, QUIC avoids unnecessary congestion window reductions, In doing this, QUIC avoids unnecessary congestion window reductions,
obviating the need for correcting mechanisms such as F-RTO [RFC5682]. obviating the need for correcting mechanisms such as F-RTO [RFC5682].
Since QUIC does not collapse the congestion window on a PTO Since QUIC does not collapse the congestion window on a PTO
expiration, a QUIC sender is not limited from sending more in-flight expiration, a QUIC sender is not limited from sending more in-flight
packets after a PTO expiration if it still has available congestion packets after a PTO expiration if it still has available congestion
window. This occurs when a sender is application-limited and the PTO window. This occurs when a sender is application-limited and the PTO
timer expires. This is more aggressive than TCP's RTO mechanism when timer expires. This is more aggressive than TCP's RTO mechanism when
application-limited, but identical when not application-limited. application-limited, but identical when not application-limited.
A single packet loss at the tail does not indicate persistent A single packet loss at the tail does not indicate persistent
congestion, so QUIC specifies a time-based definition to ensure one congestion, so QUIC specifies a time-based definition to ensure one
or more packets are sent prior to a dramatic decrease in congestion or more packets are sent prior to a dramatic decrease in congestion
window; see Section 6.8. window; see Section 7.8.
3.1.8. The Minimum Congestion Window is Two Packets 4.8. The Minimum Congestion Window is Two Packets
TCP uses a minimum congestion window of one packet. However, loss of TCP uses a minimum congestion window of one packet. However, loss of
that single packet means that the sender needs to wait for a PTO that single packet means that the sender needs to wait for a PTO
(Section 5.2) to recover, which can be much longer than a round-trip (Section 6.2) to recover, which can be much longer than a round-trip
time. Sending a single ack-eliciting packet also increases the time. Sending a single ack-eliciting packet also increases the
chances of incurring additional latency when a receiver delays its chances of incurring additional latency when a receiver delays its
acknowledgement. acknowledgement.
QUIC therefore recommends that the minimum congestion window be two QUIC therefore recommends that the minimum congestion window be two
packets. While this increases network load, it is considered safe, packets. While this increases network load, it is considered safe,
since the sender will still reduce its sending rate exponentially since the sender will still reduce its sending rate exponentially
under persistent congestion (Section 5.2). under persistent congestion (Section 6.2).
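
The two-packet floor can be illustrated with the sketch below. The datagram size of 1200 bytes, the halving factor, and the names MAX_DATAGRAM_SIZE, MINIMUM_WINDOW and reduce_window are assumptions chosen for the example, not values taken from the text above.

```python
MAX_DATAGRAM_SIZE = 1200                # bytes; assumed for illustration
MINIMUM_WINDOW = 2 * MAX_DATAGRAM_SIZE  # two full-sized packets


def reduce_window(cwnd: int, factor: float = 0.5) -> int:
    """Shrink the congestion window (bytes) without going below two packets."""
    return max(int(cwnd * factor), MINIMUM_WINDOW)
```

With these assumed values, a window of 3000 bytes would be clamped to 2400 bytes rather than halved to 1500.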
4. Estimating the Round-Trip Time 5. Estimating the Round-Trip Time
At a high level, an endpoint measures the time from when a packet was At a high level, an endpoint measures the time from when a packet was
sent to when it is acknowledged as a round-trip time (RTT) sample. sent to when it is acknowledged as a round-trip time (RTT) sample.
The endpoint uses RTT samples and peer-reported host delays (see The endpoint uses RTT samples and peer-reported host delays (see
Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical
description of the network path's RTT. An endpoint computes the description of the network path's RTT. An endpoint computes the
following three values for each path: the minimum value observed over following three values for each path: the minimum value observed over
the lifetime of the path (min_rtt), an exponentially-weighted moving the lifetime of the path (min_rtt), an exponentially-weighted moving
average (smoothed_rtt), and the mean deviation (referred to as average (smoothed_rtt), and the mean deviation (referred to as
"variation" in the rest of this document) in the observed RTT samples "variation" in the rest of this document) in the observed RTT samples
(rttvar). (rttvar).
4.1. Generating RTT samples 5.1. Generating RTT samples
An endpoint generates an RTT sample on receiving an ACK frame that An endpoint generates an RTT sample on receiving an ACK frame that
meets the following two conditions: meets the following two conditions:
* the largest acknowledged packet number is newly acknowledged, and * the largest acknowledged packet number is newly acknowledged, and
* at least one of the newly acknowledged packets was ack-eliciting. * at least one of the newly acknowledged packets was ack-eliciting.
The RTT sample, latest_rtt, is generated as the time elapsed since The RTT sample, latest_rtt, is generated as the time elapsed since
the largest acknowledged packet was sent: the largest acknowledged packet was sent:
latest_rtt = ack_time - send_time_of_largest_acked latest_rtt = ack_time - send_time_of_largest_acked
An RTT sample is generated using only the largest acknowledged packet An RTT sample is generated using only the largest acknowledged packet
in the received ACK frame. This is because a peer reports ACK delays in the received ACK frame. This is because a peer reports ACK delays
for only the largest acknowledged packet in an ACK frame. While the for only the largest acknowledged packet in an ACK frame. While the
reported ACK delay is not used by the RTT sample measurement, it is reported ACK delay is not used by the RTT sample measurement, it is
used to adjust the RTT sample in subsequent computations of used to adjust the RTT sample in subsequent computations of
smoothed_rtt and rttvar (Section 4.3). smoothed_rtt and rttvar (Section 5.3).
To avoid generating multiple RTT samples for a single packet, an ACK To avoid generating multiple RTT samples for a single packet, an ACK
frame SHOULD NOT be used to update RTT estimates if it does not newly frame SHOULD NOT be used to update RTT estimates if it does not newly
acknowledge the largest acknowledged packet. acknowledge the largest acknowledged packet.
An RTT sample MUST NOT be generated on receiving an ACK frame that An RTT sample MUST NOT be generated on receiving an ACK frame that
does not newly acknowledge at least one ack-eliciting packet. A peer does not newly acknowledge at least one ack-eliciting packet. A peer
usually does not send an ACK frame when only non-ack-eliciting usually does not send an ACK frame when only non-ack-eliciting
packets are received. Therefore an ACK frame that contains packets are received. Therefore an ACK frame that contains
acknowledgements for only non-ack-eliciting packets could include an acknowledgements for only non-ack-eliciting packets could include an
arbitrarily large Ack Delay value. Ignoring such ACK frames avoids arbitrarily large Ack Delay value. Ignoring such ACK frames avoids
complications in subsequent smoothed_rtt and rttvar computations. complications in subsequent smoothed_rtt and rttvar computations.
A sender might generate multiple RTT samples per RTT when multiple A sender might generate multiple RTT samples per RTT when multiple
ACK frames are received within an RTT. As suggested in [RFC6298], ACK frames are received within an RTT. As suggested in [RFC6298],
doing so might result in inadequate history in smoothed_rtt and doing so might result in inadequate history in smoothed_rtt and
rttvar. Ensuring that RTT estimates retain sufficient history is an rttvar. Ensuring that RTT estimates retain sufficient history is an
open research question. open research question.
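The two conditions for generating an RTT sample can be sketched as follows. This is an illustrative sketch rather than the draft's pseudocode: the function name and its boolean parameters are hypothetical, and times are assumed to be in milliseconds.

```python
from typing import Optional


def rtt_sample(ack_time_ms: float,
               send_time_of_largest_acked_ms: float,
               largest_newly_acked: bool,
               newly_acked_ack_eliciting: bool) -> Optional[float]:
    """Return latest_rtt in milliseconds, or None if no sample is generated.

    The two boolean parameters are hypothetical inputs that a caller would
    derive while processing the ACK frame.
    """
    # Condition 1: the largest acknowledged packet number is newly acknowledged.
    # Condition 2: at least one newly acknowledged packet was ack-eliciting.
    if not (largest_newly_acked and newly_acked_ack_eliciting):
        return None
    # latest_rtt = ack_time - send_time_of_largest_acked
    return ack_time_ms - send_time_of_largest_acked_ms
```

An ACK frame that fails either condition yields None, so the caller leaves min_rtt, smoothed_rtt, and rttvar unchanged.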
4.2. Estimating min_rtt 5.2. Estimating min_rtt
min_rtt is the minimum RTT observed for a given network path. min_rtt is the minimum RTT observed for a given network path.
min_rtt is set to the latest_rtt on the first RTT sample, and to the min_rtt is set to the latest_rtt on the first RTT sample, and to the
lesser of min_rtt and latest_rtt on subsequent samples. In this lesser of min_rtt and latest_rtt on subsequent samples. In this
document, min_rtt is used by loss detection to reject implausibly document, min_rtt is used by loss detection to reject implausibly
small RTT samples. small RTT samples.
An endpoint uses only locally observed times in computing the min_rtt An endpoint uses only locally observed times in computing the min_rtt
and does not adjust for ACK delays reported by the peer. Doing so and does not adjust for ACK delays reported by the peer. Doing so
allows the endpoint to set a lower bound for the smoothed_rtt based allows the endpoint to set a lower bound for the smoothed_rtt based
entirely on what it observes (see Section 4.3), and limits potential entirely on what it observes (see Section 5.3), and limits potential
underestimation due to erroneously-reported delays by the peer. underestimation due to erroneously-reported delays by the peer.
The RTT for a network path may change over time. If a path's actual The RTT for a network path may change over time. If a path's actual
RTT decreases, the min_rtt will adapt immediately on the first low RTT decreases, the min_rtt will adapt immediately on the first low
sample. If the path's actual RTT increases, the min_rtt will not sample. If the path's actual RTT increases, the min_rtt will not
adapt to it, allowing future RTT samples that are smaller than the adapt to it, allowing future RTT samples that are smaller than the
new RTT to be included in smoothed_rtt. new RTT to be included in smoothed_rtt.
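A minimal sketch of the min_rtt update described above, assuming RTT samples in milliseconds and using None to mean that no sample has been taken yet (both are assumptions made for illustration):

```python
from typing import Optional


def update_min_rtt(min_rtt_ms: Optional[float], latest_rtt_ms: float) -> float:
    """Track the minimum RTT using only locally observed samples."""
    if min_rtt_ms is None:
        # First RTT sample: min_rtt is set to latest_rtt.
        return latest_rtt_ms
    # Subsequent samples: keep the lesser of min_rtt and latest_rtt.
    # No ACK delay adjustment is applied here.
    return min(min_rtt_ms, latest_rtt_ms)


# Hypothetical sample sequence: the path RTT drops, then rises.
min_rtt = None
history = []
for sample in (50.0, 40.0, 60.0):
    min_rtt = update_min_rtt(min_rtt, sample)
    history.append(min_rtt)
```

With these samples the tracked value falls from 50 to 40 on the second sample and stays at 40 on the third, matching the text: min_rtt follows decreases immediately and does not follow increases.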
4.3. Estimating smoothed_rtt and rttvar 5.3. Estimating smoothed_rtt and rttvar
smoothed_rtt is an exponentially-weighted moving average of an smoothed_rtt is an exponentially-weighted moving average of an
endpoint's RTT samples, and rttvar is the variation in the RTT endpoint's RTT samples, and rttvar is the variation in the RTT
samples, estimated using a mean variation. samples, estimated using a mean variation.
The calculation of smoothed_rtt uses path latency after adjusting RTT The calculation of smoothed_rtt uses path latency after adjusting RTT
samples for acknowledgement delays. These delays are computed using samples for acknowledgement delays. These delays are computed using
the ACK Delay field of the ACK frame as described in Section 19.3 of the ACK Delay field of the ACK frame as described in Section 19.3 of
[QUIC-TRANSPORT]. For packets sent in the ApplicationData packet [QUIC-TRANSPORT]. For packets sent in the ApplicationData packet
number space, a peer limits any delay in sending an acknowledgement number space, a peer limits any delay in sending an acknowledgement
skipping to change at page 11, line 5 skipping to change at page 11, line 5
On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows: On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:
ack_delay = min(Ack Delay in ACK Frame, max_ack_delay) ack_delay = min(Ack Delay in ACK Frame, max_ack_delay)
adjusted_rtt = latest_rtt adjusted_rtt = latest_rtt
if (min_rtt + ack_delay < latest_rtt): if (min_rtt + ack_delay < latest_rtt):
adjusted_rtt = latest_rtt - ack_delay adjusted_rtt = latest_rtt - ack_delay
smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
rttvar_sample = abs(smoothed_rtt - adjusted_rtt) rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
rttvar = 3/4 * rttvar + 1/4 * rttvar_sample rttvar = 3/4 * rttvar + 1/4 * rttvar_sample
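The update for subsequent RTT samples can be written as a small function. This sketch covers only the steps listed above and takes the current estimates as inputs; the function and parameter names are hypothetical, and all values are assumed to be in milliseconds.

```python
from typing import Tuple


def update_rtt(smoothed_rtt: float, rttvar: float, min_rtt: float,
               latest_rtt: float, ack_delay_field: float,
               max_ack_delay: float) -> Tuple[float, float]:
    """Return the new (smoothed_rtt, rttvar) after one subsequent sample."""
    # Cap the peer-reported delay at max_ack_delay.
    ack_delay = min(ack_delay_field, max_ack_delay)
    # Subtract the delay only if the result stays above min_rtt.
    adjusted_rtt = latest_rtt
    if min_rtt + ack_delay < latest_rtt:
        adjusted_rtt = latest_rtt - ack_delay
    # Same order as the steps above: smoothed_rtt is updated first, and
    # rttvar_sample is computed against the updated value.
    smoothed_rtt = 7 / 8 * smoothed_rtt + 1 / 8 * adjusted_rtt
    rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
    rttvar = 3 / 4 * rttvar + 1 / 4 * rttvar_sample
    return smoothed_rtt, rttvar
```

As a worked case with made-up numbers: smoothed_rtt = 100, rttvar = 20, min_rtt = 80, latest_rtt = 120, a reported Ack Delay of 30 and max_ack_delay = 25 give ack_delay = 25 and adjusted_rtt = 95, so smoothed_rtt becomes 99.375 and rttvar becomes 16.09375.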
5. Loss Detection 6. Loss Detection
QUIC senders use acknowledgements to detect lost packets, and a probe QUIC senders use acknowledgements to detect lost packets, and a probe
time out (see Section 5.2) to ensure acknowledgements are received. time out (see Section 6.2) to ensure acknowledgements are received.
This section provides a description of these algorithms. This section provides a description of these algorithms.
If a packet is lost, the QUIC transport needs to recover from that If a packet is lost, the QUIC transport needs to recover from that
loss, such as by retransmitting the data, sending an updated frame, loss, such as by retransmitting the data, sending an updated frame,
or abandoning the frame. For more information, see Section 13.3 of or abandoning the frame. For more information, see Section 13.3 of
[QUIC-TRANSPORT]. [QUIC-TRANSPORT].
5.1. Acknowledgement-based Detection 6.1. Acknowledgement-based Detection
Acknowledgement-based loss detection implements the spirit of TCP's Acknowledgement-based loss detection implements the spirit of TCP's
Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK], Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK],
SACK loss recovery [RFC6675], and RACK [RACK]. This section provides SACK loss recovery [RFC6675], and RACK [RACK]. This section provides
an overview of how these algorithms are implemented in QUIC. an overview of how these algorithms are implemented in QUIC.
A packet is declared lost if it meets all the following conditions: A packet is declared lost if it meets all the following conditions:
* The packet is unacknowledged, in-flight, and was sent prior to an * The packet is unacknowledged, in-flight, and was sent prior to an
acknowledged packet. acknowledged packet.
* Either its packet number is kPacketThreshold smaller than an * Either its packet number is kPacketThreshold smaller than an
acknowledged packet (Section 5.1.1), or it was sent long enough in acknowledged packet (Section 6.1.1), or it was sent long enough in
the past (Section 5.1.2). the past (Section 6.1.2).
The acknowledgement indicates that a packet sent later was delivered, The acknowledgement indicates that a packet sent later was delivered,
and the packet and time thresholds provide some tolerance for packet and the packet and time thresholds provide some tolerance for packet
reordering. reordering.
Spuriously declaring packets as lost leads to unnecessary Spuriously declaring packets as lost leads to unnecessary
retransmissions and may result in degraded performance due to the retransmissions and may result in degraded performance due to the
actions of the congestion controller upon detecting loss. actions of the congestion controller upon detecting loss.
Implementations can detect spurious retransmissions and increase the Implementations can detect spurious retransmissions and increase the
reordering threshold in packets or time to reduce future spurious reordering threshold in packets or time to reduce future spurious
retransmissions and loss events. Implementations with adaptive time retransmissions and loss events. Implementations with adaptive time
thresholds MAY choose to start with smaller initial reordering thresholds MAY choose to start with smaller initial reordering
thresholds to minimize recovery latency. thresholds to minimize recovery latency.
5.1.1. Packet Threshold 6.1.1. Packet Threshold
The RECOMMENDED initial value for the packet reordering threshold The RECOMMENDED initial value for the packet reordering threshold
(kPacketThreshold) is 3, based on best practices for TCP loss (kPacketThreshold) is 3, based on best practices for TCP loss
detection [RFC5681] [RFC6675]. Implementations SHOULD NOT use a detection [RFC5681] [RFC6675]. Implementations SHOULD NOT use a
packet threshold less than 3, to keep in line with TCP [RFC5681]. packet threshold less than 3, to keep in line with TCP [RFC5681].
Some networks may exhibit higher degrees of reordering, causing a Some networks may exhibit higher degrees of reordering, causing a
sender to detect spurious losses. Algorithms that increase the sender to detect spurious losses. Algorithms that increase the
reordering threshold after spuriously detecting losses, such as reordering threshold after spuriously detecting losses, such as
TCP-NCR [RFC4653], have proven to be useful in TCP and are expected TCP-NCR [RFC4653], have proven to be useful in TCP and are expected
to be at least as useful in QUIC. Re-ordering could be more common to be at least as useful in QUIC. Re-ordering could be more common
with QUIC than TCP, because network elements cannot observe and fix with QUIC than TCP, because network elements cannot observe and fix
the order of out-of-order packets. the order of out-of-order packets.
5.1.2. Time Threshold 6.1.2. Time Threshold
Once a later packet within the same packet number space has been Once a later packet within the same packet number space has been
acknowledged, an endpoint SHOULD declare an earlier packet lost if it acknowledged, an endpoint SHOULD declare an earlier packet lost if it
was sent a threshold amount of time in the past. To avoid declaring was sent a threshold amount of time in the past. To avoid declaring
packets as lost too early, this time threshold MUST be set to at packets as lost too early, this time threshold MUST be set to at
least the local timer granularity, as indicated by the kGranularity least the local timer granularity, as indicated by the kGranularity
constant. The time threshold is: constant. The time threshold is:
max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity) max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)
skipping to change at page 13, line 5 skipping to change at page 13, line 5
The RECOMMENDED time threshold (kTimeThreshold), expressed as a The RECOMMENDED time threshold (kTimeThreshold), expressed as a
round-trip time multiplier, is 9/8. The RECOMMENDED value of the round-trip time multiplier, is 9/8. The RECOMMENDED value of the
timer granularity (kGranularity) is 1ms. timer granularity (kGranularity) is 1ms.
Implementations MAY experiment with absolute thresholds, thresholds Implementations MAY experiment with absolute thresholds, thresholds
from previous connections, adaptive thresholds, or including RTT from previous connections, adaptive thresholds, or including RTT
variation. Smaller thresholds reduce reordering resilience and variation. Smaller thresholds reduce reordering resilience and
increase spurious retransmissions, and larger thresholds increase increase spurious retransmissions, and larger thresholds increase
loss detection delay. loss detection delay.
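The packet and time thresholds can be combined into one loss check, sketched below for a single packet number space. This is an illustration, not the draft's pseudocode: the SentPacket type and the function names are hypothetical, times are in milliseconds, and "sent prior to an acknowledged packet" is approximated by comparing against the largest acknowledged packet number.

```python
from dataclasses import dataclass
from typing import List

K_PACKET_THRESHOLD = 3      # RECOMMENDED initial packet threshold
K_TIME_THRESHOLD = 9 / 8    # RECOMMENDED RTT multiplier
K_GRANULARITY_MS = 1.0      # RECOMMENDED timer granularity (1ms)


@dataclass
class SentPacket:
    packet_number: int
    time_sent_ms: float
    in_flight: bool = True


def loss_delay(smoothed_rtt_ms: float, latest_rtt_ms: float) -> float:
    """max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)."""
    return max(K_TIME_THRESHOLD * max(smoothed_rtt_ms, latest_rtt_ms),
               K_GRANULARITY_MS)


def detect_lost(unacked: List[SentPacket], largest_acked: int, now_ms: float,
                smoothed_rtt_ms: float, latest_rtt_ms: float) -> List[int]:
    """Return packet numbers of unacknowledged packets declared lost."""
    cutoff = now_ms - loss_delay(smoothed_rtt_ms, latest_rtt_ms)
    lost = []
    for pkt in unacked:
        # Only in-flight packets sent before an acknowledged packet qualify.
        if not pkt.in_flight or pkt.packet_number >= largest_acked:
            continue
        by_packets = pkt.packet_number <= largest_acked - K_PACKET_THRESHOLD
        by_time = pkt.time_sent_ms <= cutoff
        if by_packets or by_time:
            lost.append(pkt.packet_number)
    return lost
```

For example, with smoothed_rtt and latest_rtt both at 100 ms the time threshold is 112.5 ms. If packet 8 is acknowledged at time 150 ms, an outstanding packet 1 is lost by the packet threshold, an outstanding packet 6 sent at 30 ms is lost by the time threshold, and packet 7 sent at 100 ms is not yet lost.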
5.2. Probe Timeout 6.2. Probe Timeout
A Probe Timeout (PTO) triggers sending one or two probe datagrams A Probe Timeout (PTO) triggers sending one or two probe datagrams
when ack-eliciting packets are not acknowledged within the expected when ack-eliciting packets are not acknowledged within the expected
period of time or the server may not have validated the client's period of time or the server may not have validated the client's
address. A PTO enables a connection to recover from loss of tail address. A PTO enables a connection to recover from loss of tail
packets or acknowledgements. packets or acknowledgements.
A PTO timer expiration event does not indicate packet loss and MUST A PTO timer expiration event does not indicate packet loss and MUST
NOT cause prior unacknowledged packets to be marked as lost. When an NOT cause prior unacknowledged packets to be marked as lost. When an
acknowledgement is received that newly acknowledges packets, loss acknowledgement is received that newly acknowledges packets, loss
detection proceeds as dictated by packet and time threshold detection proceeds as dictated by packet and time threshold
mechanisms; see Section 5.1. mechanisms; see Section 6.1.
As with loss detection, the probe timeout is per packet number space. As with loss detection, the probe timeout is per packet number space.
The PTO algorithm used in QUIC implements the reliability functions The PTO algorithm used in QUIC implements the reliability functions
of Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for of Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for
TCP [RFC5682]. The timeout computation is based on TCP's TCP [RFC5682]. The timeout computation is based on TCP's
retransmission timeout period [RFC6298]. retransmission timeout period [RFC6298].
5.2.1. Computing PTO 6.2.1. Computing PTO
When an ack-eliciting packet is transmitted, the sender schedules a When an ack-eliciting packet is transmitted, the sender schedules a
timer for the PTO period as follows: timer for the PTO period as follows:
PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay
The PTO period is the amount of time that a sender ought to wait for The PTO period is the amount of time that a sender ought to wait for
an acknowledgement of a sent packet. This time period includes the an acknowledgement of a sent packet. This time period includes the
estimated network round-trip time (smoothed_rtt), the variation in the estimated network round-trip time (smoothed_rtt), the variation in the
estimate (4*rttvar), and max_ack_delay, to account for the maximum estimate (4*rttvar), and max_ack_delay, to account for the maximum
skipping to change at page 14, line 34 skipping to change at page 14, line 34
acknowledgements due to severe congestion. Even when there are ack- acknowledgements due to severe congestion. Even when there are ack-
eliciting packets in-flight in multiple packet number spaces, the eliciting packets in-flight in multiple packet number spaces, the
exponential increase in probe timeout occurs across all spaces to exponential increase in probe timeout occurs across all spaces to
prevent excess load on the network. For example, a timeout in the prevent excess load on the network. For example, a timeout in the
Initial packet number space doubles the length of the timeout in the Initial packet number space doubles the length of the timeout in the
Handshake packet number space. Handshake packet number space.
The life of a connection that is experiencing consecutive PTOs is The life of a connection that is experiencing consecutive PTOs is
limited by the endpoint's idle timeout. limited by the endpoint's idle timeout.
The probe timer MUST NOT be set if the time threshold (Section 5.1.2) The probe timer MUST NOT be set if the time threshold (Section 6.1.2)
loss detection timer is set. The time threshold loss detection timer loss detection timer is set. The time threshold loss detection timer
is expected to both expire earlier than the PTO and be less likely to is expected to both expire earlier than the PTO and be less likely to
spuriously retransmit data. spuriously retransmit data.
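A sketch of the PTO period computation, including the doubling on consecutive timeouts described above. The function name and the pto_count parameter (the number of consecutive PTO expirations so far) are assumptions made for this illustration; values are in milliseconds.

```python
K_GRANULARITY_MS = 1.0  # RECOMMENDED timer granularity (1ms)


def pto_period(smoothed_rtt: float, rttvar: float, max_ack_delay: float,
               pto_count: int = 0) -> float:
    """Return the PTO period in milliseconds.

    pto_count is a hypothetical counter of consecutive PTO expirations,
    shared across packet number spaces; each expiration doubles the period.
    """
    # PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay
    base = smoothed_rtt + max(4 * rttvar, K_GRANULARITY_MS) + max_ack_delay
    return base * (2 ** pto_count)
```

With made-up values smoothed_rtt = 100, rttvar = 10, and max_ack_delay = 25, the base period is 165 ms; after two consecutive expirations it is 660 ms. When rttvar is very small, the kGranularity floor applies instead of 4*rttvar.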
5.2.2. Handshakes and New Paths 6.2.2. Handshakes and New Paths
Resumed connections over the same network MAY use the previous Resumed connections over the same network MAY use the previous
connection's final smoothed RTT value as the resumed connection's connection's final smoothed RTT value as the resumed connection's
initial RTT. When no previous RTT is available, the initial RTT initial RTT. When no previous RTT is available, the initial RTT
SHOULD be set to 333ms, resulting in a 1 second initial timeout, as SHOULD be set to 333ms, resulting in a 1 second initial timeout, as
recommended in [RFC6298]. recommended in [RFC6298].
A connection MAY use the delay between sending a PATH_CHALLENGE and A connection MAY use the delay between sending a PATH_CHALLENGE and
receiving a PATH_RESPONSE to set the initial RTT (see kInitialRtt in receiving a PATH_RESPONSE to set the initial RTT (see kInitialRtt in
Appendix A.2) for a new path, but the delay SHOULD NOT be considered Appendix A.2) for a new path, but the delay SHOULD NOT be considered
an RTT sample. an RTT sample.
Prior to handshake completion, when few or no RTT samples have been Prior to handshake completion, when few or no RTT samples have been
generated, it is possible that the probe timer expiration is due to generated, it is possible that the probe timer expiration is due to
an incorrect RTT estimate at the client. To allow the client to an incorrect RTT estimate at the client. To allow the client to
improve its RTT estimate, the new packet that it sends MUST be ack- improve its RTT estimate, the new packet that it sends MUST be ack-
eliciting. eliciting.
Initial packets and Handshake packets might never be acknowledged, Initial packets and Handshake packets might never be acknowledged,
but they are removed from bytes in flight when the Initial and but they are removed from bytes in flight when the Initial and
Handshake keys are discarded, as described below in Handshake keys are discarded, as described below in Section 6.4.
Section Section 5.4. When Initial or Handshake keys are discarded, When Initial or Handshake keys are discarded, the PTO and loss
the PTO and loss detection timers MUST be reset, because discarding detection timers MUST be reset, because discarding keys indicates
keys indicates forward progress and the loss detection timer might forward progress and the loss detection timer might have been set for
have been set for a now discarded packet number space. a now discarded packet number space.
5.2.2.1. Before Address Validation 6.2.2.1. Before Address Validation
Until the server has validated the client's address on the path, the Until the server has validated the client's address on the path, the
amount of data it can send is limited to three times the amount of amount of data it can send is limited to three times the amount of
data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If data received, as specified in Section 8.1 of [QUIC-TRANSPORT]. If
no additional data can be sent, the server's PTO timer MUST NOT be no additional data can be sent, the server's PTO timer MUST NOT be
armed until datagrams have been received from the client, because armed until datagrams have been received from the client, because
packets sent on PTO count against the anti-amplification limit. Note packets sent on PTO count against the anti-amplification limit. Note
that the server could fail to validate the client's address even if that the server could fail to validate the client's address even if
0-RTT is accepted. 0-RTT is accepted.
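The interaction between the anti-amplification limit and the server's PTO timer can be sketched as follows. The function names and byte counters are hypothetical; the factor of three is the limit stated above.

```python
AMPLIFICATION_FACTOR = 3  # server may send three times the data received


def send_budget(bytes_received: int, bytes_sent: int) -> int:
    """Bytes an unvalidated server may still send to the client address."""
    return max(0, AMPLIFICATION_FACTOR * bytes_received - bytes_sent)


def may_arm_pto(address_validated: bool, bytes_received: int,
                bytes_sent: int) -> bool:
    """Return True if the server may arm its PTO timer."""
    if address_validated:
        return True
    # Probes count against the limit, so with no budget left the timer
    # stays unarmed until more datagrams arrive from the client.
    return send_budget(bytes_received, bytes_sent) > 0
```

For instance, a server that has received 1200 bytes and sent 3600 has no budget and leaves the timer unarmed; receiving a further 1200-byte datagram restores a 3600-byte budget and allows the timer to be armed.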
skipping to change at page 15, line 45 skipping to change at page 15, line 45
acknowledgement for one of its Handshake or 1-RTT packets, and has acknowledgement for one of its Handshake or 1-RTT packets, and has
not received a HANDSHAKE_DONE frame. If Handshake keys are available not received a HANDSHAKE_DONE frame. If Handshake keys are available
to the client, it MUST send a Handshake packet, and otherwise it MUST to the client, it MUST send a Handshake packet, and otherwise it MUST
send an Initial packet in a UDP datagram of at least 1200 bytes. send an Initial packet in a UDP datagram of at least 1200 bytes.
A client could have received and acknowledged a Handshake packet, A client could have received and acknowledged a Handshake packet,
causing it to discard state for the Initial packet number space, but causing it to discard state for the Initial packet number space, but
not sent any ack-eliciting Handshake packets. In this case, the PTO not sent any ack-eliciting Handshake packets. In this case, the PTO
is set from the current time. is set from the current time.
5.2.3. Speeding Up Handshake Completion 6.2.3. Speeding Up Handshake Completion
When a server receives an Initial packet containing duplicate CRYPTO When a server receives an Initial packet containing duplicate CRYPTO
data, it can assume the client did not receive all of the server's data, it can assume the client did not receive all of the server's
CRYPTO data sent in Initial packets, or the client's estimated RTT is CRYPTO data sent in Initial packets, or the client's estimated RTT is
too small. When a client receives Handshake or 1-RTT packets prior too small. When a client receives Handshake or 1-RTT packets prior
to obtaining Handshake keys, it may assume some or all of the to obtaining Handshake keys, it may assume some or all of the
server's Initial packets were lost. server's Initial packets were lost.
To speed up handshake completion under these conditions, an endpoint To speed up handshake completion under these conditions, an endpoint
MAY send a packet containing unacknowledged CRYPTO data earlier than MAY send a packet containing unacknowledged CRYPTO data earlier than
the PTO expiry, subject to address validation limits; see Section 8.1 the PTO expiry, subject to address validation limits; see Section 8.1
of [QUIC-TRANSPORT]. of [QUIC-TRANSPORT].
Peers can also use coalesced packets to ensure that each datagram Peers can also use coalesced packets to ensure that each datagram
elicits at least one acknowledgement. For example, clients can elicits at least one acknowledgement. For example, clients can
coalesce an Initial packet containing PING and PADDING frames with a coalesce an Initial packet containing PING and PADDING frames with a
0-RTT data packet and a server can coalesce an Initial packet 0-RTT data packet and a server can coalesce an Initial packet
containing a PING frame with one or more packets in its first flight. containing a PING frame with one or more packets in its first flight.
5.2.4. Sending Probe Packets 6.2.4. Sending Probe Packets
When a PTO timer expires, a sender MUST send at least one ack- When a PTO timer expires, a sender MUST send at least one ack-
eliciting packet in the packet number space as a probe, unless there eliciting packet in the packet number space as a probe, unless there
is no data available to send. An endpoint MAY send up to two full- is no data available to send. An endpoint MAY send up to two full-
sized datagrams containing ack-eliciting packets, to avoid an sized datagrams containing ack-eliciting packets, to avoid an
expensive consecutive PTO expiration due to a single lost datagram or expensive consecutive PTO expiration due to a single lost datagram or
to transmit data from multiple packet number spaces. All probe packets to transmit data from multiple packet number spaces. All probe packets
sent on a PTO MUST be ack-eliciting. sent on a PTO MUST be ack-eliciting.
In addition to sending data in the packet number space for which the In addition to sending data in the packet number space for which the
skipping to change at page 17, line 24 skipping to change at page 17, line 24
expiration increases resilience to packet drops, thus reducing the expiration increases resilience to packet drops, thus reducing the
probability of consecutive PTO events. probability of consecutive PTO events.
When the PTO timer expires multiple times and new data cannot be When the PTO timer expires multiple times and new data cannot be
sent, implementations must choose between sending the same payload sent, implementations must choose between sending the same payload
every time or sending different payloads. Sending the same payload every time or sending different payloads. Sending the same payload
may be simpler and ensures the highest priority frames arrive first. may be simpler and ensures the highest priority frames arrive first.
Sending different payloads each time reduces the chances of spurious Sending different payloads each time reduces the chances of spurious
retransmission. retransmission.
5.3. Handling Retry Packets 6.3. Handling Retry Packets
A Retry packet causes a client to send another Initial packet, A Retry packet causes a client to send another Initial packet,
effectively restarting the connection process. A Retry packet effectively restarting the connection process. A Retry packet
indicates that the Initial was received, but not processed. A Retry indicates that the Initial was received, but not processed. A Retry
packet cannot be treated as an acknowledgement, because it does not packet cannot be treated as an acknowledgement, because it does not
indicate that a packet was processed or specify the packet number. indicate that a packet was processed or specify the packet number.
Clients that receive a Retry packet reset congestion control and loss Clients that receive a Retry packet reset congestion control and loss
recovery state, including resetting any pending timers. Other recovery state, including resetting any pending timers. Other
connection state, in particular cryptographic handshake messages, is connection state, in particular cryptographic handshake messages, is
retained; see Section 17.2.5 of [QUIC-TRANSPORT]. retained; see Section 17.2.5 of [QUIC-TRANSPORT].
The client MAY compute an RTT estimate to the server as the time The client MAY compute an RTT estimate to the server as the time
period from when the first Initial was sent to when a Retry or a period from when the first Initial was sent to when a Retry or a
Version Negotiation packet is received. The client MAY use this Version Negotiation packet is received. The client MAY use this
value in place of its default for the initial RTT estimate. value in place of its default for the initial RTT estimate.
5.4. Discarding Keys and Packet State 6.4. Discarding Keys and Packet State
When packet protection keys are discarded (see Section 4.10 of When packet protection keys are discarded (see Section 4.10 of
[QUIC-TLS]), all packets that were sent with those keys can no longer [QUIC-TLS]), all packets that were sent with those keys can no longer
be acknowledged because their acknowledgements cannot be processed be acknowledged because their acknowledgements cannot be processed
anymore. The sender MUST discard all recovery state associated with anymore. The sender MUST discard all recovery state associated with
those packets and MUST remove them from the count of bytes in flight. those packets and MUST remove them from the count of bytes in flight.
Endpoints stop sending and receiving Initial packets once they start Endpoints stop sending and receiving Initial packets once they start
exchanging Handshake packets; see Section 17.2.2.1 of exchanging Handshake packets; see Section 17.2.2.1 of
[QUIC-TRANSPORT]. At this point, recovery state for all in-flight [QUIC-TRANSPORT]. At this point, recovery state for all in-flight
skipping to change at page 18, line 22 skipping to change at page 18, line 22
If a server accepts 0-RTT, but does not buffer 0-RTT packets that If a server accepts 0-RTT, but does not buffer 0-RTT packets that
arrive before Initial packets, early 0-RTT packets will be declared arrive before Initial packets, early 0-RTT packets will be declared
lost, but that is expected to be infrequent. lost, but that is expected to be infrequent.
It is expected that keys are discarded after packets encrypted with It is expected that keys are discarded after packets encrypted with
them would be acknowledged or declared lost. Initial secrets, however, them would be acknowledged or declared lost. Initial secrets, however,
might be destroyed sooner, as soon as handshake keys are available; might be destroyed sooner, as soon as handshake keys are available;
see Section 4.11.1 of [QUIC-TLS]. see Section 4.11.1 of [QUIC-TLS].
6. Congestion Control 7. Congestion Control
This document specifies a congestion controller for QUIC similar to This document specifies a congestion controller for QUIC similar to
TCP NewReno [RFC6582]. TCP NewReno [RFC6582].
The signals QUIC provides for congestion control are generic and are The signals QUIC provides for congestion control are generic and are
designed to support different algorithms. Endpoints can unilaterally designed to support different algorithms. Endpoints can unilaterally
choose a different algorithm to use, such as Cubic [RFC8312]. choose a different algorithm to use, such as Cubic [RFC8312].
If an endpoint uses a different controller than that specified in If an endpoint uses a different controller than that specified in
this document, the chosen controller MUST conform to the congestion this document, the chosen controller MUST conform to the congestion
skipping to change at page 18, line 47 skipping to change at page 18, line 47
TCP, QUIC can detect the loss of these packets and MAY use that TCP, QUIC can detect the loss of these packets and MAY use that
information to adjust the congestion controller or the rate of ACK- information to adjust the congestion controller or the rate of ACK-
only packets being sent, but this document does not describe a only packets being sent, but this document does not describe a
mechanism for doing so. mechanism for doing so.
The algorithm in this document specifies and uses the controller's The algorithm in this document specifies and uses the controller's
congestion window in bytes. congestion window in bytes.
An endpoint MUST NOT send a packet if it would cause bytes_in_flight An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless (see Appendix B.2) to be larger than the congestion window, unless
the packet is sent on a PTO timer expiration; see Section 5.2. the packet is sent on a PTO timer expiration; see Section 6.2.
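The sending rule above amounts to a single check before each transmission. In this sketch the function name and the is_pto_probe flag are hypothetical; all sizes are in bytes.

```python
def can_send(bytes_in_flight: int, packet_size: int, congestion_window: int,
             is_pto_probe: bool = False) -> bool:
    """Return True if sending the packet respects the congestion window."""
    # Packets sent on a PTO timer expiration are exempt from the limit.
    if is_pto_probe:
        return True
    return bytes_in_flight + packet_size <= congestion_window
```

With a 12000-byte window and 11000 bytes in flight, a 1200-byte packet is held back unless it is a PTO probe.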
6.1. Explicit Congestion Notification 7.1. Explicit Congestion Notification
If a path has been verified to support ECN [RFC3168] [RFC8311], QUIC If a path has been verified to support ECN [RFC3168] [RFC8311], QUIC
treats a Congestion Experienced (CE) codepoint in the IP header as a treats a Congestion Experienced (CE) codepoint in the IP header as a
signal of congestion. This document specifies an endpoint's response signal of congestion. This document specifies an endpoint's response
when its peer receives packets with the ECN-CE codepoint. when its peer receives packets with the ECN-CE codepoint.
6.2. Initial and Minimum Congestion Window 7.2. Initial and Minimum Congestion Window
QUIC begins every connection in slow start with the congestion window QUIC begins every connection in slow start with the congestion window
set to an initial value. Endpoints SHOULD use an initial congestion set to an initial value. Endpoints SHOULD use an initial congestion
window of 10 times the maximum datagram size (max_datagram_size), window of 10 times the maximum datagram size (max_datagram_size),
limited to the larger of 14720 or twice the maximum datagram size. limited to the larger of 14720 or twice the maximum datagram size.
This follows the analysis and recommendations in [RFC6928], This follows the analysis and recommendations in [RFC6928],
increasing the byte limit to account for the smaller 8 byte overhead increasing the byte limit to account for the smaller 8 byte overhead
of UDP compared to the 20 byte overhead for TCP. of UDP compared to the 20 byte overhead for TCP.
Prior to validating the client's address, the server can be further Prior to validating the client's address, the server can be further
limited by the anti-amplification limit as specified in Section 8.1 limited by the anti-amplification limit as specified in Section 8.1
of [QUIC-TRANSPORT]. Though the anti-amplification limit can prevent of [QUIC-TRANSPORT]. Though the anti-amplification limit can prevent
the congestion window from being fully utilized and therefore slow the congestion window from being fully utilized and therefore slow
down the increase in congestion window, it does not directly affect down the increase in congestion window, it does not directly affect
the congestion window. the congestion window.
The minimum congestion window is the smallest value the congestion The minimum congestion window is the smallest value the congestion
window can decrease to as a response to loss, ECN-CE, or persistent window can decrease to as a response to loss, ECN-CE, or persistent
congestion. The RECOMMENDED value is 2 * max_datagram_size. congestion. The RECOMMENDED value is 2 * max_datagram_size.
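A minimal sketch of the two recommended values in this section, assuming all quantities are in bytes. The function names are hypothetical; only the constants 10, 14720, and 2 come from the text above.

```python
def initial_congestion_window(max_datagram_size: int) -> int:
    """10 datagrams, limited to the larger of 14720 bytes or 2 datagrams."""
    limit = max(14720, 2 * max_datagram_size)
    return min(10 * max_datagram_size, limit)


def minimum_congestion_window(max_datagram_size: int) -> int:
    """RECOMMENDED floor for the window after loss, ECN-CE, or
    persistent congestion."""
    return 2 * max_datagram_size
```

For example, with a max_datagram_size of 1200 the initial window is 12000 bytes, because the 14720-byte limit is not reached. With very large datagrams the limit becomes twice the datagram size, so the initial window never falls below the minimum window.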
6.3. Slow Start 7.3. Slow Start
While in slow start, QUIC increases the congestion window by the While in slow start, QUIC increases the congestion window by the
number of bytes acknowledged when each acknowledgment is processed, number of bytes acknowledged when each acknowledgment is processed,
resulting in exponential growth of the congestion window. resulting in exponential growth of the congestion window.
QUIC exits slow start upon loss or upon increase in the ECN-CE QUIC exits slow start upon loss or upon increase in the ECN-CE
counter. When slow start is exited, the congestion window halves and counter. When slow start is exited, the congestion window halves and
the slow start threshold is set to the new congestion window. QUIC the slow start threshold is set to the new congestion window. QUIC
re-enters slow start any time the congestion window is less than the re-enters slow start any time the congestion window is less than the
slow start threshold, which only occurs after persistent congestion slow start threshold, which only occurs after persistent congestion
is declared. is declared.
6.4. Congestion Avoidance 7.4. Congestion Avoidance
Slow start exits to congestion avoidance. Congestion avoidance uses Slow start exits to congestion avoidance. Congestion avoidance uses
an Additive Increase Multiplicative Decrease (AIMD) approach that an Additive Increase Multiplicative Decrease (AIMD) approach that
increases the congestion window by one maximum packet size per increases the congestion window by one maximum packet size per
congestion window acknowledged. When a loss or ECN-CE marking is congestion window acknowledged. When a loss or ECN-CE marking is
detected, NewReno halves the congestion window, sets the slow start detected, NewReno halves the congestion window, sets the slow start
threshold to the new congestion window, and then enters the recovery threshold to the new congestion window, and then enters the recovery
period. period.
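The window updates described for slow start and congestion avoidance might be sketched as follows. This is an assumption-laden illustration, not the draft's pseudocode. The class and method names are hypothetical, the per-acknowledgment increment is one way to realize "one maximum packet size per congestion window acknowledged", and the recovery period (which suppresses repeated reductions) is deliberately not modeled.

```python
import math
from dataclasses import dataclass


@dataclass
class NewRenoSketch:
    """Hypothetical window state; all sizes are in bytes."""
    max_datagram_size: int = 1200
    congestion_window: int = 12000
    ssthresh: float = math.inf  # no threshold until the first reduction

    def in_slow_start(self) -> bool:
        return self.congestion_window < self.ssthresh

    def on_ack(self, acked_bytes: int) -> None:
        if self.in_slow_start():
            # Slow start: grow by the number of bytes acknowledged.
            self.congestion_window += acked_bytes
        else:
            # Congestion avoidance: roughly one datagram per window acked.
            self.congestion_window += (
                self.max_datagram_size * acked_bytes // self.congestion_window
            )

    def on_congestion_event(self) -> None:
        # Halve the window, but never go below the minimum window of
        # 2 * max_datagram_size, then set the threshold to the new window.
        floor = 2 * self.max_datagram_size
        self.congestion_window = max(self.congestion_window // 2, floor)
        self.ssthresh = self.congestion_window
```

After on_congestion_event() the window equals the threshold, so in_slow_start() is false and later acknowledgments grow the window additively.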
6.5. Recovery Period 7.5. Recovery Period
A recovery period is entered when loss or ECN-CE marking of a packet A recovery period is entered when loss or ECN-CE marking of a packet
is detected in congestion avoidance after the congestion window and is detected in congestion avoidance after the congestion window and
slow start threshold have been decreased. A recovery period ends slow start threshold have been decreased. A recovery period ends
when a packet sent during the recovery period is acknowledged. This when a packet sent during the recovery period is acknowledged. This
is slightly different from TCP's definition of recovery, which ends is slightly different from TCP's definition of recovery, which ends
when the lost packet that started recovery is acknowledged. when the lost packet that started recovery is acknowledged.
The recovery period aims to limit congestion window reduction to once The recovery period aims to limit congestion window reduction to once
per round trip. Therefore during recovery, the congestion window per round trip. Therefore during recovery, the congestion window
skipping to change at page 20, line 37 skipping to change at page 20, line 37
CE counter. CE counter.
When entering recovery, a single packet MAY be sent even if bytes in When entering recovery, a single packet MAY be sent even if bytes in
flight now exceeds the recently reduced congestion window. This flight now exceeds the recently reduced congestion window. This
speeds up loss recovery if the data in the lost packet is speeds up loss recovery if the data in the lost packet is
retransmitted and is similar to TCP as described in Section 5 of retransmitted and is similar to TCP as described in Section 5 of
[RFC6675]. If further packets are lost while the sender is in [RFC6675]. If further packets are lost while the sender is in
recovery, sending any packets in response MUST obey the congestion recovery, sending any packets in response MUST obey the congestion
window limit. window limit.
6.6. Ignoring Loss of Undecryptable Packets 7.6. Ignoring Loss of Undecryptable Packets
During the handshake, some packet protection keys might not be During the handshake, some packet protection keys might not be
available when a packet arrives and the receiver can choose to drop available when a packet arrives and the receiver can choose to drop
the packet. In particular, Handshake and 0-RTT packets cannot be the packet. In particular, Handshake and 0-RTT packets cannot be
processed until the Initial packets arrive and 1-RTT packets cannot processed until the Initial packets arrive and 1-RTT packets cannot
be processed until the handshake completes. Endpoints MAY ignore the be processed until the handshake completes. Endpoints MAY ignore the
loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived
before the peer had packet protection keys to process those packets. before the peer had packet protection keys to process those packets.
Endpoints MUST NOT ignore the loss of packets that were sent after Endpoints MUST NOT ignore the loss of packets that were sent after
the earliest acknowledged packet in a given packet number space. the earliest acknowledged packet in a given packet number space.
6.7. Probe Timeout 7.7. Probe Timeout
Probe packets MUST NOT be blocked by the congestion controller. A Probe packets MUST NOT be blocked by the congestion controller. A
sender MUST however count these packets as being additionally in sender MUST however count these packets as being additionally in
flight, since these packets add network load without establishing flight, since these packets add network load without establishing
packet loss. Note that sending probe packets might cause the packet loss. Note that sending probe packets might cause the
sender's bytes in flight to exceed the congestion window until an sender's bytes in flight to exceed the congestion window until an
acknowledgement is received that establishes loss or delivery of acknowledgement is received that establishes loss or delivery of
packets. packets.
6.8. Persistent Congestion 7.8. Persistent Congestion
When an ACK frame is received that establishes loss of all in-flight When an ACK frame is received that establishes loss of all in-flight
packets sent over a long enough period of time, the network is packets sent over a long enough period of time, the network is
considered to be experiencing persistent congestion. Commonly, this considered to be experiencing persistent congestion. Commonly, this
can be established by consecutive PTOs, but since the PTO timer is can be established by consecutive PTOs, but since the PTO timer is
reset when a new ack-eliciting packet is sent, an explicit duration reset when a new ack-eliciting packet is sent, an explicit duration
must be used to account for those cases where PTOs do not occur or must be used to account for those cases where PTOs do not occur or
are substantially delayed. The rationale for this threshold is to are substantially delayed. The rationale for this threshold is to
enable a sender to use initial PTOs for aggressive probing, as TCP enable a sender to use initial PTOs for aggressive probing, as TCP
does with Tail Loss Probe (TLP) [RACK], before establishing does with Tail Loss Probe (TLP) [RACK], before establishing
skipping to change at page 22, line 37 skipping to change at page 22, line 37
oldest and the newest packets are acknowledged, the network is oldest and the newest packets are acknowledged, the network is
considered to have experienced persistent congestion. considered to have experienced persistent congestion.
When persistent congestion is established, the sender's congestion When persistent congestion is established, the sender's congestion
window MUST be reduced to the minimum congestion window window MUST be reduced to the minimum congestion window
(kMinimumWindow). This response of collapsing the congestion window (kMinimumWindow). This response of collapsing the congestion window
on persistent congestion is functionally similar to a sender's on persistent congestion is functionally similar to a sender's
response on a Retransmission Timeout (RTO) in TCP [RFC5681] after response on a Retransmission Timeout (RTO) in TCP [RFC5681] after
Tail Loss Probes (TLP) [RACK]. Tail Loss Probes (TLP) [RACK].
6.9. Pacing 7.9. Pacing
This document does not specify a pacer, but it is RECOMMENDED that a This document does not specify a pacer, but it is RECOMMENDED that a
sender pace sending of all in-flight packets based on input from the sender pace sending of all in-flight packets based on input from the
congestion controller. For example, a pacer might distribute the congestion controller. Sending multiple packets into the network
congestion window over the smoothed RTT when used with a window-based without any delay between them creates a packet burst that might
controller, or a pacer might use the rate estimate of a rate-based cause short-term congestion and losses. Implementations MUST either
controller. use pacing or another method to limit such bursts to the initial
congestion window; see Section 7.2.
An implementation should take care to architect its congestion An implementation should take care to architect its congestion
controller to work well with a pacer. For instance, a pacer might controller to work well with a pacer. For instance, a pacer might
wrap the congestion controller and control the availability of the wrap the congestion controller and control the availability of the
congestion window, or a pacer might pace out packets handed to it by congestion window, or a pacer might pace out packets handed to it by
the congestion controller. the congestion controller.
Timely delivery of ACK frames is important for efficient loss Timely delivery of ACK frames is important for efficient loss
recovery. Packets containing only ACK frames SHOULD therefore not be recovery. Packets containing only ACK frames SHOULD therefore not be
paced, to avoid delaying their delivery to the peer. paced, to avoid delaying their delivery to the peer.
skipping to change at page 23, line 31 skipping to change at page 23, line 31
Using a value for "N" that is small, but at least 1 (for example, Using a value for "N" that is small, but at least 1 (for example,
1.25) ensures that variations in round-trip time don't result in 1.25) ensures that variations in round-trip time don't result in
under-utilization of the congestion window. Values of 'N' larger under-utilization of the congestion window. Values of 'N' larger
than 1 ultimately result in sending packets as acknowledgments are than 1 ultimately result in sending packets as acknowledgments are
received rather than when timers fire, provided the congestion window received rather than when timers fire, provided the congestion window
is fully utilized and acknowledgments arrive at regular intervals. is fully utilized and acknowledgments arrive at regular intervals.
Practical considerations, such as packetization, scheduling delays, Practical considerations, such as packetization, scheduling delays,
and computational efficiency, can cause a sender to deviate from this and computational efficiency, can cause a sender to deviate from this
rate over time periods that are much shorter than a round-trip time. rate over time periods that are much shorter than a round-trip time.
Sending multiple packets into the network without any delay between
them creates a packet burst that might cause short-term congestion
and losses. Implementations MUST either use pacing or limit such
bursts to the initial congestion window; see Section 6.2.
One possible implementation strategy for pacing uses a leaky bucket One possible implementation strategy for pacing uses a leaky bucket
algorithm, where the capacity of the "bucket" is limited to the algorithm, where the capacity of the "bucket" is limited to the
maximum burst size and the rate the "bucket" fills is determined by maximum burst size and the rate the "bucket" fills is determined by
the above function. the above function.
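One way to picture the bucket strategy is the sketch below. It assumes the caller supplies the fill rate in bytes per second (for instance, from the pacing rate function referred to above, which is not reproduced here) and passes in timestamps in seconds. The class and method names are hypothetical.

```python
class BucketPacer:
    """Hypothetical pacer: sending credit accrues at a fixed rate and is
    capped at the maximum burst size."""

    def __init__(self, rate_bytes_per_sec: float, max_burst_bytes: int,
                 now: float = 0.0) -> None:
        self.rate = rate_bytes_per_sec
        self.capacity = max_burst_bytes
        self.credit = float(max_burst_bytes)  # start with one full burst
        self.last_update = now

    def _refill(self, now: float) -> None:
        # Add credit for the elapsed time, never exceeding the burst cap.
        elapsed = max(0.0, now - self.last_update)
        self.credit = min(self.capacity, self.credit + elapsed * self.rate)
        self.last_update = now

    def try_send(self, packet_size: int, now: float) -> bool:
        """Consume credit and return True if the packet may be sent now."""
        self._refill(now)
        if self.credit >= packet_size:
            self.credit -= packet_size
            return True
        return False

    def time_until_send(self, packet_size: int, now: float) -> float:
        """Seconds to wait before packet_size bytes of credit exist."""
        self._refill(now)
        deficit = packet_size - self.credit
        return max(0.0, deficit / self.rate)
```

A sender could call time_until_send() to arm a timer when try_send() refuses a packet. Packets containing only ACK frames would bypass the pacer entirely, in line with the guidance earlier in this section.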
6.10. Under-utilizing the Congestion Window 7.10. Under-utilizing the Congestion Window
When bytes in flight is smaller than the congestion window and When bytes in flight is smaller than the congestion window and
sending is not pacing limited, the congestion window is under- sending is not pacing limited, the congestion window is under-
utilized. When this occurs, the congestion window SHOULD NOT be utilized. When this occurs, the congestion window SHOULD NOT be
increased in either slow start or congestion avoidance. This can increased in either slow start or congestion avoidance. This can
happen due to insufficient application data or flow control limits. happen due to insufficient application data or flow control limits.
A sender MAY use the pipeACK method described in Section 4.3 of A sender MAY use the pipeACK method described in Section 4.3 of
[RFC7661] to determine if the congestion window is sufficiently [RFC7661] to determine if the congestion window is sufficiently
utilized. utilized.
A sender that paces packets (see Section 6.9) might delay sending A sender that paces packets (see Section 7.9) might delay sending
packets and not fully utilize the congestion window due to this packets and not fully utilize the congestion window due to this
delay. A sender SHOULD NOT consider itself application limited if it delay. A sender SHOULD NOT consider itself application limited if it
would have fully utilized the congestion window without pacing delay. would have fully utilized the congestion window without pacing delay.
A sender MAY implement alternative mechanisms to update its A sender MAY implement alternative mechanisms to update its
congestion window after periods of under-utilization, such as those congestion window after periods of under-utilization, such as those
proposed for TCP in [RFC7661]. proposed for TCP in [RFC7661].
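As a rough sketch of the rule in this section, a sender might gate window growth on a check like the one below. The function name and the pacing_limited flag are hypothetical, and this simple comparison is not the pipeACK method of [RFC7661].

```python
def may_increase_window(bytes_in_flight: int, congestion_window: int,
                        pacing_limited: bool) -> bool:
    """Return False when the congestion window is under-utilized.

    The window counts as under-utilized when bytes in flight is smaller
    than the window and sending is not being held back by the pacer.
    """
    under_utilized = (bytes_in_flight < congestion_window
                      and not pacing_limited)
    return not under_utilized
```

A sender held back only by pacing delay is not treated as application limited, so growth remains permitted in that case.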
7. Security Considerations 8. Security Considerations
7.1. Congestion Signals 8.1. Congestion Signals
Congestion control fundamentally involves the consumption of signals Congestion control fundamentally involves the consumption of signals
- both loss and ECN codepoints - from unauthenticated entities. On- - both loss and ECN codepoints - from unauthenticated entities. On-
path attackers can spoof or alter these signals. An attacker can path attackers can spoof or alter these signals. An attacker can
cause endpoints to reduce their sending rate by dropping packets, or cause endpoints to reduce their sending rate by dropping packets, or
alter send rate by changing ECN codepoints. alter send rate by changing ECN codepoints.
7.2. Traffic Analysis 8.2. Traffic Analysis
Packets that carry only ACK frames can be heuristically identified by Packets that carry only ACK frames can be heuristically identified by
observing packet size. Acknowledgement patterns may expose observing packet size. Acknowledgement patterns may expose
information about link characteristics or application behavior. information about link characteristics or application behavior.
Endpoints can use PADDING frames or bundle acknowledgments with other Endpoints can use PADDING frames or bundle acknowledgments with other
frames to reduce leaked information. frames to reduce leaked information.
7.3. Misreporting ECN Markings 8.3. Misreporting ECN Markings
A receiver can misreport ECN markings to alter the congestion A receiver can misreport ECN markings to alter the congestion
response of a sender. Suppressing reports of ECN-CE markings could response of a sender. Suppressing reports of ECN-CE markings could
cause a sender to increase their send rate. This increase could cause a sender to increase their send rate. This increase could
result in congestion and loss. result in congestion and loss.
A sender MAY attempt to detect suppression of reports by marking A sender MAY attempt to detect suppression of reports by marking
occasional packets that they send with ECN-CE. If a packet sent with occasional packets that they send with ECN-CE. If a packet sent with
ECN-CE is not reported as having been CE marked when the packet is ECN-CE is not reported as having been CE marked when the packet is
acknowledged, then the sender SHOULD disable ECN for that path. acknowledged, then the sender SHOULD disable ECN for that path.
skipping to change at page 25, line 11 skipping to change at page 25, line 5
their sending rate, which is similar in effect to advertising reduced their sending rate, which is similar in effect to advertising reduced
connection flow control limits and so no advantage is gained by doing connection flow control limits and so no advantage is gained by doing
so. so.
Endpoints choose the congestion controller that they use. Though Endpoints choose the congestion controller that they use. Though
congestion controllers generally treat reports of ECN-CE markings as congestion controllers generally treat reports of ECN-CE markings as
equivalent to loss [RFC8311], the exact response for each controller equivalent to loss [RFC8311], the exact response for each controller
could be different. Failure to correctly respond to information could be different. Failure to correctly respond to information
about ECN markings is therefore difficult to detect. about ECN markings is therefore difficult to detect.
8. IANA Considerations 9. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
9. References 10. References
9.1. Normative References 10.1. Normative References
[QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
QUIC", Work in Progress, Internet-Draft, draft-ietf-quic- QUIC", Work in Progress, Internet-Draft, draft-ietf-quic-
tls-28, 20 May 2020, tls-29, 10 June 2020,
<https://tools.ietf.org/html/draft-ietf-quic-tls-28>. <https://tools.ietf.org/html/draft-ietf-quic-tls-29>.
[QUIC-TRANSPORT] [QUIC-TRANSPORT]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
Multiplexed and Secure Transport", Work in Progress, Multiplexed and Secure Transport", Work in Progress,
Internet-Draft, draft-ietf-quic-transport-28, 20 May 2020, Internet-Draft, draft-ietf-quic-transport-29, 10 June
<https://tools.ietf.org/html/draft-ietf-quic-transport- 2020, <https://tools.ietf.org/html/draft-ietf-quic-
28>. transport-29>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
March 2017, <https://www.rfc-editor.org/info/rfc8085>. March 2017, <https://www.rfc-editor.org/info/rfc8085>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
9.2. Informative References 10.2. Informative References
[FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement: [FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement:
Refining TCP Congestion Control", ACM SIGCOMM , August Refining TCP Congestion Control", ACM SIGCOMM , August
1996. 1996.
[RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK: [RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK:
a time-based fast loss detection algorithm for TCP", Work a time-based fast loss detection algorithm for TCP", Work
in Progress, Internet-Draft, draft-ietf-tcpm-rack-08, 9 in Progress, Internet-Draft, draft-ietf-tcpm-rack-08, 9
March 2020, <http://www.ietf.org/internet-drafts/draft- March 2020, <http://www.ietf.org/internet-drafts/draft-
ietf-tcpm-rack-08.txt>. ietf-tcpm-rack-08.txt>.
skipping to change at page 27, line 28 skipping to change at page 27, line 23
<https://www.rfc-editor.org/info/rfc8311>. <https://www.rfc-editor.org/info/rfc8311>.
[RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
RFC 8312, DOI 10.17487/RFC8312, February 2018, RFC 8312, DOI 10.17487/RFC8312, February 2018,
<https://www.rfc-editor.org/info/rfc8312>. <https://www.rfc-editor.org/info/rfc8312>.
Appendix A. Loss Recovery Pseudocode Appendix A. Loss Recovery Pseudocode
We now describe an example implementation of the loss detection We now describe an example implementation of the loss detection
mechanisms described in Section 5. mechanisms described in Section 6.
A.1. Tracking Sent Packets A.1. Tracking Sent Packets
To correctly implement congestion control, a QUIC sender tracks every To correctly implement congestion control, a QUIC sender tracks every
ack-eliciting packet until the packet is acknowledged or lost. It is ack-eliciting packet until the packet is acknowledged or lost. It is
expected that implementations will be able to access this information expected that implementations will be able to access this information
by packet number and crypto context and store the per-packet fields by packet number and crypto context and store the per-packet fields
(Appendix A.1.1) for loss recovery and congestion control. (Appendix A.1.1) for loss recovery and congestion control.
After a packet is declared lost, the endpoint can track it for an After a packet is declared lost, the endpoint can track it for an
skipping to change at page 28, line 25 skipping to change at page 28, line 17
time_sent: The time the packet was sent. time_sent: The time the packet was sent.
A.2. Constants of Interest A.2. Constants of Interest
Constants used in loss recovery are based on a combination of RFCs, Constants used in loss recovery are based on a combination of RFCs,
papers, and common practice. papers, and common practice.
kPacketThreshold: Maximum reordering in packets before packet kPacketThreshold: Maximum reordering in packets before packet
threshold loss detection considers a packet lost. The value threshold loss detection considers a packet lost. The value
recommended in Section 5.1.1 is 3. recommended in Section 6.1.1 is 3.
kTimeThreshold: Maximum reordering in time before time threshold kTimeThreshold: Maximum reordering in time before time threshold
loss detection considers a packet lost. Specified as an RTT loss detection considers a packet lost. Specified as an RTT
multiplier. The value recommended in Section 5.1.2 is 9/8. multiplier. The value recommended in Section 6.1.2 is 9/8.
kGranularity: Timer granularity. This is a system-dependent value, kGranularity: Timer granularity. This is a system-dependent value,
and Section 5.1.2 recommends a value of 1ms. and Section 6.1.2 recommends a value of 1ms.
kInitialRtt: The RTT used before an RTT sample is taken. The value kInitialRtt: The RTT used before an RTT sample is taken. The value
recommended in Section 5.2.2 is 500ms. recommended in Section 6.2.2 is 500ms.
kPacketNumberSpace: An enum to enumerate the three packet number kPacketNumberSpace: An enum to enumerate the three packet number
spaces. spaces.
enum kPacketNumberSpace { enum kPacketNumberSpace {
Initial, Initial,
Handshake, Handshake,
ApplicationData, ApplicationData,
} }
A.3. Variables of interest A.3. Variables of interest
Variables required to implement the congestion control mechanisms are Variables required to implement the congestion control mechanisms are
described in this section. described in this section.
latest_rtt: The most recent RTT measurement made when receiving an latest_rtt: The most recent RTT measurement made when receiving an
ack for a previously unacked packet. ack for a previously unacked packet.
smoothed_rtt: The smoothed RTT of the connection, computed as smoothed_rtt: The smoothed RTT of the connection, computed as
described in Section 4.3. described in Section 5.3.
rttvar: The RTT variation, computed as described in Section 4.3. rttvar: The RTT variation, computed as described in Section 5.3.
min_rtt: The minimum RTT seen in the connection, ignoring ack delay, min_rtt: The minimum RTT seen in the connection, ignoring ack delay,
as described in Section 4.2. as described in Section 5.2.
max_ack_delay: The maximum amount of time by which the receiver max_ack_delay: The maximum amount of time by which the receiver
intends to delay acknowledgments for packets in the intends to delay acknowledgments for packets in the
ApplicationData packet number space. The actual ack_delay in a ApplicationData packet number space. The actual ack_delay in a
received ACK frame may be larger due to late timers, reordering, received ACK frame may be larger due to late timers, reordering,
or lost ACK frames. or lost ACK frames.
loss_detection_timer: Multi-modal timer used for loss detection. loss_detection_timer: Multi-modal timer used for loss detection.
pto_count: The number of times a PTO has been sent without receiving pto_count: The number of times a PTO has been sent without receiving
an ack. an ack.
time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace]: The time time_of_last_ack_eliciting_packet[kPacketNumberSpace]: The time the
the most recent ack-eliciting packet was sent. most recent ack-eliciting packet was sent.
largest_acked_packet[kPacketNumberSpace]: The largest packet number largest_acked_packet[kPacketNumberSpace]: The largest packet number
acknowledged in the packet number space so far. acknowledged in the packet number space so far.
loss_time[kPacketNumberSpace]: The time at which the next packet in loss_time[kPacketNumberSpace]: The time at which the next packet in
that packet number space will be considered lost based on that packet number space will be considered lost based on
exceeding the reordering window in time. exceeding the reordering window in time.
sent_packets[kPacketNumberSpace]: An association of packet numbers sent_packets[kPacketNumberSpace]: An association of packet numbers
in a packet number space to information about them. Described in in a packet number space to information about them. Described in
skipping to change at page 30, line 14 skipping to change at page 29, line 44
loss_detection_timer.reset() loss_detection_timer.reset()
pto_count = 0 pto_count = 0
latest_rtt = 0 latest_rtt = 0
smoothed_rtt = initial_rtt smoothed_rtt = initial_rtt
rttvar = initial_rtt / 2 rttvar = initial_rtt / 2
min_rtt = 0 min_rtt = 0
max_ack_delay = 0 max_ack_delay = 0
for pn_space in [ Initial, Handshake, ApplicationData ]: for pn_space in [ Initial, Handshake, ApplicationData ]:
largest_acked_packet[pn_space] = infinite largest_acked_packet[pn_space] = infinite
time_of_last_sent_ack_eliciting_packet[pn_space] = 0 time_of_last_ack_eliciting_packet[pn_space] = 0
loss_time[pn_space] = 0 loss_time[pn_space] = 0
A.5. On Sending a Packet A.5. On Sending a Packet
After a packet is sent, information about the packet is stored. The After a packet is sent, information about the packet is stored. The
parameters to OnPacketSent are described in detail above in parameters to OnPacketSent are described in detail above in
Appendix A.1.1. Appendix A.1.1.
Pseudocode for OnPacketSent follows: Pseudocode for OnPacketSent follows:
OnPacketSent(packet_number, pn_space, ack_eliciting, OnPacketSent(packet_number, pn_space, ack_eliciting,
in_flight, sent_bytes): in_flight, sent_bytes):
sent_packets[pn_space][packet_number].packet_number = sent_packets[pn_space][packet_number].packet_number =
packet_number packet_number
sent_packets[pn_space][packet_number].time_sent = now() sent_packets[pn_space][packet_number].time_sent = now()
sent_packets[pn_space][packet_number].ack_eliciting = sent_packets[pn_space][packet_number].ack_eliciting =
ack_eliciting ack_eliciting
sent_packets[pn_space][packet_number].in_flight = in_flight sent_packets[pn_space][packet_number].in_flight = in_flight
if (in_flight): if (in_flight):
if (ack_eliciting): if (ack_eliciting):
time_of_last_sent_ack_eliciting_packet[pn_space] = now() time_of_last_ack_eliciting_packet[pn_space] = now()
OnPacketSentCC(sent_bytes) OnPacketSentCC(sent_bytes)
sent_packets[pn_space][packet_number].size = sent_bytes sent_packets[pn_space][packet_number].size = sent_bytes
SetLossDetectionTimer() SetLossDetectionTimer()
A.6. On Receiving a Datagram A.6. On Receiving a Datagram
When a server is blocked by anti-amplification limits, receiving a When a server is blocked by anti-amplification limits, receiving a
datagram unblocks it, even if none of the packets in the datagram are datagram unblocks it, even if none of the packets in the datagram are
successfully processed. In such a case, the PTO timer will need to successfully processed. In such a case, the PTO timer will need to
be re-armed. be re-armed.
skipping to change at page 32, line 43 skipping to change at page 32, line 24
which is set in the packet and timer events further below. The which is set in the packet and timer events further below. The
function SetLossDetectionTimer defined below shows how the single function SetLossDetectionTimer defined below shows how the single
timer is set. timer is set.
This algorithm may result in the timer being set in the past, This algorithm may result in the timer being set in the past,
particularly if timers wake up late. Timers set in the past fire particularly if timers wake up late. Timers set in the past fire
immediately. immediately.
Pseudocode for SetLossDetectionTimer follows: Pseudocode for SetLossDetectionTimer follows:
GetEarliestTimeAndSpace(times): GetLossTimeAndSpace():
time = times[Initial] time = loss_time[Initial]
space = Initial space = Initial
for pn_space in [ Handshake, ApplicationData ]: for pn_space in [ Handshake, ApplicationData ]:
if (times[pn_space] != 0 && if (time == 0 || loss_time[pn_space] < time):
(time == 0 || times[pn_space] < time) && time = loss_time[pn_space];
# Skip ApplicationData until handshake completion.
(pn_space != ApplicationData ||
IsHandshakeComplete()):
time = times[pn_space];
space = pn_space space = pn_space
return time, space return time, space
GetPtoTimeAndSpace():
duration = (smoothed_rtt + max(4 * rttvar, kGranularity))
* (2 ^ pto_count)
// Arm PTO from now when there are no inflight packets.
if (no in-flight packets):
assert(!PeerCompletedAddressValidation())
if (has handshake keys):
return (now() + duration), Handshake
else:
return (now() + duration), Initial
pto_timeout = infinite
pto_space = Initial
for space in [ Initial, Handshake, ApplicationData ]:
if (no in-flight packets in space):
continue;
if (space == ApplicationData):
// Skip ApplicationData until handshake complete.
if (handshake is not complete):
return pto_timeout, pto_space
// Include max_ack_delay and backoff for ApplicationData.
duration += max_ack_delay * (2 ^ pto_count)
t = time_of_last_ack_eliciting_packet[space] + duration
if (t < pto_timeout):
pto_timeout = t
pto_space = space
return pto_timeout, pto_space
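The GetPtoTimeAndSpace pseudocode added in draft-29 can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the RecoveryState dataclass, its field names, the explicit now argument, and the 1 ms K_GRANULARITY value are all choices made for this example, and the pseudocode's assert on PeerCompletedAddressValidation() is omitted.

```python
import math
from dataclasses import dataclass, field

INITIAL, HANDSHAKE, APPLICATION_DATA = "Initial", "Handshake", "ApplicationData"
SPACES = (INITIAL, HANDSHAKE, APPLICATION_DATA)
K_GRANULARITY = 0.001  # seconds; assumed timer granularity for this sketch


@dataclass
class RecoveryState:
    # Hypothetical container for the variables the pseudocode reads.
    smoothed_rtt: float
    rttvar: float
    max_ack_delay: float
    pto_count: int = 0
    has_handshake_keys: bool = False
    handshake_complete: bool = False
    in_flight: dict = field(default_factory=dict)           # space -> bool
    last_ack_eliciting: dict = field(default_factory=dict)  # space -> send time


def get_pto_time_and_space(s, now):
    backoff = 2 ** s.pto_count
    duration = (s.smoothed_rtt + max(4 * s.rttvar, K_GRANULARITY)) * backoff
    # Arm PTO from now when there are no in-flight packets in any space.
    if not any(s.in_flight.get(sp, False) for sp in SPACES):
        return now + duration, (HANDSHAKE if s.has_handshake_keys else INITIAL)
    pto_timeout, pto_space = math.inf, INITIAL
    for space in SPACES:
        if not s.in_flight.get(space, False):
            continue
        if space == APPLICATION_DATA:
            # Skip ApplicationData until the handshake is complete.
            if not s.handshake_complete:
                return pto_timeout, pto_space
            # Include max_ack_delay, with backoff, for ApplicationData only.
            duration += s.max_ack_delay * backoff
        t = s.last_ack_eliciting[space] + duration
        if t < pto_timeout:
            pto_timeout, pto_space = t, space
    return pto_timeout, pto_space
```

Because ApplicationData is visited last, adding max_ack_delay to duration inside the loop only affects that space, which mirrors the ordering the pseudocode relies on.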
PeerCompletedAddressValidation(): PeerCompletedAddressValidation():
# Assume clients validate the server's address implicitly. # Assume clients validate the server's address implicitly.
if (endpoint is server): if (endpoint is server):
return true return true
# Servers complete address validation when a # Servers complete address validation when a
# protected packet is received. # protected packet is received.
return has received Handshake ACK || return has received Handshake ACK ||
has received 1-RTT ACK || has received 1-RTT ACK ||
has received HANDSHAKE_DONE has received HANDSHAKE_DONE
SetLossDetectionTimer(): SetLossDetectionTimer():
earliest_loss_time, _ = GetEarliestTimeAndSpace(loss_time) earliest_loss_time, _ = GetLossTimeAndSpace()
if (earliest_loss_time != 0): if (earliest_loss_time != 0):
// Time threshold loss detection. // Time threshold loss detection.
loss_detection_timer.update(earliest_loss_time) loss_detection_timer.update(earliest_loss_time)
return return
if (server is at anti-amplification limit): if (server is at anti-amplification limit):
// The server's timer is not set if nothing can be sent. // The server's timer is not set if nothing can be sent.
loss_detection_timer.cancel() loss_detection_timer.cancel()
return return
if (no ack-eliciting packets in flight && if (no ack-eliciting packets in flight &&
PeerCompletedAddressValidation()): PeerCompletedAddressValidation()):
// There is nothing to detect lost, so no timer is set. // There is nothing to detect lost, so no timer is set.
// However, the client needs to arm the timer if the // However, the client needs to arm the timer if the
// server might be blocked by the anti-amplification limit. // server might be blocked by the anti-amplification limit.
loss_detection_timer.cancel() loss_detection_timer.cancel()
return return
// Determine which PN space to arm PTO for. // Determine which PN space to arm PTO for.
sent_time, pn_space = GetEarliestTimeAndSpace( timeout, _ = GetPtoTimeAndSpace()
time_of_last_sent_ack_eliciting_packet) loss_detection_timer.update(timeout)
// Don't arm PTO for ApplicationData until handshake complete.
if (pn_space == ApplicationData &&
handshake is not confirmed):
loss_detection_timer.cancel()
return
if (sent_time == 0):
assert(!PeerCompletedAddressValidation())
sent_time = now()
// Calculate PTO duration
timeout = smoothed_rtt + max(4 * rttvar, kGranularity) +
max_ack_delay
timeout = timeout * (2 ^ pto_count)
loss_detection_timer.update(sent_time + timeout)
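The branch order of the draft-29 SetLossDetectionTimer can be illustrated with a short Python sketch. The Timer class, the boolean parameters, the get_pto_time callable, and the returned mode strings are hypothetical conveniences for this example. In the draft these inputs are read from connection state rather than passed in.

```python
class Timer:
    """Minimal stand-in for loss_detection_timer (an assumption of this sketch)."""

    def __init__(self):
        self.deadline = None

    def update(self, deadline):
        self.deadline = deadline

    def cancel(self):
        self.deadline = None


def set_loss_detection_timer(timer, earliest_loss_time,
                             at_anti_amplification_limit,
                             ack_eliciting_in_flight,
                             peer_completed_address_validation,
                             get_pto_time):
    """Arm or cancel the timer; return which mode was chosen."""
    if earliest_loss_time != 0:
        # Time threshold loss detection takes precedence over PTO.
        timer.update(earliest_loss_time)
        return "loss"
    if at_anti_amplification_limit:
        # The server's timer is not set if nothing can be sent.
        timer.cancel()
        return "cancelled"
    if not ack_eliciting_in_flight and peer_completed_address_validation:
        # Nothing is outstanding and the peer cannot be blocked.
        timer.cancel()
        return "cancelled"
    # Otherwise arm the PTO; get_pto_time is only evaluated on this path.
    timer.update(get_pto_time())
    return "pto"
```

Returning the chosen mode is purely for inspection in tests; an implementation could equally return nothing, as the pseudocode does.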
A.9. On Timeout A.9. On Timeout
When the loss detection timer expires, the timer's mode determines When the loss detection timer expires, the timer's mode determines
the action to be performed. the action to be performed.
Pseudocode for OnLossDetectionTimeout follows: Pseudocode for OnLossDetectionTimeout follows:
OnLossDetectionTimeout(): OnLossDetectionTimeout():
earliest_loss_time, pn_space = earliest_loss_time, pn_space = GetLossTimeAndSpace()
GetEarliestTimeAndSpace(loss_time)
if (earliest_loss_time != 0): if (earliest_loss_time != 0):
// Time threshold loss Detection // Time threshold loss Detection
lost_packets = DetectLostPackets(pn_space) lost_packets = DetectLostPackets(pn_space)
assert(!lost_packets.empty()) assert(!lost_packets.empty())
OnPacketsLost(lost_packets) OnPacketsLost(lost_packets)
SetLossDetectionTimer() SetLossDetectionTimer()
return return
if (bytes_in_flight > 0): if (bytes_in_flight > 0):
// PTO. Send new data if available, else retransmit old data. // PTO. Send new data if available, else retransmit old data.
// If neither is available, send a single PING frame. // If neither is available, send a single PING frame.
_, pn_space = GetEarliestTimeAndSpace( _, pn_space = GetPtoTimeAndSpace()
time_of_last_sent_ack_eliciting_packet)
SendOneOrTwoAckElicitingPackets(pn_space) SendOneOrTwoAckElicitingPackets(pn_space)
else: else:
assert(endpoint is client without 1-RTT keys) assert(endpoint is client without 1-RTT keys)
// Client sends an anti-deadlock packet: Initial is padded // Client sends an anti-deadlock packet: Initial is padded
// to earn more anti-amplification credit, // to earn more anti-amplification credit,
// a Handshake packet proves address ownership. // a Handshake packet proves address ownership.
if (has Handshake keys): if (has Handshake keys):
SendOneAckElicitingHandshakePacket() SendOneAckElicitingHandshakePacket()
else: else:
SendOneAckElicitingPaddedInitialPacket() SendOneAckElicitingPaddedInitialPacket()
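The decision made by the lines of OnLossDetectionTimeout shown here can be summarized in a small Python sketch. The function name, its parameters, and the returned action strings are hypothetical. The sketch only reports which branch would run and does not send packets or touch any state.

```python
def on_loss_detection_timeout(earliest_loss_time, bytes_in_flight,
                              has_handshake_keys):
    """Return the name of the action the timeout handler would take."""
    if earliest_loss_time != 0:
        # Time threshold loss detection: declare packets lost.
        return "detect_lost_packets"
    if bytes_in_flight > 0:
        # PTO: send new data, else retransmit, else a single PING.
        return "send_pto_probe"
    # Client anti-deadlock probe while the server may be blocked.
    if has_handshake_keys:
        return "send_handshake_probe"
    return "send_padded_initial"
```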
skipping to change at page 35, line 48 skipping to change at page 35, line 39
if (loss_time[pn_space] == 0): if (loss_time[pn_space] == 0):
loss_time[pn_space] = unacked.time_sent + loss_delay loss_time[pn_space] = unacked.time_sent + loss_delay
else: else:
loss_time[pn_space] = min(loss_time[pn_space], loss_time[pn_space] = min(loss_time[pn_space],
unacked.time_sent + loss_delay) unacked.time_sent + loss_delay)
return lost_packets return lost_packets
Appendix B. Congestion Control Pseudocode Appendix B. Congestion Control Pseudocode
We now describe an example implementation of the congestion We now describe an example implementation of the congestion
controller described in Section 6. controller described in Section 7.
B.1. Constants of interest B.1. Constants of interest
Constants used in congestion control are based on a combination of Constants used in congestion control are based on a combination of
RFCs, papers, and common practice. RFCs, papers, and common practice.
kInitialWindow: Default limit on the initial bytes in flight as kInitialWindow: Default limit on the initial bytes in flight as
described in Section 6.2. described in Section 7.2.
kMinimumWindow: Minimum congestion window in bytes as described in kMinimumWindow: Minimum congestion window in bytes as described in
Section 6.2. Section 7.2.
kLossReductionFactor: Reduction in congestion window when a new loss kLossReductionFactor: Reduction in congestion window when a new loss
event is detected. The Section 6 section recommends a value of event is detected. The Section 7 section recommends a value of
0.5. 0.5.
kPersistentCongestionThreshold: Period of time for persistent kPersistentCongestionThreshold: Period of time for persistent
congestion to be established, specified as a PTO multiplier. The congestion to be established, specified as a PTO multiplier. The
Section 6.8 section recommends a value of 3. Section 7.8 section recommends a value of 3.
B.2. Variables of interest B.2. Variables of interest
Variables required to implement the congestion control mechanisms are Variables required to implement the congestion control mechanisms are
described in this section. described in this section.
max_datagram_size: The sender's current maximum payload size. Does max_datagram_size: The sender's current maximum payload size. Does
not include UDP or IP overhead. The max datagram size is used for not include UDP or IP overhead. The max datagram size is used for
congestion window computations. An endpoint sets the value of congestion window computations. An endpoint sets the value of
this variable based on its PMTU (see Section 14.1 of this variable based on its PMTU (see Section 14.1 of
skipping to change at page 40, line 13 skipping to change at page 39, line 20
Pseudocode for OnPacketNumberSpaceDiscarded follows: Pseudocode for OnPacketNumberSpaceDiscarded follows:
OnPacketNumberSpaceDiscarded(pn_space): OnPacketNumberSpaceDiscarded(pn_space):
assert(pn_space != ApplicationData) assert(pn_space != ApplicationData)
// Remove any unacknowledged packets from flight. // Remove any unacknowledged packets from flight.
foreach packet in sent_packets[pn_space]: foreach packet in sent_packets[pn_space]:
if packet.in_flight if packet.in_flight
bytes_in_flight -= size bytes_in_flight -= size
sent_packets[pn_space].clear() sent_packets[pn_space].clear()
// Reset the loss detection and PTO timer // Reset the loss detection and PTO timer
time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace] = 0 time_of_last_ack_eliciting_packet[pn_space] = 0
loss_time[pn_space] = 0 loss_time[pn_space] = 0
pto_count = 0 pto_count = 0
SetLossDetectionTimer() SetLossDetectionTimer()
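A minimal Python sketch of OnPacketNumberSpaceDiscarded, following the draft-29 column that indexes the reset by pn_space, might look like this. The Recovery and SentPacket classes and the timer_resets counter are assumptions introduced for illustration, and the sketch reads each packet's own size where the pseudocode writes a bare size.

```python
from dataclasses import dataclass

SPACES = ("Initial", "Handshake", "ApplicationData")


@dataclass
class SentPacket:
    size: int
    in_flight: bool


class Recovery:
    """Hypothetical holder for the per-space recovery state."""

    def __init__(self):
        self.bytes_in_flight = 0
        self.pto_count = 0
        self.sent_packets = {s: {} for s in SPACES}  # space -> {pn: SentPacket}
        self.loss_time = {s: 0 for s in SPACES}
        self.time_of_last_ack_eliciting_packet = {s: 0 for s in SPACES}
        self.timer_resets = 0

    def set_loss_detection_timer(self):
        # Placeholder: a real endpoint would re-arm its timer here.
        self.timer_resets += 1

    def on_packet_number_space_discarded(self, pn_space):
        assert pn_space != "ApplicationData"
        # Remove any unacknowledged packets from flight.
        for packet in self.sent_packets[pn_space].values():
            if packet.in_flight:
                self.bytes_in_flight -= packet.size
        self.sent_packets[pn_space].clear()
        # Reset the loss detection and PTO state for the discarded space only.
        self.time_of_last_ack_eliciting_packet[pn_space] = 0
        self.loss_time[pn_space] = 0
        self.pto_count = 0
        self.set_loss_detection_timer()
```

Indexing the reset by the discarded space leaves the send times of the remaining spaces intact, so a later PTO computation still sees them.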
Appendix C. Change Log Appendix C. Change Log
*RFC Editor's Note:* Please remove this section prior to *RFC Editor's Note:* Please remove this section prior to
publication of a final version of this document. publication of a final version of this document.
Issue and pull request numbers are listed with a leading octothorp. Issue and pull request numbers are listed with a leading octothorp.
C.1. Since draft-ietf-quic-recovery-27 C.1. Since draft-ietf-quic-recovery-28
* Refactored pseudocode to correct PTO calculation (#3564, #3674,
#3681)
C.2. Since draft-ietf-quic-recovery-27
* Added recommendations for speeding up handshake under some loss * Added recommendations for speeding up handshake under some loss
conditions (#3078, #3080) conditions (#3078, #3080)
* PTO count is reset when handshake progress is made (#3272, #3415) * PTO count is reset when handshake progress is made (#3272, #3415)
* PTO count is not reset by a client when the server might be * PTO count is not reset by a client when the server might be
awaiting address validation (#3546, #3551) awaiting address validation (#3546, #3551)
* Recommend repairing losses immediately after entering the recovery * Recommend repairing losses immediately after entering the recovery
period (#3335, #3443) period (#3335, #3443)
* Clarified what loss conditions can be ignored during the handshake * Clarified what loss conditions can be ignored during the handshake
(#3456, #3450) (#3456, #3450)
* Allow, but don't recommend, using RTT from previous connection to * Allow, but don't recommend, using RTT from previous connection to
seed RTT (#3464, #3496) seed RTT (#3464, #3496)
* Recommend use of adaptive loss detection thresholds (#3571, #3572) * Recommend use of adaptive loss detection thresholds (#3571, #3572)
C.2. Since draft-ietf-quic-recovery-26 C.3. Since draft-ietf-quic-recovery-26
No changes. No changes.
C.3. Since draft-ietf-quic-recovery-25 C.4. Since draft-ietf-quic-recovery-25
No significant changes. No significant changes.
C.4. Since draft-ietf-quic-recovery-24 C.5. Since draft-ietf-quic-recovery-24
* Require congestion control of some sort (#3247, #3244, #3248) * Require congestion control of some sort (#3247, #3244, #3248)
* Set a minimum reordering threshold (#3256, #3240) * Set a minimum reordering threshold (#3256, #3240)
* PTO is specific to a packet number space (#3067, #3074, #3066) * PTO is specific to a packet number space (#3067, #3074, #3066)
C.5. Since draft-ietf-quic-recovery-23 C.6. Since draft-ietf-quic-recovery-23
* Define under-utilizing the congestion window (#2630, #2686, #2675) * Define under-utilizing the congestion window (#2630, #2686, #2675)
* PTO MUST send data if possible (#3056, #3057) * PTO MUST send data if possible (#3056, #3057)
* Connection Close is not ack-eliciting (#3097, #3098) * Connection Close is not ack-eliciting (#3097, #3098)
* MUST limit bursts to the initial congestion window (#3160) * MUST limit bursts to the initial congestion window (#3160)
* Define the current max_datagram_size for congestion control * Define the current max_datagram_size for congestion control
(#3041, #3167) (#3041, #3167)
C.6. Since draft-ietf-quic-recovery-22 C.7. Since draft-ietf-quic-recovery-22
* PTO should always send an ack-eliciting packet (#2895) * PTO should always send an ack-eliciting packet (#2895)
* Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886) * Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886)
* Move ACK generation text to transport draft (#1860, #2916) * Move ACK generation text to transport draft (#1860, #2916)
C.7. Since draft-ietf-quic-recovery-21 C.8. Since draft-ietf-quic-recovery-21
* No changes * No changes
C.8. Since draft-ietf-quic-recovery-20 C.9. Since draft-ietf-quic-recovery-20
* Path validation can be used as initial RTT value (#2644, #2687) * Path validation can be used as initial RTT value (#2644, #2687)
* max_ack_delay transport parameter defaults to 0 (#2638, #2646) * max_ack_delay transport parameter defaults to 0 (#2638, #2646)
* Ack Delay only measures intentional delays induced by the * Ack Delay only measures intentional delays induced by the
implementation (#2596, #2786) implementation (#2596, #2786)
C.9. Since draft-ietf-quic-recovery-19 C.10. Since draft-ietf-quic-recovery-19
* Change kPersistentThreshold from an exponent to a multiplier * Change kPersistentThreshold from an exponent to a multiplier
(#2557) (#2557)
* Send a PING if the PTO timer fires and there's nothing to send * Send a PING if the PTO timer fires and there's nothing to send
(#2624) (#2624)
* Set loss delay to at least kGranularity (#2617) * Set loss delay to at least kGranularity (#2617)
* Merge application limited and sending after idle sections. Always * Merge application limited and sending after idle sections. Always
limit burst size instead of requiring resetting CWND to initial limit burst size instead of requiring resetting CWND to initial
skipping to change at page 42, line 27 skipping to change at page 41, line 36
packet is ack-eliciting but the largest_acked is not (#2592) packet is ack-eliciting but the largest_acked is not (#2592)
* Don't arm the handshake timer if there is no handshake data * Don't arm the handshake timer if there is no handshake data
(#2590) (#2590)
* Clarify that the time threshold loss alarm takes precedence over * Clarify that the time threshold loss alarm takes precedence over
the crypto handshake timer (#2590, #2620) the crypto handshake timer (#2590, #2620)
* Change initial RTT to 500ms to align with RFC6298 (#2184) * Change initial RTT to 500ms to align with RFC6298 (#2184)
C.10. Since draft-ietf-quic-recovery-18 C.11. Since draft-ietf-quic-recovery-18
* Change IW byte limit to 14720 from 14600 (#2494) * Change IW byte limit to 14720 from 14600 (#2494)
* Update PTO calculation to match RFC6298 (#2480, #2489, #2490) * Update PTO calculation to match RFC6298 (#2480, #2489, #2490)
* Improve loss detection's description of multiple packet number * Improve loss detection's description of multiple packet number
spaces and pseudocode (#2485, #2451, #2417) spaces and pseudocode (#2485, #2451, #2417)
* Declare persistent congestion even if non-probe packets are sent * Declare persistent congestion even if non-probe packets are sent
and don't make persistent congestion more aggressive than RTO and don't make persistent congestion more aggressive than RTO
verified was (#2365, #2244) verified was (#2365, #2244)
* Move pseudocode to the appendices (#2408) * Move pseudocode to the appendices (#2408)
* What to send on multiple PTOs (#2380) * What to send on multiple PTOs (#2380)
C.11. Since draft-ietf-quic-recovery-17 C.12. Since draft-ietf-quic-recovery-17
* After Probe Timeout discard in-flight packets or send another * After Probe Timeout discard in-flight packets or send another
(#2212, #1965) (#2212, #1965)
* Endpoints discard initial keys as soon as handshake keys are * Endpoints discard initial keys as soon as handshake keys are
available (#1951, #2045) available (#1951, #2045)
* 0-RTT state is discarded when 0-RTT is rejected (#2300) * 0-RTT state is discarded when 0-RTT is rejected (#2300)
* Loss detection timer is cancelled when ack-eliciting frames are in * Loss detection timer is cancelled when ack-eliciting frames are in
skipping to change at page 43, line 23 skipping to change at page 42, line 31
controller (#2138, 2187) controller (#2138, 2187)
* Process ECN counts before marking packets lost (#2142) * Process ECN counts before marking packets lost (#2142)
* Mark packets lost before resetting crypto_count and pto_count * Mark packets lost before resetting crypto_count and pto_count
(#2208, #2209) (#2208, #2209)
* Congestion and loss recovery state are discarded when keys are * Congestion and loss recovery state are discarded when keys are
discarded (#2327) discarded (#2327)
C.12. Since draft-ietf-quic-recovery-16 C.13. Since draft-ietf-quic-recovery-16
* Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP * Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP
and min crypto timeouts; eliminate timeout validation (#2114, and min crypto timeouts; eliminate timeout validation (#2114,
#2166, #2168, #1017) #2166, #2168, #1017)
* Redefine how congestion avoidance in terms of when the period * Redefine how congestion avoidance in terms of when the period
starts (#1928, #1930) starts (#1928, #1930)
* Document what needs to be tracked for packets that are in flight * Document what needs to be tracked for packets that are in flight
(#765, #1724, #1939) (#765, #1724, #1939)
skipping to change at page 43, line 45 skipping to change at page 43, line 4
* Integrate both time and packet thresholds into loss detection * Integrate both time and packet thresholds into loss detection
(#1969, #1212, #934, #1974) (#1969, #1212, #934, #1974)
* Reduce congestion window after idle, unless pacing is used (#2007, * Reduce congestion window after idle, unless pacing is used (#2007,
#2023) #2023)
* Disable RTT calculation for packets that don't elicit * Disable RTT calculation for packets that don't elicit
acknowledgment (#2060, #2078) acknowledgment (#2060, #2078)
* Limit ack_delay by max_ack_delay (#2060, #2099) * Limit ack_delay by max_ack_delay (#2060, #2099)
* Initial keys are discarded once Handshake keys are available * Initial keys are discarded once Handshake keys are available
(#1951, #2045) (#1951, #2045)
* Reorder ECN and loss detection in pseudocode (#2142) * Reorder ECN and loss detection in pseudocode (#2142)
* Only cancel loss detection timer if ack-eliciting packets are in * Only cancel loss detection timer if ack-eliciting packets are in
flight (#2093, #2117) flight (#2093, #2117)
C.13. Since draft-ietf-quic-recovery-14 C.14. Since draft-ietf-quic-recovery-14
* Used max_ack_delay from transport params (#1796, #1782) * Used max_ack_delay from transport params (#1796, #1782)
* Merge ACK and ACK_ECN (#1783) * Merge ACK and ACK_ECN (#1783)
C.14. Since draft-ietf-quic-recovery-13 C.15. Since draft-ietf-quic-recovery-13
* Corrected the lack of ssthresh reduction in CongestionEvent * Corrected the lack of ssthresh reduction in CongestionEvent
pseudocode (#1598) pseudocode (#1598)
* Considerations for ECN spoofing (#1426, #1626) * Considerations for ECN spoofing (#1426, #1626)
* Clarifications for PADDING and congestion control (#837, #838, * Clarifications for PADDING and congestion control (#837, #838,
#1517, #1531, #1540) #1517, #1531, #1540)
* Reduce early retransmission timer to RTT/8 (#945, #1581) * Reduce early retransmission timer to RTT/8 (#945, #1581)
* Packets are declared lost after an RTO is verified (#935, #1582) * Packets are declared lost after an RTO is verified (#935, #1582)
C.15. Since draft-ietf-quic-recovery-12 C.16. Since draft-ietf-quic-recovery-12
* Changes to manage separate packet number spaces and encryption * Changes to manage separate packet number spaces and encryption
levels (#1190, #1242, #1413, #1450) levels (#1190, #1242, #1413, #1450)
* Added ECN feedback mechanisms and handling; new ACK_ECN frame * Added ECN feedback mechanisms and handling; new ACK_ECN frame
(#804, #805, #1372) (#804, #805, #1372)
C.16. Since draft-ietf-quic-recovery-11 C.17. Since draft-ietf-quic-recovery-11
No significant changes. No significant changes.
C.17. Since draft-ietf-quic-recovery-10 C.18. Since draft-ietf-quic-recovery-10
* Improved text on ack generation (#1139, #1159) * Improved text on ack generation (#1139, #1159)
* Make references to TCP recovery mechanisms informational (#1195) * Make references to TCP recovery mechanisms informational (#1195)
* Define time_of_last_sent_handshake_packet (#1171) * Define time_of_last_sent_handshake_packet (#1171)
* Added signal from TLS the data it includes needs to be sent in a * Added signal from TLS the data it includes needs to be sent in a
Retry packet (#1061, #1199) Retry packet (#1061, #1199)
* Minimum RTT (min_rtt) is initialized with an infinite value * Minimum RTT (min_rtt) is initialized with an infinite value
(#1169) (#1169)
C.18. Since draft-ietf-quic-recovery-09 C.19. Since draft-ietf-quic-recovery-09
No significant changes. No significant changes.
C.19. Since draft-ietf-quic-recovery-08 C.20. Since draft-ietf-quic-recovery-08
* Clarified pacing and RTO (#967, #977) * Clarified pacing and RTO (#967, #977)
C.20. Since draft-ietf-quic-recovery-07 C.21. Since draft-ietf-quic-recovery-07
* Include Ack Delay in RTO(and TLP) computations (#981) * Include Ack Delay in RTO(and TLP) computations (#981)
* Ack Delay in SRTT computation (#961) * Ack Delay in SRTT computation (#961)
* Default RTT and Slow Start (#590) * Default RTT and Slow Start (#590)
* Many editorial fixes. * Many editorial fixes.
C.21. Since draft-ietf-quic-recovery-06 C.22. Since draft-ietf-quic-recovery-06
No significant changes. No significant changes.
C.22. Since draft-ietf-quic-recovery-05 C.23. Since draft-ietf-quic-recovery-05
* Add more congestion control text (#776) * Add more congestion control text (#776)
C.23. Since draft-ietf-quic-recovery-04 C.24. Since draft-ietf-quic-recovery-04
No significant changes. No significant changes.
C.24. Since draft-ietf-quic-recovery-03 C.25. Since draft-ietf-quic-recovery-03
No significant changes. No significant changes.
C.25. Since draft-ietf-quic-recovery-02 C.26. Since draft-ietf-quic-recovery-02
* Integrate F-RTO (#544, #409) * Integrate F-RTO (#544, #409)
* Add congestion control (#545, #395) * Add congestion control (#545, #395)
* Require connection abort if a skipped packet was acknowledged * Require connection abort if a skipped packet was acknowledged
(#415) (#415)
* Simplify RTO calculations (#142, #417) * Simplify RTO calculations (#142, #417)
C.26. Since draft-ietf-quic-recovery-01 C.27. Since draft-ietf-quic-recovery-01
* Overview added to loss detection * Overview added to loss detection
* Changes initial default RTT to 100ms * Changes initial default RTT to 100ms
* Added time-based loss detection and fixes early retransmit * Added time-based loss detection and fixes early retransmit
* Clarified loss recovery for handshake packets * Clarified loss recovery for handshake packets
* Fixed references and made TCP references informative * Fixed references and made TCP references informative
C.27. Since draft-ietf-quic-recovery-00 C.28. Since draft-ietf-quic-recovery-00
* Improved description of constants and ACK behavior * Improved description of constants and ACK behavior
C.28. Since draft-iyengar-quic-loss-recovery-01 C.29. Since draft-iyengar-quic-loss-recovery-01
* Adopted as base for draft-ietf-quic-recovery * Adopted as base for draft-ietf-quic-recovery
* Updated authors/editors list * Updated authors/editors list
* Added table of contents * Added table of contents
Appendix D. Contributors Appendix D. Contributors
The IETF QUIC Working Group received an enormous amount of support The IETF QUIC Working Group received an enormous amount of support
 End of changes. 128 change blocks. 
242 lines changed or deleted 251 lines changed or added
