| rfc9937v1.txt | rfc9937.txt | |||
|---|---|---|---|---|
| Internet Engineering Task Force (IETF) M. Mathis | Internet Engineering Task Force (IETF) M. Mathis | |||
| Request for Comments: 9937 | Request for Comments: 9937 | |||
| Obsoletes: 6937 N. Cardwell | Obsoletes: 6937 N. Cardwell | |||
| Category: Standards Track Y. Cheng | Category: Standards Track Y. Cheng | |||
| ISSN: 2070-1721 N. Dukkipati | ISSN: 2070-1721 N. Dukkipati | |||
| Google, Inc. | Google, Inc. | |||
| November 2025 | December 2025 | |||
| Proportional Rate Reduction | Proportional Rate Reduction (PRR) | |||
| Abstract | Abstract | |||
| This document specifies a Standards Track version of the Proportional | This document specifies a Standards Track version of the Proportional | |||
| Rate Reduction (PRR) algorithm that obsoletes the Experimental | Rate Reduction (PRR) algorithm that obsoletes the Experimental | |||
| version described in RFC 6937. PRR regulates the amount of data sent | version described in RFC 6937. PRR regulates the amount of data sent | |||
| by TCP or other transport protocols during fast recovery. PRR | by TCP or other transport protocols during fast recovery. PRR | |||
| accurately regulates the actual flight size through recovery such | accurately regulates the actual flight size through recovery such | |||
| that at the end of recovery it will be as close as possible to the | that at the end of recovery it will be as close as possible to the | |||
| slow start threshold (ssthresh), as determined by the congestion | slow start threshold (ssthresh), as determined by the congestion | |||
| skipping to change at line 143 ¶ | skipping to change at line 143 ¶ | |||
| When inflight is above ssthresh, PRR reduces inflight smoothly toward | When inflight is above ssthresh, PRR reduces inflight smoothly toward | |||
| ssthresh by clocking out transmissions at a rate that is in | ssthresh by clocking out transmissions at a rate that is in | |||
| proportion to both the delivered data and ssthresh. | proportion to both the delivered data and ssthresh. | |||
| When inflight is less than ssthresh, PRR adaptively chooses between | When inflight is less than ssthresh, PRR adaptively chooses between | |||
| one of two Reduction Bounds to limit the total window reduction due | one of two Reduction Bounds to limit the total window reduction due | |||
| to all mechanisms, including transient application stalls and the | to all mechanisms, including transient application stalls and the | |||
| losses themselves. As a baseline, to be cautious when there may be | losses themselves. As a baseline, to be cautious when there may be | |||
| considerable congestion, PRR uses its Conservative Reduction Bound | considerable congestion, PRR uses its Conservative Reduction Bound | |||
| (PRR-CRB), which is strictly packet conserving. When recovery seems | (CRB), which is strictly packet conserving. When recovery seems to | |||
| to be progressing well, PRR uses its Slow Start Reduction Bound (PRR- | be progressing well, PRR uses its Slow Start Reduction Bound (SSRB), | |||
| SSRB), which is more aggressive than PRR-CRB by at most one segment | which is more aggressive than PRR-CRB by at most one segment per ACK. | |||
| per ACK. PRR-CRB meets the Strong Packet Conservation Bound | PRR-CRB meets the Strong Packet Conservation Bound described in | |||
| described in Appendix A; however, when used in real networks as the | Appendix A; however, when used in real networks as the sole approach, | |||
| sole approach, it does not perform as well as the algorithm described | it does not perform as well as the algorithm described in [RFC6675], | |||
| in [RFC6675], which proves to be more aggressive in a significant | which proves to be more aggressive in a significant number of cases. | |||
| number of cases. PRR-SSRB offers a compromise by allowing a | PRR-SSRB offers a compromise by allowing a connection to send one | |||
| connection to send one additional segment per ACK, relative to PRR- | additional segment per ACK, relative to PRR-CRB, in some situations. | |||
| CRB, in some situations. Although PRR-SSRB is less aggressive than | Although PRR-SSRB is less aggressive than [RFC6675] (transmitting | |||
| [RFC6675] (transmitting fewer segments or taking more time to | fewer segments or taking more time to transmit them), it outperforms | |||
| transmit them), it outperforms due to the lower probability of | due to the lower probability of additional losses during recovery. | |||
| additional losses during recovery. | ||||
| The original definition of the packet conservation principle | The original definition of the packet conservation principle | |||
| [Jacobson88] treated packets that are presumed to be lost (e.g., | [Jacobson88] treated packets that are presumed to be lost (e.g., | |||
| marked as candidates for retransmission) as having left the network. | marked as candidates for retransmission) as having left the network. | |||
| This idea is reflected in the inflight estimator used by PRR, but it | This idea is reflected in the inflight estimator used by PRR, but it | |||
| is distinct from the Strong Packet Conservation Bound as described in | is distinct from the Strong Packet Conservation Bound as described in | |||
| Appendix A, which is defined solely on the basis of data arriving at | Appendix A, which is defined solely on the basis of data arriving at | |||
| the receiver. | the receiver. | |||
| This document specifies several main changes from the earlier version | This document specifies several main changes from the earlier version | |||
| skipping to change at line 353 ¶ | skipping to change at line 352 ¶ | |||
| well below ssthresh, leading to bad performance. The performance | well below ssthresh, leading to bad performance. The performance | |||
| could, in some cases, be worse than [RFC6675] recovery, which simply | could, in some cases, be worse than [RFC6675] recovery, which simply | |||
| sets cwnd to ssthresh at the start of recovery. This behavior of | sets cwnd to ssthresh at the start of recovery. This behavior of | |||
| setting cwnd to ssthresh at the end of recovery has been implemented | setting cwnd to ssthresh at the end of recovery has been implemented | |||
| since the first widely deployed TCP PRR implementation in 2011 | since the first widely deployed TCP PRR implementation in 2011 | |||
| [First_TCP_PRR] and is similar to [RFC6675], which specifies setting | [First_TCP_PRR] and is similar to [RFC6675], which specifies setting | |||
| cwnd to ssthresh at the start of recovery. | cwnd to ssthresh at the start of recovery. | |||
| Since [RFC6937] was written, PRR has also been adapted to perform | Since [RFC6937] was written, PRR has also been adapted to perform | |||
| multiplicative window reduction for non-loss-based congestion control | multiplicative window reduction for non-loss-based congestion control | |||
| algorithms, such as for [RFC3168] style Explicit Congestion | algorithms, such as for Explicit Congestion Notification (ECN) as | |||
| Notification (ECN). This can be done by using some parts of the loss | described in [RFC3168]. This can be done by using some parts of the | |||
| recovery state machine (in particular, the RecoveryPoint from | loss recovery state machine (in particular, the RecoveryPoint from | |||
| [RFC6675]) to invoke the PRR ACK processing for exactly one round | [RFC6675]) to invoke the PRR ACK processing for exactly one round | |||
| trip worth of ACKs. However, note that using PRR for cwnd reductions | trip worth of ACKs. However, note that using PRR for cwnd reductions | |||
| for ECN [RFC3168] has been observed, with some approaches to Active | for ECN [RFC3168] has been observed, with some approaches to Active | |||
| Queue Management (AQM), to cause an excess cwnd reduction during ECN- | Queue Management (AQM), to cause an excess cwnd reduction during ECN- | |||
| triggered congestion episodes, as noted in [VCC]. | triggered congestion episodes, as noted in [VCC]. | |||
| 5. Relationships to Other Standards | 5. Relationships to Other Standards | |||
| PRR MAY be used in conjunction with any congestion control algorithm | PRR MAY be used in conjunction with any congestion control algorithm | |||
| that intends to make a multiplicative decrease in its sending rate | that intends to make a multiplicative decrease in its sending rate | |||
| skipping to change at line 456 ¶ | skipping to change at line 455 ¶ | |||
| 6.2. Per-ACK Steps | 6.2. Per-ACK Steps | |||
| On every ACK starting or during fast recovery, excluding the ACK that | On every ACK starting or during fast recovery, excluding the ACK that | |||
| concludes a PRR episode, PRR executes the following steps. | concludes a PRR episode, PRR executes the following steps. | |||
| First, the sender computes DeliveredData, the data sender's best | First, the sender computes DeliveredData, the data sender's best | |||
| estimate of the total number of bytes that the current ACK indicates | estimate of the total number of bytes that the current ACK indicates | |||
| have been delivered to the receiver since the previously received | have been delivered to the receiver since the previously received | |||
| ACK. With SACK, DeliveredData can be computed precisely as the | ACK. With SACK, DeliveredData can be computed precisely as the | |||
| change in SND.UNA, plus the (signed) change in SACK. Thus, in the | change in SND.UNA, plus the signed change in quantity of data marked | |||
| special case when there are no SACKed sequence ranges in the | SACKed in the scoreboard. Thus, in the special case when there are | |||
| scoreboard before or after the ACK, DeliveredData is the change in | no SACKed sequence ranges in the scoreboard before or after the ACK, | |||
| SND.UNA. In recovery without SACK, DeliveredData is estimated to be | DeliveredData is the change in SND.UNA. In recovery without SACK, | |||
| 1 SMSS on receiving a duplicate ACK, and on a subsequent partial or | DeliveredData is estimated to be 1 SMSS on each received duplicate | |||
| full ACK DeliveredData is the change in SND.UNA, minus 1 SMSS for | ACK (i.e., SND.UNA did not change). When SND.UNA advances (i.e., a | |||
| each preceding duplicate ACK. Note that without SACK, a poorly | full or partial ACK), DeliveredData is the change in SND.UNA, minus 1 | |||
| behaved receiver that returns extraneous duplicate ACKs (as described | SMSS for each preceding duplicate ACK. Note that without SACK, a | |||
| in [Savage99]) could attempt to artificially inflate DeliveredData. | poorly behaved receiver that returns extraneous duplicate ACKs (as | |||
| As a mitigation, if not using SACK, then PRR disallows incrementing | described in [Savage99]) could attempt to artificially inflate | |||
| DeliveredData when the total bytes delivered in a PRR episode would | DeliveredData. As a mitigation, if not using SACK, then PRR | |||
| exceed the estimated data outstanding upon entering recovery | disallows incrementing DeliveredData when the total bytes delivered | |||
| (RecoverFS). | in a PRR episode would exceed the estimated data outstanding upon | |||
| entering recovery (RecoverFS). | ||||
| Next, the sender computes inflight, the data sender's best estimate | Next, the sender computes inflight, the data sender's best estimate | |||
| of the number of bytes that are in flight in the network. To | of the number of bytes that are in flight in the network. To | |||
| calculate inflight, connections with SACK enabled and using loss | calculate inflight, connections with SACK enabled and using loss | |||
| detection [RFC6675] MAY use the "pipe" algorithm as specified in | detection [RFC6675] MAY use the "pipe" algorithm as specified in | |||
| [RFC6675]. SACK-enabled connections using RACK-TLP loss detection | [RFC6675]. SACK-enabled connections using RACK-TLP loss detection | |||
| [RFC8985] or other loss detection algorithms MUST calculate inflight | [RFC8985] or other loss detection algorithms MUST calculate inflight | |||
| by starting with SND.NXT - SND.UNA, subtracting out bytes SACKed in | by starting with SND.NXT - SND.UNA, subtracting out bytes SACKed in | |||
| the scoreboard, subtracting out bytes marked lost in the scoreboard, | the scoreboard, subtracting out bytes marked lost in the scoreboard, | |||
| and adding bytes in the scoreboard that have been retransmitted since | and adding bytes in the scoreboard that have been retransmitted since | |||
| skipping to change at line 650 ¶ | skipping to change at line 650 ¶ | |||
| Although the Strong Packet Conservation Bound is very appealing for a | Although the Strong Packet Conservation Bound is very appealing for a | |||
| number of reasons, earlier measurements (in Section 6 of [RFC6675]) | number of reasons, earlier measurements (in Section 6 of [RFC6675]) | |||
| demonstrate that it is less aggressive and does not perform as well | demonstrate that it is less aggressive and does not perform as well | |||
| as [RFC6675], which permits bursts of data when there are bursts of | as [RFC6675], which permits bursts of data when there are bursts of | |||
| losses. PRR-SSRB is a compromise that permits a sender to send one | losses. PRR-SSRB is a compromise that permits a sender to send one | |||
| extra segment per ACK as compared to the Packet Conserving Bound when | extra segment per ACK as compared to the Packet Conserving Bound when | |||
| the ACK indicates the recovery is in good progress without further | the ACK indicates the recovery is in good progress without further | |||
| losses. From the perspective of a strict Packet Conserving Bound, | losses. From the perspective of a strict Packet Conserving Bound, | |||
| PRR-SSRB does indeed open the window during recovery; however, it is | PRR-SSRB does indeed open the window during recovery; however, it is | |||
| significantly less aggressive than [RFC6675] in the presence of burst | significantly less aggressive than [RFC6675] in the presence of burst | |||
| losses. The [RFC6675] "half window of silence" may temporarily | losses. | |||
| reduce queue pressure when congestion control does not reduce the | ||||
| congestion window entering recovery to avoid further losses. The | ||||
| goal of PRR is to minimize the opportunities to lose the self clock | ||||
| by smoothly controlling inflight toward the target set by the | ||||
| congestion control. It is the congestion control's responsibility to | ||||
| avoid a full queue, not PRR. | ||||
| 8. Examples | 8. Examples | |||
| This section illustrates the PRR and [RFC6675] algorithm by showing | This section illustrates the PRR and [RFC6675] algorithm by showing | |||
| their different behaviors for two example scenarios: a connection | their different behaviors for two example scenarios: a connection | |||
| experiencing either a single loss or a burst of 15 consecutive | experiencing either a single loss or a burst of 15 consecutive | |||
| losses. All cases use bulk data transfers (no application pauses), | losses. All cases use bulk data transfers (no application pauses), | |||
| Reno congestion control [RFC5681], and cwnd = FlightSize = inflight = | Reno congestion control [RFC5681], and cwnd = FlightSize = inflight = | |||
| 20 segments, so ssthresh will be set to 10 at the beginning of | 20 segments, so ssthresh will be set to 10 at the beginning of | |||
| recovery. The scenarios use standard Fast Retransmit [RFC5681] and | recovery. The scenarios use standard Fast Retransmit [RFC5681] and | |||
| skipping to change at line 1081 ¶ | skipping to change at line 1075 ¶ | |||
| Monia Ghobadi and Sivasankar Radhakrishnan helped analyze the | Monia Ghobadi and Sivasankar Radhakrishnan helped analyze the | |||
| experiments. Ilpo Jarvinen reviewed the initial implementation. | experiments. Ilpo Jarvinen reviewed the initial implementation. | |||
| Mark Allman, Richard Scheffenegger, Markku Kojo, Mirja Kuehlewind, | Mark Allman, Richard Scheffenegger, Markku Kojo, Mirja Kuehlewind, | |||
| Gorry Fairhurst, Russ Housley, Paul Aitken, Daniele Ceccarelli, and | Gorry Fairhurst, Russ Housley, Paul Aitken, Daniele Ceccarelli, and | |||
| Mohamed Boucadair improved the document through their insightful | Mohamed Boucadair improved the document through their insightful | |||
| reviews and suggestions. | reviews and suggestions. | |||
| Authors' Addresses | Authors' Addresses | |||
| Matt Mathis | Matt Mathis | |||
| Email: ietf@mattmathis.net | Email: matt.mathis@gmail.com | |||
| Neal Cardwell | Neal Cardwell | |||
| Google, Inc. | Google, Inc. | |||
| Email: ncardwell@google.com | Email: ncardwell@google.com | |||
| Yuchung Cheng | Yuchung Cheng | |||
| Google, Inc. | Google, Inc. | |||
| Email: ycheng@google.com | Email: ycheng@google.com | |||
| Nandita Dukkipati | Nandita Dukkipati | |||
| End of changes. 7 change blocks. | ||||
| 39 lines changed or deleted | 33 lines changed or added | |||
This html diff was produced by rfcdiff 1.48.
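The DeliveredData rules in Section 6.2 (right-hand column of the diff) can be sketched in Python as follows. This is a minimal illustration, not the RFC's pseudocode: the function and class names, the SMSS value of 1460 bytes, and the per-ACK deltas passed in by the caller are all hypothetical. Two readings of the text are assumed and marked in the comments: the duplicate-ACK count is reset once SND.UNA advances, and the non-SACK mitigation is applied by dropping an increment that would push the episode total past RecoverFS.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def delivered_data_sack(delta_snd_una: int, delta_sacked: int) -> int:
    """With SACK: change in SND.UNA plus the signed change in the
    quantity of data marked SACKed in the scoreboard."""
    return delta_snd_una + delta_sacked


class NonSackDeliveredData:
    """Estimates DeliveredData per ACK when SACK is not in use."""

    def __init__(self, recover_fs: int) -> None:
        self.recover_fs = recover_fs  # data outstanding on entering recovery
        self.dupacks = 0              # duplicate ACKs since SND.UNA last moved
        self.total_delivered = 0      # bytes credited so far in this episode

    def on_ack(self, delta_snd_una: int) -> int:
        if delta_snd_una == 0:
            # Duplicate ACK: estimate one SMSS delivered.
            estimate = SMSS
            self.dupacks += 1
        else:
            # Full or partial ACK: subtract what the preceding duplicate
            # ACKs were already credited with.
            estimate = max(delta_snd_una - self.dupacks * SMSS, 0)
            self.dupacks = 0  # assumption: the count restarts here
        # Mitigation against extraneous duplicate ACKs (assumed reading):
        # do not credit an increment that would exceed RecoverFS.
        if self.total_delivered + estimate > self.recover_fs:
            estimate = 0
        self.total_delivered += estimate
        return estimate
```

For example, with RecoverFS equal to three segments, a fourth duplicate ACK is credited with zero bytes, which is the effect the mitigation is meant to have against a receiver that sends extraneous duplicate ACKs.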
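The inflight calculation that Section 6.2 requires for SACK-enabled connections using RACK-TLP or other loss detection reduces to simple arithmetic over scoreboard totals. The sketch below assumes the caller already maintains those totals; the function and parameter names are hypothetical, and sequence-number wraparound is ignored for brevity. The diff cuts the source sentence off after "retransmitted since", so the exact condition on retransmitted bytes is left to the caller here rather than guessed.

```python
def estimate_inflight(
    snd_nxt: int,
    snd_una: int,
    sacked_bytes: int,
    lost_bytes: int,
    retransmitted_bytes: int,
) -> int:
    """Estimate the number of bytes in flight in the network.

    Starts from SND.NXT - SND.UNA, subtracts bytes SACKed and bytes
    marked lost in the scoreboard, and adds back the retransmitted
    bytes that the caller counts as being in the network again.
    """
    outstanding = snd_nxt - snd_una  # assumes no sequence wraparound
    return outstanding - sacked_bytes - lost_bytes + retransmitted_bytes
```

With 20 segments outstanding, 3 SACKed, 1 marked lost, and that 1 retransmitted, the estimate is 20 - 3 - 1 + 1 = 17 segments.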
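To illustrate what "clocking out transmissions at a rate that is in proportion to both the delivered data and ssthresh" means while inflight is above ssthresh, the sketch below uses the proportional formula from the full RFC text, which is not part of the excerpt shown in this diff: the per-ACK send quota is the rounded-up value of prr_delivered * ssthresh / RecoverFS, minus what has already been sent in recovery. The function name is hypothetical, quantities are counted in segments rather than bytes to match the Section 8 scenario (RecoverFS = 20, ssthresh = 10), and the reduction-bound branch used when inflight falls below ssthresh is deliberately omitted.

```python
def prr_proportional_sndcnt(
    prr_delivered: int, prr_out: int, ssthresh: int, recover_fs: int
) -> int:
    """Send quota for one ACK in the proportional phase
    (inflight > ssthresh), using integer division rounded up."""
    target_out = (prr_delivered * ssthresh + recover_fs - 1) // recover_fs
    return target_out - prr_out


def simulate(acks: int, ssthresh: int = 10, recover_fs: int = 20) -> list:
    """Each ACK is assumed to deliver exactly one segment."""
    prr_delivered = 0
    prr_out = 0
    sent_per_ack = []
    for _ in range(acks):
        prr_delivered += 1
        sndcnt = prr_proportional_sndcnt(
            prr_delivered, prr_out, ssthresh, recover_fs
        )
        prr_out += sndcnt  # assume the sender uses its whole quota
        sent_per_ack.append(sndcnt)
    return sent_per_ack
```

Calling `simulate(4)` yields one transmission on every other ACK, so inflight shrinks by one segment for every two segments delivered, which is the smooth reduction toward ssthresh that the text describes.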