draft-ietf-ippm-tcp-throughput-tm-13.txt   rfc6349.txt 
Network Working Group B. Constantine
Internet-Draft JDSU Internet Engineering Task Force (IETF) B. Constantine
Intended status: Informational G. Forget Request for Comments: 6349 JDSU
Expires: November 30, 2011 Bell Canada (Ext. Consultant) Category: Informational G. Forget
Ruediger Geib ISSN: 2070-1721 Bell Canada (Ext. Consultant)
R. Geib
Deutsche Telekom Deutsche Telekom
Reinhard Schrage R. Schrage
Schrage Consulting Schrage Consulting
August 2011
May 31, 2011
Framework for TCP Throughput Testing Framework for TCP Throughput Testing
draft-ietf-ippm-tcp-throughput-tm-13.txt
Abstract Abstract
This framework describes a practical methodology for measuring end- This framework describes a practical methodology for measuring end-
to-end TCP Throughput in a managed IP network. The goal is to provide to-end TCP Throughput in a managed IP network. The goal is to
a better indication in regards to user experience. In this framework, provide a better indication in regard to user experience. In this
TCP and IP parameters are specified to optimize TCP throughput. framework, TCP and IP parameters are specified to optimize TCP
Throughput.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Status of this Memo
This Internet-Draft is submitted in full conformance with the Status of This Memo
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering This document is not an Internet Standards Track specification; it is
Task Force (IETF). Note that other groups may also distribute published for informational purposes.
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
This Internet-Draft will expire on November 30, 2011. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6349.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
draft-ietf-ippm-tcp-throughput-tm-13.txt:

   1. Introduction
      1.1 Terminology
      1.2 TCP Equilibrium
   2. Scope and Goals
   3. Methodology
      3.1 Path MTU
      3.2 Round Trip Time (RTT) and Bottleneck Bandwidth (BB)
         3.2.1 Measuring RTT
         3.2.2 Measuring BB
      3.3. Measuring TCP Throughput
         3.3.1 Minimum TCP RWND
   4. TCP Metrics
      4.1 Transfer Time Ratio
         4.1.1 Maximum Achievable TCP Throughput calculation
         4.1.2 Transfer Time and Transfer Time Ratio calculation
      4.2 TCP Efficiency
         4.2.1 TCP Efficiency Percentage calculation
      4.3 Buffer Delay
         4.3.1 Buffer Delay Percentage calculation
   5. Conducting TCP Throughput Tests
      5.1 Single versus Multiple Connections
      5.2 Results Interpretation
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgments
   9. References
      9.1 Normative References
      9.2 Informative References
   Authors' Addresses

rfc6349.txt:

   1. Introduction
      1.1. Requirements Language
      1.2. Terminology
      1.3. TCP Equilibrium
   2. Scope and Goals
   3. Methodology
      3.1. Path MTU
      3.2. Round-Trip Time (RTT) and Bottleneck Bandwidth (BB)
         3.2.1. Measuring RTT
         3.2.2. Measuring BB
      3.3. Measuring TCP Throughput
         3.3.1. Minimum TCP RWND
   4. TCP Metrics
      4.1. Transfer Time Ratio
         4.1.1. Maximum Achievable TCP Throughput Calculation
         4.1.2. TCP Transfer Time and Transfer Time Ratio Calculation
      4.2. TCP Efficiency
         4.2.1. TCP Efficiency Percentage Calculation
      4.3. Buffer Delay
         4.3.1. Buffer Delay Percentage Calculation
   5. Conducting TCP Throughput Tests
      5.1. Single versus Multiple TCP Connections
      5.2. Results Interpretation
   6. Security Considerations
      6.1. Denial-of-Service Attacks
      6.2. User Data Confidentiality
      6.3. Interference with Metrics
   7. Acknowledgments
   8. Normative References
1. Introduction 1. Introduction
In the network industry, the SLA (Service Level Agreement) provided In the network industry, the SLA (Service Level Agreement) provided
to business class customers is generally based upon Layer 2/3 to business-class customers is generally based upon Layer 2/3
criteria such as: Bandwidth, latency, packet loss and delay criteria such as bandwidth, latency, packet loss, and delay
variations (jitter). Network providers are coming to the realization variations (jitter). Network providers are coming to the realization
that Layer 2/3 testing is not enough to adequately ensure end-user's that Layer 2/3 testing is not enough to adequately ensure end-users'
satisfaction. In addition to Layer 2/3 testing, this framework satisfaction. In addition to Layer 2/3 testing, this framework
recommends a methodology for measuring TCP Throughput in order to recommends a methodology for measuring TCP Throughput in order to
provide meaningful results with respect to user experience. provide meaningful results with respect to user experience.
Additionally, business class customers seek to conduct repeatable TCP Additionally, business-class customers seek to conduct repeatable TCP
Throughput tests between locations. Since these organizations rely on Throughput tests between locations. Since these organizations rely
the networks of the providers, a common test methodology with on the networks of the providers, a common test methodology with
predefined metrics would benefit both parties. predefined metrics would benefit both parties.
Note that the primary focus of this methodology is managed business Note that the primary focus of this methodology is managed business-
class IP networks; e.g. those Ethernet terminated services for which class IP networks, e.g., those Ethernet-terminated services for which
organizations are provided an SLA from the network provider. Because organizations are provided an SLA from the network provider. Because
of the SLA, the expectation is that the TCP Throughput should achieve of the SLA, the expectation is that the TCP Throughput should achieve
the guaranteed bandwidth. End-users with "best effort" access could the guaranteed bandwidth. End-users with "best effort" access could
use this methodology, but this framework and its metrics are intended use this methodology, but this framework and its metrics are intended
to be used in a predictable managed IP network. No end-to-end to be used in a predictable managed IP network. No end-to-end
performance can be guaranteed when only the access portion is being performance can be guaranteed when only the access portion is being
provisioned to a specific bandwidth capacity. provisioned to a specific bandwidth capacity.
The intent behind this document is to define a methodology for The intent behind this document is to define a methodology for
testing sustained TCP Layer performance. In this document, the testing sustained TCP Layer performance. In this document, the
achievable TCP Throughput is that amount of data per unit time that achievable TCP Throughput is that amount of data per unit of time
TCP transports when in the TCP Equilibrium state. (See Section 1.2 that TCP transports when in the TCP Equilibrium state. (See
for TCP Equilibrium definition). Throughout this document, maximum Section 1.3 for the TCP Equilibrium definition). Throughout this
achievable throughput refers to the theoretical achievable throughput document, "maximum achievable throughput" refers to the theoretical
when TCP is in the Equilibrium state. achievable throughput when TCP is in the Equilibrium state.
TCP is connection oriented and at the transmitting side it uses a TCP is connection oriented, and at the transmitting side, it uses a
congestion window, (TCP CWND). At the receiving end, TCP uses a congestion window (TCP CWND). At the receiving end, TCP uses a
receive window, (TCP RWND) to inform the transmitting end on how receive window (TCP RWND) to inform the transmitting end on how many
many Bytes it is capable to accept at a given time. Bytes it is capable of accepting at a given time.
Derived from Round Trip Time (RTT) and network Bottleneck Bandwidth Derived from Round-Trip Time (RTT) and network Bottleneck Bandwidth
(BB), the Bandwidth Delay Product (BDP) determines the Send and (BB), the Bandwidth-Delay Product (BDP) determines the Send and
Received Socket buffers sizes required to achieve the maximum TCP Received Socket buffer sizes required to achieve the maximum TCP
Throughput. Then, with the help of slow start and congestion Throughput. Then, with the help of slow start and congestion
avoidance algorithms, a TCP CWND is calculated based on the IP avoidance algorithms, a TCP CWND is calculated based on the IP
network path loss rate. Finally, the minimum value between the network path loss rate. Finally, the minimum value between the
calculated TCP CWND and the TCP RWND advertised by the opposite end calculated TCP CWND and the TCP RWND advertised by the opposite end
will determine how many Bytes can actually be sent by the will determine how many Bytes can actually be sent by the
transmitting side at a given time. transmitting side at a given time.
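As an illustrative sketch only (it is not part of the methodology), the relationships in the paragraph above can be written out in Python. The function names and the example path (a 100 Mbit/s bottleneck with a 25 ms RTT) are assumptions chosen for illustration.

```python
def bdp_bytes(bb_bps: float, rtt_s: float) -> float:
    """Bandwidth-Delay Product in Bytes: BB (bits/s) x RTT (s) / 8."""
    return bb_bps * rtt_s / 8


def usable_window(cwnd_bytes: int, rwnd_bytes: int) -> int:
    """Bytes the sender may have in flight: the smaller of CWND and RWND."""
    return min(cwnd_bytes, rwnd_bytes)


# Hypothetical path: 100 Mbit/s bottleneck, 25 ms round-trip time.
bdp = bdp_bytes(100e6, 0.025)

# Hypothetical windows: CWND has grown to 400,000 Bytes, but the receiver
# only advertises 65,535 Bytes, so RWND is the limiting value.
window = usable_window(400_000, 65_535)
```

With these assumed values, the usable window is well below the BDP, which suggests that the receiver's advertised window, rather than the network path, would limit throughput.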
Both TCP Window sizes (RWND and CWND) may vary during any given TCP Both TCP Window sizes (RWND and CWND) may vary during any given TCP
session, although up to bandwidth limits, larger RWND and larger CWND session, although up to bandwidth limits, larger RWND and larger CWND
will achieve higher throughputs by permitting more in-flight Bytes. will achieve higher throughputs by permitting more in-flight Bytes.
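A rough way to see why a larger window helps only "up to bandwidth limits" is to bound throughput by one window per RTT, capped at the Bottleneck Bandwidth. The Python sketch below makes that simplification explicit; it ignores protocol overhead and loss, and the function name and example values are assumptions for illustration.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_s: float,
                                  bb_bps: float) -> float:
    """Upper bound on throughput in bits/s.

    At most one window of Bytes can be delivered per RTT, and the result
    can never exceed the Bottleneck Bandwidth.
    """
    return min(window_bytes * 8 / rtt_s, bb_bps)


# Hypothetical path: 25 ms RTT, 100 Mbit/s bottleneck.
small = window_limited_throughput_bps(65_535, 0.025, 100e6)   # window-limited
large = window_limited_throughput_bps(400_000, 0.025, 100e6)  # capped at BB
```

In this example, the small window limits throughput to roughly 21 Mbit/s, while the large window is capped by the 100 Mbit/s bottleneck.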
At both ends of the TCP connection and for each socket, there are At both ends of the TCP connection and for each socket, there are
default buffer sizes. There are also kernel enforced maximum buffer default buffer sizes. There are also kernel-enforced maximum buffer
sizes. These buffer sizes can be adjusted at both ends (transmitting sizes. These buffer sizes can be adjusted at both ends (transmitting
and receiving). Some TCP/IP stack implementations use Receive Window and receiving). Some TCP/IP stack implementations use Receive Window
Auto-Tuning, although in order to obtain the maximum throughput it is Auto-Tuning, although, in order to obtain the maximum throughput, it
critical to use large enough TCP Send and Receive Socket Buffer is critical to use large enough TCP Send and Receive Socket Buffer
sizes. In fact, they SHOULD be equal to or greater than BDP. sizes. In fact, they SHOULD be equal to or greater than BDP.
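As a hedged illustration of adjusting these buffers from an application, the Python sketch below requests Send and Receive Socket Buffers of at least the BDP and reads back what the operating system granted. The helper name and the BDP value are assumptions. The granted size depends on the operating system: it may be capped by a kernel-enforced maximum, and some stacks report a value larger than the one requested.

```python
import socket


def request_socket_buffers(sock: socket.socket,
                           size_bytes: int) -> tuple[int, int]:
    """Request Send/Receive Socket Buffers and return the granted sizes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv


bdp = 312_500  # Bytes; assumed BDP of a 100 Mbit/s path with a 25 ms RTT

# Buffers are requested before the connection is established.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    snd, rcv = request_socket_buffers(s, bdp)
    if snd < bdp or rcv < bdp:
        print("granted buffers are below the BDP; check the kernel maximum")
```

Note that on some stacks, setting the receive buffer explicitly may switch off Receive Window Auto-Tuning for that socket, so the choice between a fixed size and auto-tuning should be made deliberately.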
Many variables are involved in TCP Throughput performance, but this Many variables are involved in TCP Throughput performance, but this
methodology focuses on: methodology focuses on the following:
- BB (Bottleneck Bandwidth) - BB (Bottleneck Bandwidth)
- RTT (Round Trip Time)
- RTT (Round-Trip Time)
- Send and Receive Socket Buffers - Send and Receive Socket Buffers
- Minimum TCP RWND - Minimum TCP RWND
- Path MTU (Maximum Transmission Unit) - Path MTU (Maximum Transmission Unit)
This methodology proposes TCP testing that SHOULD be performed in This methodology proposes TCP testing that SHOULD be performed in
addition to traditional Layer 2/3 type tests. In fact, Layer 2/3 addition to traditional tests of the Layer 2/3 type. In fact, Layer
tests are REQUIRED to verify the integrity of the network before 2/3 tests are REQUIRED to verify the integrity of the network before
conducting TCP tests. Examples include iperf (UDP mode) and manual conducting TCP tests. Examples include "iperf" (UDP mode) and manual
packet layer test techniques where packet throughput, loss, and delay packet-layer test techniques where packet throughput, loss, and delay
measurements are conducted. When available, standardized testing measurements are conducted. When available, standardized testing
similar to [RFC2544] but adapted for use in operational networks MAY similar to [RFC2544], but adapted for use in operational networks,
be used. MAY be used.
Note: [RFC2544] was never meant to be used outside a lab environment. Note: [RFC2544] was never meant to be used outside a lab environment.
Sections 2 and 3 of this document provides a general overview of the Sections 2 and 3 of this document provide a general overview of the
proposed methodology. Section 4 defines the metrics while Section 5 proposed methodology. Section 4 defines the metrics, while Section 5
explains how to conduct the tests and interpret the results. explains how to conduct the tests and interpret the results.
1.1 Terminology 1.1. Requirements Language
The common definitions used in this methodology are: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
- TCP Throughput Test Device (TCP TTD), refers to compliant TCP 1.2. Terminology
host that generates traffic and measures metrics as defined in
this methodology. i.e. a dedicated communications test instrument. The common definitions used in this methodology are as follows:
- Customer Provided Equipment (CPE), refers to customer owned
equipment (routers, switches, computers, etc.) - TCP Throughput Test Device (TCP TTD) refers to a compliant TCP host
- Customer Edge (CE), refers to provider owned demarcation device. that generates traffic and measures metrics as defined in this
- Provider Edge (PE), refers to provider's distribution equipment. methodology, i.e., a dedicated communications test instrument.
- Bottleneck Bandwidth (BB), lowest bandwidth along the complete
path. Bottleneck Bandwidth and Bandwidth are used synonymously - Customer Provided Equipment (CPE) refers to customer-owned
in this document. Most of the time the Bottleneck Bandwidth is equipment (routers, switches, computers, etc.).
in the access portion of the wide area network (CE - PE).
- Provider (P), refers to provider core network equipment. - Customer Edge (CE) refers to a provider-owned demarcation device.
- Network Under Test (NUT), refers to the tested IP network path.
- Round Trip Time (RTT), is the elapsed time between the clocking in - Provider Edge (PE) refers to a provider's distribution equipment.
- Bottleneck Bandwidth (BB) refers to the lowest bandwidth along the
complete path. "Bottleneck Bandwidth" and "Bandwidth" are used
synonymously in this document. Most of the time, the Bottleneck
Bandwidth is in the access portion of the wide-area network
(CE - PE).
- Provider (P) refers to provider core network equipment.
- Network Under Test (NUT) refers to the tested IP network path.
- Round-Trip Time (RTT) is the elapsed time between the clocking in
of the first bit of a TCP segment sent and the receipt of the last of the first bit of a TCP segment sent and the receipt of the last
bit of the corresponding TCP Acknowledgment. bit of the corresponding TCP Acknowledgment.
- Bandwidth Delay Product (BDP), refers to the product of a data
link's capacity (in bits per second) and its end-to-end delay
(in seconds).
Figure 1.1 Devices, Links and Paths - Bandwidth-Delay Product (BDP) refers to the product of a data
link's capacity (in bits per second) and its end-to-end delay (in
seconds).
   +---+ +----+ +----+  +----+ +---+  +---+ +----+  +----+ +----+ +---+
   |TCP|-| CPE|-| CE |--| PE |-| P |--| P |-| PE |--| CE |-| CPE|-|TCP|
   |TTD| |    | |    |BB|    | |   |  |   | |    |BB|    | |    | |TTD|
   +---+ +----+ +----+  +----+ +---+  +---+ +----+  +----+ +----+ +---+
         <------------------------ NUT ------------------------->
     R >-----------------------------------------------------------|
     T                                                             |
     T <-----------------------------------------------------------|
Note that the NUT may be built with of a variety of devices including Figure 1.2. Devices, Links, and Paths
but not limited to, load balancers, proxy servers or WAN acceleration
appliances. The detailed topology of the NUT SHOULD be well known
when conducting the TCP Throughput tests, although this methodology
makes no attempt to characterize specific network architectures.
1.2 TCP Equilibrium Note that the NUT may be built with a variety of devices including,
but not limited to, load balancers, proxy servers, or WAN
acceleration appliances. The detailed topology of the NUT SHOULD be
well-known when conducting the TCP Throughput tests, although this
methodology makes no attempt to characterize specific network
architectures.
1.3. TCP Equilibrium
TCP connections have three (3) fundamental congestion window phases, TCP connections have three (3) fundamental congestion window phases,
which are depicted in Figure 1.2. which are depicted in Figure 1.3.
1 - The Slow Start phase, which occurs at the beginning of a TCP 1. The Slow Start phase, which occurs at the beginning of a TCP
transmission or after a retransmission time out. transmission or after a retransmission Time-Out.
2 - The Congestion Avoidance phase, during which TCP ramps up to 2. The Congestion Avoidance phase, during which TCP ramps up to
establish the maximum achievable throughput. It is important to note establish the maximum achievable throughput. It is important to
that retransmissions are a natural by-product of the TCP congestion note that retransmissions are a natural by-product of the TCP
avoidance algorithm as it seeks to achieve maximum throughput. congestion avoidance algorithm as it seeks to achieve maximum
throughput.
3 - The Loss Recovery phase, which could include Fast Retransmit 3. The Loss Recovery phase, which could include Fast Retransmit
(Tahoe) or Fast Recovery (Reno & New Reno). When packet loss occurs, (Tahoe) or Fast Recovery (Reno and New Reno). When packet loss
Congestion Avoidance phase transitions either to Fast Retransmission occurs, the Congestion Avoidance phase transitions either to Fast
or Fast Recovery depending upon the TCP implementation. If a Time-Out Retransmission or Fast Recovery, depending upon the TCP
occurs, TCP transitions back to the Slow Start phase. implementation. If a Time-Out occurs, TCP transitions back to the
Slow Start phase.
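The three phases listed above can be sketched as a toy, per-RTT simulation in Python. This is a deliberately simplified Reno-like model and not a faithful TCP implementation: the window is counted in segments, a loss event halves the window (Fast Recovery), and a Time-Out returns the connection to Slow Start with an adjusted ssthresh. The function name, parameters, and example values are assumptions for illustration.

```python
def simulate_cwnd(rounds: int, ssthresh: int,
                  loss_rounds=frozenset(), timeout_rounds=frozenset(),
                  initial_cwnd: int = 1) -> list[int]:
    """Return the CWND (in segments) observed at the start of each RTT round."""
    cwnd = initial_cwnd
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in timeout_rounds:
            # Time-Out: adjust ssthresh and fall back to Slow Start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = initial_cwnd
        elif r in loss_rounds:
            # Loss Recovery (Reno-style): halve the window.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow Start: double per RTT, up to ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion Avoidance: grow by one segment per RTT.
            cwnd += 1
    return history


# Slow Start to 8 segments, Congestion Avoidance, then a loss in round 5.
trace = simulate_cwnd(8, ssthresh=8, loss_rounds={5})
# trace == [1, 2, 4, 8, 9, 10, 5, 6]
```

A Tahoe-style sender would instead reset the window to its minimum after Fast Retransmit; that variant is not modeled here.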
              /\  |
              /\  |High ssthresh  TCP CWND                         TCP
              /\  |Loss Event *   halving    3-Loss Recovery       Equilibrium
               T  |          * \  upon loss
               h  |          *  \    /  \        Time-Out            Adjusted
               r  |          *   \  /    \      +--------+         * ssthresh
            T  o  |          *    \/      \    / Multiple|        *
            C  u  |          * 2-Congestion\  /  Loss    |        *
            P  g  |         *  Avoidance    \/   Event   |       *
               h  |        *              Half           |     *
               p  |      *                TCP CWND       | * 1-Slow Start
               u  | * 1-Slow Start                      Min TCP CWND after T-O
               t  +-----------------------------------------------------------
                    Time > > > > > > > > > > > > > > > > > > > > > > > > > >

Note: ssthresh = Slow Start threshold.
Figure 1.2 TCP CWND Phases Figure 1.3. TCP CWND Phases
A well tuned and managed IP network with appropriate TCP adjustments A well-tuned and well-managed IP network with appropriate TCP
in the IP hosts and applications should perform very close to the adjustments in the IP hosts and applications should perform very
BB when TCP is in the Equilibrium state. close to the BB when TCP is in the Equilibrium state.
This TCP methodology provides guidelines to measure the maximum This TCP methodology provides guidelines to measure the maximum
achievable TCP Throughput when TCP is in the Equilibrium state. achievable TCP Throughput when TCP is in the Equilibrium state. All
All maximum achievable TCP Throughputs specified in Section 3.3 are maximum achievable TCP Throughputs specified in Section 3.3 are with
with respect to this condition. respect to this condition.
It is important to clarify the interaction between the sender's Send It is important to clarify the interaction between the sender's Send
Socket Buffer and the receiver's advertised TCP RWND Size. TCP test Socket Buffer and the receiver's advertised TCP RWND size. TCP test
programs such as iperf, ttcp, etc. allows the sender to control the programs such as "iperf", "ttcp", etc. allow the sender to control
quantity of TCP Bytes transmitted and unacknowledged (in-flight), the quantity of TCP Bytes transmitted and unacknowledged (in-flight),
commonly referred to as the Send Socket Buffer. This is done commonly referred to as the Send Socket Buffer. This is done
independently of the TCP RWND Size advertised by the receiver. independently of the TCP RWND size advertised by the receiver.
2. Scope and Goals 2. Scope and Goals
Before defining the goals, it is important to clearly define the Before defining the goals, it is important to clearly define the
areas that are out-of-scope. areas that are out of scope.
- This methodology is not intended to predict the TCP Throughput - This methodology is not intended to predict the TCP Throughput
during the transient stages of a TCP connection, such as during the during the transient stages of a TCP connection, such as during the
slow start phase. Slow Start phase.
- This methodology is not intended to definitively benchmark TCP - This methodology is not intended to definitively benchmark TCP
implementations of one OS to another, although some users may find implementations of one OS to another, although some users may find
value in conducting qualitative experiments. value in conducting qualitative experiments.
- This methodology is not intended to provide detailed diagnosis - This methodology is not intended to provide detailed diagnosis of
of problems within end-points or within the network itself as problems within endpoints or within the network itself as related
related to non-optimal TCP performance, although results to non-optimal TCP performance, although results interpretation for
interpretation for each test step may provide insights to potential each test step may provide insights to potential issues.
issues.
- This methodology does not propose to operate permanently with high - This methodology does not propose to operate permanently with high
measurement loads. TCP performance and optimization within measurement loads. TCP performance and optimization within
operational networks MAY be captured and evaluated by using data operational networks MAY be captured and evaluated by using data
from the "TCP Extended Statistics MIB" [RFC4898]. from the "TCP Extended Statistics MIB" [RFC4898].
In contrast to the above exclusions, the primary goal is to define a In contrast to the above exclusions, the primary goal is to define a
method to conduct a practical end-to-end assessment of sustained method to conduct a practical end-to-end assessment of sustained TCP
TCP performance within a managed business class IP network. Another performance within a managed business-class IP network. Another key
key goal is to establish a set of "best practices" that a non-TCP goal is to establish a set of "best practices" that a non-TCP expert
expert SHOULD apply when validating the ability of a managed IP SHOULD apply when validating the ability of a managed IP network to
network to carry end-user TCP applications. carry end-user TCP applications.
Specific goals are to: Specific goals are to:
- Provide a practical test approach that specifies tunable parameters - Provide a practical test approach that specifies tunable parameters
(such as MTU (Maximum Transmit Unit) and Socket Buffer sizes) and how (such as MTU (Maximum Transmission Unit) and Socket Buffer sizes)
these affect the outcome of TCP performances over an IP network. and how these affect the outcome of TCP performance over an IP
network.
- Provide specific test conditions like link speed, RTT, MTU, Socket - Provide specific test conditions such as link speed, RTT, MTU,
Buffer sizes and achievable TCP Throughput when TCP is in the Socket Buffer sizes, and achievable TCP Throughput when TCP is in
Equilibrium state. For guideline purposes, provide examples of the Equilibrium state. For guideline purposes, provide examples of
test conditions and their maximum achievable TCP Throughput. test conditions and their maximum achievable TCP Throughput.
Section 1.2 provides specific details concerning the definition of Section 1.3 provides specific details concerning the definition of
TCP Equilibrium within this methodology while Section 3 provides TCP Equilibrium within this methodology, while Section 3 provides
specific test conditions with examples. specific test conditions with examples.
- Define three (3) basic metrics to compare the performance of TCP - Define three (3) basic metrics to compare the performance of TCP
connections under various network conditions. See Section 4. connections under various network conditions. See Section 4.
- In test situations where the recommended procedure does not yield - Provide some areas within the end host or the network that SHOULD
the maximum achievable TCP Throughput, this methodology provides be considered for investigation in test situations where the
some areas within the end host or the network that SHOULD be recommended procedure does not yield the maximum achievable TCP
considered for investigation. Although again, this methodology Throughput. However, this methodology is not intended to provide
is not intended to provide detailed diagnosis on these issues. detailed diagnosis on these issues. See Section 5.2.
See Section 5.2.
3. Methodology 3. Methodology
This methodology is intended for operational and managed IP networks. This methodology is intended for operational and managed IP networks.
A multitude of network architectures and topologies can be tested. A multitude of network architectures and topologies can be tested.
The diagram in Figure 1.1 is very general and is only there to The diagram in Figure 1.2 is very general and is only provided to
illustrate typical segmentation within end-user and network provider illustrate typical segmentation within end-user and network provider
domains. domains.
Also, as stated earlier in Section 1, it is considered best practice Also, as stated in Section 1, it is considered best practice to
to verify the integrity of the network by conducting Layer 2/3 tests verify the integrity of the network by conducting Layer 2/3 tests
such as [RFC2544] or other methods of network stress tests. such as [RFC2544] or other methods of network stress tests; although
Although, it is important to mention here that [RFC2544] was never it is important to mention here that [RFC2544] was never meant to be
meant to be used outside a lab environment. used outside a lab environment.
It is not possible to make an accurate TCP Throughput measurement It is not possible to make an accurate TCP Throughput measurement
when the network is dysfunctional. In particular, if the network is when the network is dysfunctional. In particular, if the network is
exhibiting high packet loss and/or high jitter, then TCP Layer exhibiting high packet loss and/or high jitter, then TCP Layer
Throughput testing will not be meaningful. As a guideline 5% packet Throughput testing will not be meaningful. As a guideline, 5% packet
loss and/or 150 ms of jitter may be considered too high for an loss and/or 150 ms of jitter may be considered too high for an
accurate measurement. accurate measurement.
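The guideline above could be applied as a simple pre-check before starting TCP Throughput tests. The Python sketch below is only an illustration: the thresholds come from the guideline in the text, while the function and constant names are assumptions, and the loss and jitter inputs are expected to come from prior Layer 2/3 measurements.

```python
MAX_LOSS_PCT = 5.0      # guideline: 5% packet loss is considered too high
MAX_JITTER_MS = 150.0   # guideline: 150 ms of jitter is considered too high


def network_ready_for_tcp_test(loss_pct: float, jitter_ms: float) -> bool:
    """Return True only if both loss and jitter are below the guideline values."""
    return loss_pct < MAX_LOSS_PCT and jitter_ms < MAX_JITTER_MS


# Hypothetical Layer 2/3 results: 0.1% loss and 3 ms of jitter.
ready = network_ready_for_tcp_test(0.1, 3.0)
```

A result of False indicates that the network should be repaired or re-verified at Layer 2/3 before any TCP Layer measurement is attempted.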
TCP Throughput testing may require cooperation between the end-user TCP Throughput testing may require cooperation between the end-user
customer and the network provider. As an example, in an MPLS (Multi- customer and the network provider. As an example, in an MPLS
Protocol Label Switching) network architecture, the testing SHOULD be (Multiprotocol Label Switching) network architecture, the testing
conducted either on the CPE or on the CE device and not on the PE SHOULD be conducted either on the CPE or on the CE device and not on
(Provider Edge) router. the PE (Provider Edge) router.
The following represents the sequential order of steps for this The following represents the sequential order of steps for this
testing methodology: testing methodology:
1 - Identify the Path MTU. Packetization Layer Path MTU Discovery 1. Identify the Path MTU. Packetization Layer Path MTU Discovery
or PLPMTUD, [RFC4821], SHOULD be conducted. It is important to (PLPMTUD) [RFC4821] SHOULD be conducted. It is important to
identify the path MTU so that the TCP TTD is configured properly to identify the path MTU so that the TCP TTD is configured properly
avoid fragmentation. to avoid fragmentation.
2 - Baseline Round Trip Time and Bandwidth. This step establishes the 2. Baseline Round-Trip Time and Bandwidth. This step establishes the
inherent, non-congested Round Trip Time (RTT) and the Bottleneck inherent, non-congested Round-Trip Time (RTT) and the Bottleneck
Bandwidth (BB) of the end-to-end network path. These measurements Bandwidth (BB) of the end-to-end network path. These measurements
are used to provide estimates of the TCP RWND and Send Socket Buffer are used to provide estimates of the TCP RWND and Send Socket
Sizes that SHOULD be used during subsequent test steps. Buffer sizes that SHOULD be used during subsequent test steps.
3 - TCP Connection Throughput Tests. With baseline measurements 3. TCP Connection Throughput Tests. With baseline measurements of
of Round Trip Time and Bottleneck Bandwidth, single and multiple TCP Round-Trip Time and Bottleneck Bandwidth, single- and multiple-
connection throughput tests SHOULD be conducted to baseline network TCP-connection throughput tests SHOULD be conducted to baseline
performances. network performance.
These three (3) steps are detailed in Sections 3.1 - 3.3. These three (3) steps are detailed in Sections 3.1 to 3.3.
Important to note are some of the key characteristics and Important to note are some of the key characteristics and
considerations for the TCP test instrument. The test host MAY be a considerations for the TCP test instrument. The test host MAY be a
standard computer or a dedicated communications test instrument. standard computer or a dedicated communications test instrument. In
In both cases, it MUST be capable of emulating both a client and a both cases, it MUST be capable of emulating both a client and a
server. server.
The following criteria SHOULD be considered when selecting whether The following criteria SHOULD be considered when selecting whether
the TCP test host can be a standard computer or has to be a dedicated the TCP test host can be a standard computer or has to be a dedicated
communications test instrument: communications test instrument:
- TCP implementation used by the test host, OS version, i.e. LINUX OS - TCP implementation used by the test host, OS version (e.g., LINUX
kernel using TCP New Reno, TCP options supported, etc. These will OS kernel using TCP New Reno), TCP options supported, etc. will
obviously be more important when using dedicated communications test obviously be more important when using dedicated communications
instruments where the TCP implementation may be customized or tuned test instruments where the TCP implementation may be customized or
to run in higher performance hardware. When a compliant TCP TTD is tuned to run in higher-performance hardware. When a compliant TCP
used, the TCP implementation SHOULD be identified in the test TTD is used, the TCP implementation SHOULD be identified in the
results. The compliant TCP TTD SHOULD be usable for complete test results. The compliant TCP TTD SHOULD be usable for complete
end-to-end testing through network security elements and SHOULD also end-to-end testing through network security elements and SHOULD
be usable for testing network sections. also be usable for testing network sections.
- More important, the TCP test host MUST be capable to generate - More importantly, the TCP test host MUST be capable of generating
and receive stateful TCP test traffic at the full BB of the NUT. and receiving stateful TCP test traffic at the full BB of the NUT.
Stateful TCP test traffic means that the test host MUST fully Stateful TCP test traffic means that the test host MUST fully
implement a TCP/IP stack; this is generally a comment aimed at implement a TCP/IP stack; this is generally a comment aimed at
dedicated communications test equipments which sometimes "blast" dedicated communications test equipment that sometimes "blasts"
packets with TCP headers. As a general rule of thumb, testing TCP packets with TCP headers. At the time of this publication, testing
Throughput at rates greater than 100 Mbps may require high TCP Throughput at rates greater than 100 Mbps may require high-
performance server hardware or dedicated hardware based test tools. performance server hardware or dedicated hardware-based test tools.
- A compliant TCP Throughput Test Device MUST allow adjusting both - A compliant TCP Throughput Test Device MUST allow adjusting both
Send and Receive Socket Buffer sizes. The Socket Buffers MUST be Send and Receive Socket Buffer sizes. The Socket Buffers MUST be
large enough to fill the BDP. large enough to fill the BDP.
- Measuring RTT and retransmissions per connection will generally - Measuring RTT and retransmissions per connection will generally
require a dedicated communications test instrument. In the absence of require a dedicated communications test instrument. In the absence
dedicated hardware based test tools, these measurements may need to of dedicated hardware-based test tools, these measurements may need
be conducted with packet capture tools, i.e. conduct TCP Throughput to be conducted with packet capture tools, i.e., conduct TCP
tests and analyze RTT and retransmissions in packet captures. Throughput tests and analyze RTT and retransmissions in packet
Another option MAY be to use "TCP Extended Statistics MIB" per captures. Another option MAY be to use the "TCP Extended
[RFC4898]. Statistics MIB" [RFC4898].
- The [RFC4821] PLPMTUD test SHOULD be conducted with a dedicated - The [RFC4821] PLPMTUD test SHOULD be conducted with a dedicated
tester which exposes the ability to run the PLPMTUD algorithm tester that exposes the ability to run the PLPMTUD algorithm
independently from the OS stack. independently from the OS stack.
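As a minimal sketch of the Socket Buffer requirement listed above, the following Python fragment shows how a standard computer acting as a test host might request Send and Receive Socket Buffer sizes through the standard SO_SNDBUF and SO_RCVBUF socket options. The helper name make_test_socket and the requested size are illustrative assumptions and not part of this methodology. The operating system may clamp or adjust the request, so the effective values are read back and should be recorded with the test results.

```python
import socket


def make_test_socket(buffer_bytes):
    """Create a TCP socket and request send/receive buffers of buffer_bytes.

    Hypothetical helper for illustration only. The OS may clamp or adjust
    the requested sizes, so the values actually in effect are read back
    and returned alongside the socket.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    effective_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    effective_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective_snd, effective_rcv


if __name__ == "__main__":
    # 138,000 bytes is an assumed example request, not a recommended value.
    sock, snd, rcv = make_test_socket(138_000)
    print("effective send/receive buffer sizes:", snd, rcv)
    sock.close()
```

If the effective values come back smaller than the BDP, the host limits would need to be raised before the throughput tests are meaningful.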
3.1. Path MTU 3.1. Path MTU
TCP implementations should use Path MTU Discovery techniques (PMTUD). TCP implementations should use Path MTU Discovery techniques (PMTUD).
PMTUD relies on ICMP 'need to frag' messages to learn the path MTU. PMTUD relies on ICMP 'need to frag' messages to learn the path MTU.
When a device has a packet to send which has the Don't Fragment (DF) When a device has a packet to send that has the Don't Fragment (DF)
bit in the IP header set and the packet is larger than the (MTU) of bit in the IP header set and the packet is larger than the MTU of the
the next hop, the packet is dropped and the device sends an ICMP next hop, the packet is dropped, and the device sends an ICMP 'need
'need to frag' message back to the host that originated the packet. to frag' message back to the host that originated the packet. The
The ICMP 'need to frag' message includes the next hop MTU which PMTUD ICMP 'need to frag' message includes the next-hop MTU, which PMTUD
uses to adjust itself. Unfortunately, because many network managers uses to adjust itself. Unfortunately, because many network managers
completely disable ICMP, this technique does not always prove completely disable ICMP, this technique does not always prove
reliable. reliable.
Packetization Layer Path MTU Discovery or PLPMTUD [RFC4821] MUST then Packetization Layer Path MTU Discovery (PLPMTUD) [RFC4821] MUST then
be conducted to verify the network path MTU. PLPMTUD can be used be conducted to verify the network path MTU. PLPMTUD can be used
with or without ICMP. [RFC4821] specifies search_high and search_low with or without ICMP. [RFC4821] specifies search_high and search_low
parameters for the MTU and we recommend to use those. The goal is to parameters for the MTU, and we recommend using those parameters. The
avoid fragmentation during all subsequent tests. goal is to avoid fragmentation during all subsequent tests.
3.2. Round Trip Time (RTT) and Bottleneck Bandwidth (BB) 3.2. Round-Trip Time (RTT) and Bottleneck Bandwidth (BB)
Before stateful TCP testing can begin, it is important to determine Before stateful TCP testing can begin, it is important to determine
the baseline RTT (i.e. non-congested inherent delay) and BB of the the baseline RTT (i.e., non-congested inherent delay) and BB of the
end-to-end network to be tested. These measurements are used to end-to-end network to be tested. These measurements are used to
calculate the BDP and to provide estimates of the TCP RWND and calculate the BDP and to provide estimates of the TCP RWND and Send
Send Socket Buffer Sizes that SHOULD be used in subsequent test Socket Buffer sizes that SHOULD be used in subsequent test steps.
steps.
3.2.1 Measuring RTT 3.2.1. Measuring RTT
As previously defined in Section 1.1, RTT is the elapsed time As previously defined in Section 1.2, RTT is the elapsed time between
between the clocking in of the first bit of a TCP segment sent the clocking in of the first bit of a TCP segment sent and the
and the receipt of the last bit of the corresponding TCP receipt of the last bit of the corresponding TCP Acknowledgment.
Acknowledgment.
The RTT SHOULD be baselined during off-peak hours in order to obtain The RTT SHOULD be baselined during off-peak hours in order to obtain
a reliable figure of the inherent network latency. Otherwise, a reliable figure of the inherent network latency. Otherwise,
additional delay caused by network buffering can occur. Also, when additional delay caused by network buffering can occur. Also, when
sampling RTT values over a given test interval, the minimum sampling RTT values over a given test interval, the minimum measured
measured value SHOULD be used as the baseline RTT. This will most value SHOULD be used as the baseline RTT. This will most closely
closely estimate the real inherent RTT. This value is also used to estimate the real inherent RTT. This value is also used to determine
determine the Buffer Delay Percentage metric defined in Section 4.3. the Buffer Delay Percentage metric defined in Section 4.3.
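The baseline selection described above can be sketched as follows. The function name and the sample values are hypothetical; the only rule taken from the text is that the minimum measured value over the test interval is used as the baseline RTT.

```python
def baseline_rtt_ms(samples_ms):
    """Return the baseline RTT: the minimum RTT sample, in milliseconds."""
    if not samples_ms:
        raise ValueError("at least one RTT sample is required")
    return min(samples_ms)


if __name__ == "__main__":
    # Hypothetical RTT samples (ms) collected over one test interval.
    samples = [25.9, 25.1, 31.4, 25.3, 40.2]
    print("baseline RTT (ms):", baseline_rtt_ms(samples))
```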
The following list is not meant to be exhaustive, although it The following list is not meant to be exhaustive, although it
summarizes some of the most common ways to determine Round Trip Time. summarizes some of the most common ways to determine Round-Trip Time.
The desired measurement precision (i.e. ms versus us) may dictate The desired measurement precision (i.e., ms versus us) may dictate
whether the RTT measurement can be achieved with ICMP pings or by a whether the RTT measurement can be achieved with ICMP pings or by a
dedicated communications test instrument with precision timers. The dedicated communications test instrument with precision timers. The
objective in this section is to list several techniques in order of objective of this section is to list several techniques in order of
decreasing accuracy. decreasing accuracy.
- Use test equipment on each end of the network, "looping" the - Use test equipment on each end of the network, "looping" the far-
far-end tester so that a packet stream can be measured back and forth end tester so that a packet stream can be measured back and forth
from end-to-end. This RTT measurement may be compatible with delay from end to end. This RTT measurement may be compatible with delay
measurement protocols specified in [RFC5357]. measurement protocols specified in [RFC5357].
- Conduct packet captures of TCP test sessions using "iperf" or FTP, - Conduct packet captures of TCP test sessions using "iperf" or FTP,
or other TCP test applications. By running multiple experiments, or other TCP test applications. By running multiple experiments,
packet captures can then be analyzed to estimate RTT. It is packet captures can then be analyzed to estimate RTT. It is
important to note that results based upon the SYN -> SYN-ACK at the important to note that results based upon the SYN -> SYN-ACK at the
beginning of TCP sessions SHOULD be avoided since Firewalls might beginning of TCP sessions SHOULD be avoided, since Firewalls might
slow down 3 way handshakes. Also, at the senders side, Ostermann's slow down 3-way handshakes. Also, at the sender's side,
LINUX TCPTRACE utility with -l -r arguments can be used to extract Ostermann's LINUX TCPTRACE utility with -l -r arguments can be used
the RTT results directly from the packet captures. to extract the RTT results directly from the packet captures.
- Obtain RTT statistics available from MIBs defined in [RFC4898]. - Obtain RTT statistics available from MIBs defined in [RFC4898].
- ICMP pings may also be adequate to provide Round Trip Time - ICMP pings may also be adequate to provide Round-Trip Time
estimates, provided that the packet size is factored into the estimates, provided that the packet size is factored into the
estimates (i.e. pings with different packet sizes might be required). estimates (i.e., pings with different packet sizes might be
Some limitations with ICMP Ping may include ms resolution and required). Some limitations with ICMP ping may include ms
whether the network elements are responding to pings or not. Also, resolution and whether or not the network elements are responding
ICMP is often rate-limited or segregated into different buffer to pings. Also, ICMP is often rate-limited or segregated into
queues. ICMP might not work if QoS (Quality of Service) different buffer queues. ICMP might not work if QoS (Quality of
reclassification is done at any hop. ICMP is not as reliable and Service) reclassification is done at any hop. ICMP is not as
accurate as in-band measurements. reliable and accurate as in-band measurements.
3.2.2 Measuring BB 3.2.2. Measuring BB
Before any TCP Throughput test can be conducted, bandwidth
measurement tests SHOULD be run with stateless IP streams (i.e., not
stateful TCP) in order to determine the BB of the NUT. These
measurements SHOULD be conducted in both directions, especially in
asymmetrical access networks (e.g., Asymmetric Digital Subscriber
Line (ADSL) access). These tests SHOULD be performed at various
intervals throughout a business day or even across a week.
Testing at various time intervals would provide a better Testing at various time intervals would provide a better
characterization of TCP throughput and better diagnosis insight (for characterization of TCP Throughput and better diagnosis insight (for
cases where there are TCP performance issues). The bandwidth tests cases where there are TCP performance issues). The bandwidth tests
SHOULD produce logged outputs of the achieved bandwidths across the SHOULD produce logged outputs of the achieved bandwidths across the
complete test duration. complete test duration.
There are many well established techniques available to provide There are many well-established techniques available to provide
estimated measures of bandwidth over a network. It is a common estimated measures of bandwidth over a network. It is a common
practice for network providers to conduct Layer 2/3 bandwidth practice for network providers to conduct Layer 2/3 bandwidth
capacity tests using [RFC2544], although it is understood that capacity tests using [RFC2544], although it is understood that
[RFC2544] was never meant to be used outside a lab environment. [RFC2544] was never meant to be used outside a lab environment.
These bandwidth measurements SHOULD use network capacity These bandwidth measurements SHOULD use network capacity techniques
techniques as defined in [RFC5136]. as defined in [RFC5136].
3.3. Measuring TCP Throughput 3.3. Measuring TCP Throughput
This methodology specifically defines TCP Throughput measurement This methodology specifically defines TCP Throughput measurement
techniques to verify maximum achievable TCP performance in a managed techniques to verify maximum achievable TCP performance in a managed
business class IP network. business-class IP network.
With baseline measurements of RTT and BB from Section 3.2, a series With baseline measurements of RTT and BB from Section 3.2, a series
of single and / or multiple TCP connection throughput tests SHOULD of single- and/or multiple-TCP-connection throughput tests SHOULD be
be conducted. conducted.
The number of trials and single versus multiple TCP connections The number of trials and the choice between single or multiple TCP
choice will be based on the intention of the test. A single TCP connections will be based on the intention of the test. A single-
connection test might be enough to measure the achievable throughput TCP-connection test might be enough to measure the achievable
of a Metro Ethernet connectivity. Although, it is important to note throughput of Metro Ethernet connectivity. However, it is important
that various traffic management techniques can be used in an IP to note that various traffic management techniques can be used in an
network and that some of those can only be tested with multiple IP network and that some of those techniques can only be tested with
connections. As an example, multiple TCP sessions might be required multiple connections. As an example, multiple TCP sessions might be
to detect traffic shaping versus policing. Multiple sessions might required to detect traffic shaping versus policing. Multiple
also be needed to measure Active Queue Management performances. sessions might also be needed to measure Active Queue Management
However, traffic management testing is not within the scope of this performance. However, traffic management testing is not within the
test methodology. scope of this test methodology.
In all circumstances, it is RECOMMENDED to run the tests in each In all circumstances, it is RECOMMENDED to run the tests in each
direction independently first and then to run in both directions direction independently first and then to run them in both directions
simultaneously. It is also RECOMMENDED to run the tests at simultaneously. It is also RECOMMENDED to run the tests at different
different times of day. times of the day.
In each case, the TCP Transfer Time Ratio, the TCP Efficiency In each case, the TCP Transfer Time Ratio, the TCP Efficiency
Percentage, and the Buffer Delay Percentage MUST be measured in Percentage, and the Buffer Delay Percentage MUST be measured in each
each direction. These 3 metrics are defined in Section 4. direction. These 3 metrics are defined in Section 4.
3.3.1 Minimum TCP RWND 3.3.1. Minimum TCP RWND
The TCP TTD MUST allow the Send Socket Buffer and Receive Window The TCP TTD MUST allow the Send Socket Buffer and Receive Window
sizes to be set higher than the BDP, otherwise TCP performance will sizes to be set higher than the BDP; otherwise, TCP performance will
be limited. In the business customer environment, these settings are be limited. In the business customer environment, these settings are
not generally adjustable by the average user. These settings are not generally adjustable by the average user. These settings are
either hard coded in the application or configured within the OS as either hard-coded in the application or configured within the OS as
part of a corporate image. In many cases, the user's host Send part of a corporate image. In many cases, the user's host Send
Socket Buffer and Receive Window size settings are not optimal. Socket Buffer and Receive Window size settings are not optimal.
This section provides derivations of BDPs under various network This section provides derivations of BDPs under various network
conditions. It also provides examples of achievable TCP Throughput conditions. It also provides examples of achievable TCP Throughput
with various TCP RWND sizes. This provides important guidelines with various TCP RWND sizes. This provides important guidelines
showing what can be achieved with settings higher than the BDP, showing what can be achieved with settings higher than the BDP,
versus what would be achieved in a variety of real world conditions. versus what would be achieved in a variety of real-world conditions.
The minimum required TCP RWND Size can be calculated from the The minimum required TCP RWND size can be calculated from the
Bandwidth Delay Product (BDP), which is: Bandwidth-Delay Product (BDP), which is as follows:
BDP (bits) = RTT (sec) X BB (bps)

Note that the RTT is being used as the "Delay" variable for the BDP.
Then, by dividing the BDP by 8, we obtain the minimum required TCP Then, by dividing the BDP by 8, we obtain the minimum required TCP
RWND Size in Bytes. For optimal results, the Send Socket Buffer RWND size in Bytes. For optimal results, the Send Socket Buffer MUST
MUST be adjusted to the same value at each end of the network. be adjusted to the same value at each end of the network.
Minimum required TCP RWND = BDP / 8 Minimum required TCP RWND = BDP / 8
As an example on a T3 link with 25 ms RTT, the BDP would equal As an example, on a T3 link with 25-ms RTT, the BDP would equal
~1,105,000 bits and the minimum required TCP RWND would be ~138 KB. ~1,105,000 bits, and the minimum required TCP RWND would be ~138 KB.
Note that separate calculations are REQUIRED on asymmetrical paths. Note that separate calculations are REQUIRED on asymmetrical paths.
An asymmetrical path example would be a 90 ms RTT ADSL line with An asymmetrical-path example would be a 90-ms RTT ADSL line with 5
5Mbps downstream and 640Kbps upstream. The downstream BDP would equal Mbps downstream and 640 Kbps upstream. The downstream BDP would
~450,000 bits while the upstream one would be only ~57,600 bits. equal ~450,000 bits, while the upstream one would be only
~57,600 bits.
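The BDP and minimum TCP RWND arithmetic above can be checked with a short sketch. The function names are illustrative assumptions; the RTT is passed in milliseconds and the BB in bits per second so that the examples from the text can be reproduced with exact arithmetic.

```python
def bdp_bits(rtt_ms, bb_bps):
    """Bandwidth-Delay Product in bits: RTT (sec) X BB (bps)."""
    return rtt_ms * bb_bps / 1000


def min_rwnd_bytes(rtt_ms, bb_bps):
    """Minimum required TCP RWND in Bytes: BDP / 8."""
    return bdp_bits(rtt_ms, bb_bps) / 8


if __name__ == "__main__":
    # T3 (44.21 Mbps) with 25-ms RTT
    print(bdp_bits(25, 44_210_000), min_rwnd_bytes(25, 44_210_000))
    # ADSL with 90-ms RTT: 5 Mbps downstream, 640 Kbps upstream
    print(bdp_bits(90, 5_000_000), bdp_bits(90, 640_000))
```

On an asymmetrical path, the two directions are computed separately, as the ADSL lines in the example do.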
The following table provides some representative network link speeds,
RTT, BDP, and their associated minimum required TCP RWND sizes.

Link                                        Minimum Required
Speed*         RTT              BDP             TCP RWND
(Mbps)         (ms)            (bits)           (KBytes)
--------------------------------------------------------------------
     1.536     20.00             30,720              3.84
     1.536     50.00             76,800              9.60
     1.536    100.00            153,600             19.20
    44.210     10.00            442,100             55.26
    44.210     15.00            663,150             82.89
    44.210     25.00          1,105,250            138.16
   100.000      1.00            100,000             12.50
   100.000      2.00            200,000             25.00
   100.000      5.00            500,000             62.50
 1,000.000      0.10            100,000             12.50
 1,000.000      0.50            500,000             62.50
 1,000.000      1.00          1,000,000            125.00
10,000.000      0.05            500,000             62.50
10,000.000      0.30          3,000,000            375.00

* Note that link speed is the BB for the NUT

Table 3.3.1. Link Speed, RTT, Calculated BDP, and Minimum TCP RWND
In the above table, the following serial link speeds are used:

- T1 = 1.536 Mbps (for a B8ZS line encoding facility)
- T3 = 44.21 Mbps (for a C-Bit framing facility)

The previous table illustrates the minimum required TCP RWND. If a
smaller TCP RWND size is used, then the TCP Throughput cannot be
optimal. To calculate the TCP Throughput, the following formula is
used:

TCP Throughput = TCP RWND X 8 / RTT

An example could be a 100-Mbps IP path with 5-ms RTT and a TCP RWND
of 16 KB; then:

TCP Throughput = 16 KBytes X 8 bits / 5 ms
TCP Throughput = 128,000 bits / 0.005 sec
TCP Throughput = 25.6 Mbps

Another example, for a T3 using the same calculation formula, is
illustrated in Figure 3.3.1a:

TCP Throughput = 16 KBytes X 8 bits / 10 ms
TCP Throughput = 128,000 bits / 0.01 sec
TCP Throughput = 12.8 Mbps*

When the TCP RWND size exceeds the BDP (T3 link and 64-KByte TCP RWND
on a 10-ms RTT path), the maximum Frames Per Second (FPS) limit of
3664 is reached, and then the formula is:

TCP Throughput = max FPS X (MTU - 40) X 8
TCP Throughput = 3664 FPS X 1460 Bytes X 8 bits
TCP Throughput = 42.8 Mbps**
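The two formulas above can be combined in a short sketch: the throughput is limited by the window while the TCP RWND is below the BDP, and by the maximum FPS once the TCP RWND exceeds it. The function name is an illustrative assumption, and taking the smaller of the two results is one way of expressing that rule rather than a formula stated in this document. As in the worked examples, 1 KByte is taken as 1000 Bytes.

```python
def tcp_throughput_bps(rwnd_bytes, rtt_ms, max_fps, mtu=1500):
    """Estimate the achievable TCP Throughput in bits per second.

    window_limited: TCP RWND X 8 / RTT
    frame_limited:  max FPS X (MTU - 40) X 8
    The smaller of the two is returned (an assumption of this sketch).
    """
    window_limited = rwnd_bytes * 8 * 1000 / rtt_ms
    frame_limited = max_fps * (mtu - 40) * 8
    return min(window_limited, frame_limited)


if __name__ == "__main__":
    # T3 link (3664 FPS) with 10-ms RTT, 16-KByte and 64-KByte RWND
    print(tcp_throughput_bps(16_000, 10, 3664) / 1e6, "Mbps")
    print(tcp_throughput_bps(64_000, 10, 3664) / 1e6, "Mbps")
```

With the T3 values, the 16-KByte case is window limited (12.8 Mbps) and the 64-KByte case is capped by the 3664 FPS limit (about 42.8 Mbps).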
The following diagram compares achievable TCP Throughputs on a T3 The following diagram compares achievable TCP Throughputs on a T3
with Send Socket Buffer & TCP RWND Sizes of 16KB vs. 64KB. with Send Socket Buffer and TCP RWND sizes of 16 KB versus 64 KB.
          45|
            |           _______**42.8
          40|           |64KB |
TCP         |           |     |
Through-  35|           |     |
put         |           |     |          +-----+34.1
(Mbps)    30|           |     |          |64KB |
            |           |     |          |     |
          25|           |     |          |     |
            |           |     |          |     |
          20|           |     |          |     |          _______20.5
            |           |     |          |     |          |64KB |
          15|           |     |          |     |          |     |
            |*12.8+-----|     |          |     |          |     |
          10|     |16KB |     |          |     |          |     |
            |     |     |     |8.5 +-----|     |          |     |
           5|     |     |     |    |16KB |     |5.1 +-----|     |
            |_____|_____|_____|____|_____|_____|____|16KB |_____|____
                       10               15               25
                              RTT (milliseconds)

Figure 3.3.1a. TCP Throughputs on a T3 at Different RTTs
The following diagram shows the achievable TCP Throughput on a 25 ms The following diagram shows the achievable TCP Throughput on a 25-ms
T3 when Send Socket Buffer & TCP RWND Sizes are increased. T3 when Send Socket Buffer and TCP RWND sizes are increased.
          45|
            |
          40|                                            +-----+40.9
TCP         |                                            |     |
Through-  35|                                            |     |
put         |                                            |     |
(Mbps)    30|                                            |     |
            |                                            |     |
          25|                                            |     |
            |                                            |     |
          20|                               +-----+20.5  |     |
            |                               |     |      |     |
          15|                               |     |      |     |
            |                               |     |      |     |
          10|                  +-----+10.2  |     |      |     |
            |                  |     |      |     |      |     |
           5|     +-----+5.1   |     |      |     |      |     |
            |_____|_____|______|_____|______|_____|______|_____|_____
                    16           32           64          128*
                            TCP RWND Size (KBytes)

* Note that 128 KB requires the [RFC1323] TCP Window Scale option.

Figure 3.3.1b. TCP Throughputs on a T3 with Different TCP RWND
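The note on the 128-KB window follows from the 16-bit TCP window field, which cannot advertise more than 65,535 Bytes without the [RFC1323] Window Scale option. The sketch below, with a hypothetical function name, returns the smallest shift count needed for a given TCP RWND. It assumes 1 KByte = 1000 Bytes, as in the worked examples of this section.

```python
MAX_UNSCALED_WINDOW = 65_535  # largest value of the 16-bit TCP window field
MAX_SHIFT = 14                # largest shift count permitted by RFC 1323


def window_scale_shift(rwnd_bytes):
    """Return the smallest Window Scale shift able to advertise rwnd_bytes.

    A result of 0 means the window fits without the Window Scale option.
    """
    for shift in range(MAX_SHIFT + 1):
        if rwnd_bytes <= MAX_UNSCALED_WINDOW << shift:
            return shift
    raise ValueError("window too large for the TCP Window Scale option")


if __name__ == "__main__":
    for kbytes in (16, 32, 64, 128):
        print(kbytes, "KB -> shift", window_scale_shift(kbytes * 1000))
```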
4. TCP Metrics 4. TCP Metrics
This methodology focuses on a TCP Throughput and provides 3 basic This methodology focuses on a TCP Throughput and provides 3 basic
metrics that can be used for better understanding of the results. metrics that can be used for better understanding of the results. It
It is recognized that the complexity and unpredictability of TCP is recognized that the complexity and unpredictability of TCP makes
makes it very difficult to develop a complete set of metrics that it very difficult to develop a complete set of metrics that accounts
accounts for the myriad of variables (i.e. RTT variations, loss for the myriad of variables (i.e., RTT variations, loss conditions,
conditions, TCP implementations, etc.). However, these 3 metrics TCP implementations, etc.). However, these 3 metrics facilitate TCP
facilitate TCP Throughput comparisons under varying network Throughput comparisons under varying network conditions and host
conditions and host buffer size / RWND settings. buffer size/RWND settings.
4.1 Transfer Time Ratio 4.1. Transfer Time Ratio
The first metric is the TCP Transfer Time Ratio, which is simply the The first metric is the TCP Transfer Time Ratio, which is simply the
ratio between the Actual versus the Ideal TCP Transfer Times. ratio between the Actual TCP Transfer Time versus the Ideal TCP
Transfer Time.
The Actual TCP Transfer Time, is simply the time it takes to transfer The Actual TCP Transfer Time is simply the time it takes to transfer
a block of data across TCP connection(s). a block of data across TCP connection(s).
The Ideal TCP Transfer Time is the predicted time for which a block The Ideal TCP Transfer Time is the predicted time for which a block
of data SHOULD transfer across TCP connection(s) considering the BB of data SHOULD transfer across TCP connection(s), considering the BB
of the NUT. of the NUT.
Actual TCP Transfer Time Actual TCP Transfer Time
TCP Transfer Time Ratio = ------------------------- TCP Transfer Time Ratio = -------------------------
Ideal TCP Transfer Time Ideal TCP Transfer Time
The Ideal TCP Transfer Time is derived from the Maximum Achievable The Ideal TCP Transfer Time is derived from the Maximum Achievable
TCP Throughput, which is related to the BB and Layer 1/2/3/4 TCP Throughput, which is related to the BB and Layer 1/2/3/4
overheads associated with the network path. The following sections overheads associated with the network path. The following sections
provide derivations for the Maximum Achievable TCP Throughput and provide derivations for the Maximum Achievable TCP Throughput and
example calculations for the TCP Transfer Time Ratio. example calculations for the TCP Transfer Time Ratio.
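As a minimal sketch of the ratio defined above, the following fragment derives an Ideal TCP Transfer Time from a block size and a Maximum Achievable TCP Throughput, then divides a measured Actual TCP Transfer Time by it. The function names, the block size, the throughput, and the measured time are all hypothetical values chosen for illustration.

```python
def ideal_transfer_time_sec(data_bytes, max_tcp_throughput_bps):
    """Ideal TCP Transfer Time for a block of data, in seconds."""
    return data_bytes * 8 / max_tcp_throughput_bps


def transfer_time_ratio(actual_sec, ideal_sec):
    """TCP Transfer Time Ratio = Actual / Ideal TCP Transfer Time."""
    return actual_sec / ideal_sec


if __name__ == "__main__":
    # Hypothetical: 100,000,000 Bytes at 40 Mbps, measured in 30 seconds.
    ideal = ideal_transfer_time_sec(100_000_000, 40_000_000)
    print("ideal (sec):", ideal)
    print("ratio:", transfer_time_ratio(30.0, ideal))
```

A ratio of 1.0 would mean that the transfer completed in the ideal time; larger values indicate a slower transfer.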
4.1.1. Maximum Achievable TCP Throughput Calculation

This section provides formulas to calculate the Maximum Achievable
TCP Throughput, with examples for T3 (44.21 Mbps) and Ethernet.

All calculations are based on IP version 4 with TCP/IP headers of 20
Bytes each (20 for TCP + 20 for IP) within an MTU of 1500 Bytes.

First, the maximum achievable Layer 2 throughput of a T3 interface is
limited by the maximum quantity of Frames Per Second (FPS) permitted
by the actual physical layer (Layer 1) speed.

The calculation formula is:

   FPS = T3 Physical Speed / ((MTU + PPP + Flags + CRC16) X 8)

   FPS = 44.21 Mbps / ((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8)
   FPS = 44.21 Mbps / (1508 Bytes X 8)
   FPS = 44.21 Mbps / 12064 bits
   FPS = 3664

Then, to obtain the Maximum Achievable TCP Throughput (Layer 4), we
simply use:

   (MTU - 40) in Bytes X 8 bits X max FPS

For a T3, the maximum TCP Throughput =

   1460 Bytes X 8 bits X 3664 FPS

   Maximum TCP Throughput = 11680 bits X 3664 FPS
   Maximum TCP Throughput = 42.8 Mbps

On Ethernet, the maximum achievable Layer 2 throughput is limited by
the maximum Frames Per Second permitted by the IEEE802.3 standard.

The maximum FPS for 100-Mbps Ethernet is 8127, and the calculation
formula is:

   FPS = (100 Mbps / (1538 Bytes X 8 bits))

The maximum FPS for GigE is 81274, and the calculation formula is:

   FPS = (1 Gbps / (1538 Bytes X 8 bits))

The maximum FPS for 10GigE is 812743, and the calculation formula is:

   FPS = (10 Gbps / (1538 Bytes X 8 bits))

The 1538 Bytes equates to:

   MTU + Ethernet + CRC32 + IFG + Preamble + SFD
   (IFG = Inter-Frame Gap and SFD = Start of Frame Delimiter)

where MTU is 1500 Bytes, Ethernet is 14 Bytes, CRC32 is 4 Bytes, IFG
is 12 Bytes, Preamble is 7 Bytes, and SFD is 1 Byte.

Then, to obtain the Maximum Achievable TCP Throughput (Layer 4), we
simply use:

   (MTU - 40) in Bytes X 8 bits X max FPS

For 100-Mbps Ethernet, the maximum TCP Throughput =

   1460 Bytes X 8 bits X 8127 FPS

   Maximum TCP Throughput = 11680 bits X 8127 FPS
   Maximum TCP Throughput = 94.9 Mbps

It is important to note that better results could be obtained with
jumbo frames on Gigabit and 10-Gigabit Ethernet interfaces.

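The FPS and Layer 4 arithmetic in this section can be sketched in
Python. This is a minimal illustration, not part of the framework:
the function and constant names are assumptions introduced here, and
FPS is truncated to whole frames to match the figures quoted in the
text.

```python
# Illustrative sketch only; all names are hypothetical.
# Overhead values are the ones itemized in this section.

IP_TCP_HEADERS = 40                      # Bytes: 20 IPv4 + 20 TCP, no options
T3_PPP_OVERHEAD = 4 + 2 + 2              # PPP + Flags + CRC16
ETHERNET_OVERHEAD = 14 + 4 + 12 + 7 + 1  # Ethernet + CRC32 + IFG + Preamble + SFD


def max_fps(line_rate_bps, mtu=1500, l2_overhead=ETHERNET_OVERHEAD):
    """Whole frames per second that fit within the Layer 1 line rate."""
    bits_per_frame = (mtu + l2_overhead) * 8
    return int(line_rate_bps // bits_per_frame)


def max_tcp_throughput_bps(line_rate_bps, mtu=1500,
                           l2_overhead=ETHERNET_OVERHEAD):
    """Maximum Achievable TCP Throughput: (MTU - 40) X 8 bits X max FPS."""
    payload_bits = (mtu - IP_TCP_HEADERS) * 8
    return payload_bits * max_fps(line_rate_bps, mtu, l2_overhead)


if __name__ == "__main__":
    links = [
        ("T3", 44.21e6, T3_PPP_OVERHEAD),
        ("100-Mbps Ethernet", 100e6, ETHERNET_OVERHEAD),
        ("GigE", 1e9, ETHERNET_OVERHEAD),
        ("10GigE", 10e9, ETHERNET_OVERHEAD),
    ]
    for name, rate, overhead in links:
        fps = max_fps(rate, l2_overhead=overhead)
        mbps = max_tcp_throughput_bps(rate, l2_overhead=overhead) / 1e6
        print(f"{name}: {fps} FPS, {mbps:.1f} Mbps")
```

Passing a larger mtu value (for example, a jumbo-frame MTU supported
on the path) shows how the fixed per-frame overhead shrinks relative
to the payload.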
4.1.2. TCP Transfer Time and Transfer Time Ratio Calculation
The following table illustrates the Ideal TCP Transfer Time of a
single TCP connection when its TCP RWND and Send Socket Buffer sizes
equal or exceed the BDP.

   Link                              Maximum            Ideal TCP
   Speed                  BDP        Achievable TCP     Transfer Time
   (Mbps)      RTT (ms)   (KBytes)   Throughput(Mbps)   (seconds)*
   --------------------------------------------------------------------
        1.536    50.00        9.6            1.4            571.0
       44.210    25.00      138.2           42.8             18.0
      100.000     2.00       25.0           94.9              9.0
    1,000.000     1.00      125.0          949.2              1.0
   10,000.000     0.05       62.5        9,492.0              0.1

   * Transfer times are rounded for simplicity.

      Table 4.1.2. Link Speed, RTT, BDP, TCP Throughput, and
              Ideal TCP Transfer Time for a 100-MB File

For a 100-MB file (100 X 8 = 800 Mbits), the Ideal TCP Transfer Time
is derived as follows:

                                         800 Mbits
   Ideal TCP Transfer Time = -----------------------------------
                              Maximum Achievable TCP Throughput

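The BDP and Ideal TCP Transfer Time columns can be reproduced with a
short Python sketch. The function names are assumptions introduced
for illustration, and 1 KByte is taken as 1000 Bytes, which matches
the BDP column. Because the table rounds transfer times for
simplicity, computed values may differ slightly from the printed
ones.

```python
# Illustrative sketch only; function names are hypothetical.

def bdp_kbytes(link_speed_mbps, rtt_ms):
    """Bandwidth-Delay Product in KBytes (1 KByte = 1000 Bytes)."""
    bits_in_flight = link_speed_mbps * 1e6 * (rtt_ms / 1e3)
    return bits_in_flight / 8 / 1e3


def ideal_transfer_time_s(file_mbytes, max_tcp_throughput_mbps):
    """Ideal TCP Transfer Time: file size in Mbits over Layer 4 throughput."""
    return (file_mbytes * 8) / max_tcp_throughput_mbps


if __name__ == "__main__":
    # (link speed in Mbps, RTT in ms, Maximum Achievable TCP Throughput in Mbps)
    rows = [(1.536, 50.0, 1.4), (44.21, 25.0, 42.8), (100.0, 2.0, 94.9)]
    for speed, rtt, throughput in rows:
        print(speed, round(bdp_kbytes(speed, rtt), 1),
              round(ideal_transfer_time_s(100, throughput), 1))
```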
To illustrate the TCP Transfer Time Ratio, an example would be the
bulk transfer of 100 MB over 5 simultaneous TCP connections (each
connection transferring 100 MB). In this example, the Ethernet
service provides a Committed Access Rate (CAR) of 500 Mbps. Each
connection may achieve different throughputs during a test, and the
overall throughput rate is not always easy to determine (especially
as the number of connections increases).

The Ideal TCP Transfer Time would be ~8 seconds, but in this example,
the Actual TCP Transfer Time was 12 seconds. The TCP Transfer Time
Ratio would then be 12/8 = 1.5, which indicates that the transfer
across all connections took 1.5 times longer than the ideal.

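The multi-connection example can be sketched as follows. The names
are hypothetical, and the ideal time is approximated from the CAR
alone (ignoring header overhead), which is how the ~8 seconds figure
is reached.

```python
# Illustrative sketch only; names are hypothetical.

def ideal_time_from_car_s(connections, mbytes_per_connection, car_mbps):
    """Approximate Ideal TCP Transfer Time when all connections share the CAR."""
    total_mbits = connections * mbytes_per_connection * 8
    return total_mbits / car_mbps


def tcp_transfer_time_ratio(actual_s, ideal_s):
    """Actual TCP Transfer Time divided by Ideal TCP Transfer Time."""
    if ideal_s <= 0:
        raise ValueError("ideal transfer time must be positive")
    return actual_s / ideal_s


if __name__ == "__main__":
    ideal = ideal_time_from_car_s(5, 100, 500)       # 5 x 100 MB over 500 Mbps
    print(ideal, tcp_transfer_time_ratio(12, ideal))  # measured time: 12 s
```

A ratio of 1.0 means the transfer matched the ideal; larger values
mean it took proportionally longer.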
4.2. TCP Efficiency

The second metric represents the percentage of Bytes that were not
retransmitted.

                      Transmitted Bytes - Retransmitted Bytes
   TCP Efficiency % = --------------------------------------- X 100
                                 Transmitted Bytes

Transmitted Bytes are the total number of TCP Bytes to be
transmitted, including the original and the retransmitted Bytes.

4.2.1. TCP Efficiency Percentage Calculation

As an example, if 100,000 Bytes were sent and 2,000 had to be
retransmitted (102,000 Transmitted Bytes in total), the TCP
Efficiency Percentage would be calculated as:

                      102,000 - 2,000
   TCP Efficiency % = ----------------- X 100 = 98.04%
                          102,000

Note that a given Byte may be retransmitted more than once; if so,
each of these retransmissions is added to both the Retransmitted
Bytes and the Transmitted Bytes counts.

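A minimal Python sketch of the TCP Efficiency calculation follows.
The function name and argument names are assumptions made for
illustration; the counters themselves would come from the test
device.

```python
# Illustrative sketch only; names are hypothetical.

def tcp_efficiency_pct(transmitted_bytes, retransmitted_bytes):
    """Percentage of Transmitted Bytes that were not retransmissions.

    transmitted_bytes includes the original and the retransmitted Bytes.
    """
    if transmitted_bytes <= 0:
        raise ValueError("transmitted_bytes must be positive")
    if not 0 <= retransmitted_bytes <= transmitted_bytes:
        raise ValueError("retransmitted_bytes out of range")
    return (transmitted_bytes - retransmitted_bytes) / transmitted_bytes * 100


if __name__ == "__main__":
    original, retransmitted = 100_000, 2_000
    print(round(tcp_efficiency_pct(original + retransmitted, retransmitted), 2))
```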
4.3. Buffer Delay

The third metric is the Buffer Delay Percentage, which represents the
increase in RTT during a TCP Throughput test versus the inherent or
baseline RTT. The baseline RTT is the Round-Trip Time inherent to
the network path under non-congested conditions as defined in
Section 3.2.1. The average RTT is derived from the total of all
RTTs measured once per second during the actual test, divided by the
test duration in seconds.

                                  Total RTTs during transfer
   Average RTT during transfer = -----------------------------
                                 Transfer duration in seconds

                    Average RTT during transfer - Baseline RTT
   Buffer Delay % = ------------------------------------------ X 100
                                   Baseline RTT

4.3.1. Buffer Delay Percentage Calculation

As an example, consider a network path with a baseline RTT of 25 ms.
During the course of a TCP transfer, the average RTT across the
entire transfer increases to 32 ms. Then, the Buffer Delay
Percentage would be calculated as:

                    32 - 25
   Buffer Delay % = ------- X 100 = 28%
                      25

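The two Buffer Delay formulas can be sketched in Python. The names
are hypothetical, and the sketch assumes exactly one RTT sample per
second, so that the number of samples equals the transfer duration in
seconds.

```python
# Illustrative sketch only; names are hypothetical.

def average_rtt_ms(rtt_samples_ms):
    """Average RTT, assuming one sample was taken per second of the transfer."""
    if not rtt_samples_ms:
        raise ValueError("at least one RTT sample is required")
    return sum(rtt_samples_ms) / len(rtt_samples_ms)


def buffer_delay_pct(avg_rtt_ms, baseline_rtt_ms):
    """Increase of the average RTT over the baseline RTT, in percent."""
    if baseline_rtt_ms <= 0:
        raise ValueError("baseline RTT must be positive")
    return (avg_rtt_ms - baseline_rtt_ms) * 100 / baseline_rtt_ms


if __name__ == "__main__":
    samples = [30, 34, 32, 31, 33]  # made-up per-second RTTs in ms
    print(buffer_delay_pct(average_rtt_ms(samples), 25))
```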
Note that the TCP Transfer Time Ratio, TCP Efficiency Percentage, and
the Buffer Delay Percentage MUST all be measured during each
throughput test. A poor TCP Transfer Time Ratio (i.e., Actual TCP
Transfer Time greater than the Ideal TCP Transfer Time) may be
diagnosed by correlating with sub-optimal TCP Efficiency Percentage
and/or Buffer Delay Percentage metrics.

5. Conducting TCP Throughput Tests

Several TCP tools are currently used in the network world, and one of
the most common is "iperf". With this tool, hosts are installed at
each end of the network path; one acts as a client and the other as a
server. The Send Socket Buffer and the TCP RWND sizes of both client
and server can be manually set. The achieved throughput can then be
measured, either uni-directionally or bi-directionally. For higher-
BDP situations in lossy networks (Long Fat Networks (LFNs) or
satellite links, etc.), TCP options such as Selective Acknowledgment
SHOULD become part of the window size/throughput characterization.

Host hardware performance must be well understood before conducting
the tests described in the following sections. A dedicated
communications test instrument will generally be REQUIRED, especially
for line rates of GigE and 10 GigE. A compliant TCP TTD SHOULD
provide a warning message when the expected test throughput will
exceed the subscribed customer SLA. If the throughput test is
expected to exceed the subscribed customer SLA, then the test SHOULD
be coordinated with the network provider.

The TCP Throughput test SHOULD be run over a long enough duration to
properly exercise network buffers (i.e., greater than 30 seconds) and
SHOULD also characterize performance at different times of the day.

5.1. Single versus Multiple TCP Connections

The decision whether to conduct single- or multiple-TCP-connection
tests depends upon the size of the BDP in relation to the TCP RWND
configured in the end-user environment. For example, if the BDP for
a Long Fat Network (LFN) turns out to be 2 MB, then it is probably
more realistic to test this network path with multiple connections.
Assuming typical host TCP RWND sizes of 64 KB (e.g., Windows XP),
using 32 TCP connections would emulate a small-office scenario.

The following table is provided to illustrate the relationship
between the TCP RWND and the number of TCP connections required to
fill the available capacity of a given BDP. For this example, the
network bandwidth is 500 Mbps and the RTT is 5 ms; then, the BDP
equates to 312.5 KBytes.

                  Number of TCP Connections
      TCP RWND    to fill available bandwidth
      --------------------------------------
        16 KB               20
        32 KB               10
        64 KB                5
       128 KB                3

      Table 5.1. Number of TCP Connections versus TCP RWND

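The connection counts in the table follow from dividing the BDP by
the TCP RWND and rounding up. A minimal Python sketch, with a
hypothetical function name and with 1 KB taken as 1000 Bytes (an
assumption; using 1024 gives the same counts for these rows):

```python
# Illustrative sketch only; names are hypothetical.
import math


def connections_to_fill(bandwidth_bps, rtt_s, rwnd_bytes):
    """Number of TCP connections whose combined RWND covers the BDP."""
    bdp_bytes = bandwidth_bps * rtt_s / 8
    return math.ceil(bdp_bytes / rwnd_bytes)


if __name__ == "__main__":
    for rwnd_kb in (16, 32, 64, 128):
        count = connections_to_fill(500e6, 0.005, rwnd_kb * 1000)
        print(f"{rwnd_kb} KB -> {count} connections")
```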
The TCP Transfer Time Ratio metric is useful when conducting
multiple-connection tests. Each connection SHOULD be configured to
transfer payloads of the same size (e.g., 100 MB); then, the TCP
Transfer Time Ratio provides a simple metric to verify the actual
versus expected results.

Note that the TCP transfer time is the time required for each
connection to complete the transfer of the predetermined payload
size. From the previous table, the 64-KB window is considered. Each
of the 5 TCP connections would be configured to transfer 100 MB, and
each one should obtain a maximum of 100 Mbps. So for this example,
the 100-MB payload should be transferred across the connections in
approximately 8 seconds (which would be the Ideal TCP Transfer Time
under these conditions).

Additionally, the TCP Efficiency Percentage metric MUST be computed
for each connection as defined in Section 4.2.

5.2. Results Interpretation

At the end, a TCP Throughput Test Device (TCP TTD) SHOULD generate a
report with the calculated BDP and a set of Window size experiments.
Window size refers to the minimum of the Send Socket Buffer and TCP
RWND. The report SHOULD include TCP Throughput results for each TCP
Window size tested. The goal is to provide achievable versus actual
TCP Throughput results with respect to the TCP Window size when no
fragmentation occurs. The report SHOULD also include the results for
the 3 metrics defined in Section 4. The goal is to provide a clear
relationship between these 3 metrics and user experience. As an
example, for the same results in regard to Transfer Time Ratio, a
better TCP Efficiency could be obtained at the cost of higher Buffer
Delays.

For cases where the test results are not equal to the ideal values,
some possible causes are as follows:

- Network congestion causing packet loss, which may be inferred from
  a poor TCP Efficiency % (i.e., higher TCP Efficiency % = less
  packet loss).

- Network congestion causing an increase in RTT, which may be
  inferred from the Buffer Delay Percentage (i.e., 0% = no increase
  in RTT over baseline).

- Intermediate network devices that actively regenerate the TCP
  connection and can alter TCP RWND size, MTU, etc.

- Rate limiting by policing instead of shaping.

- Maximum TCP Buffer Space. All operating systems have a global
  mechanism to limit the quantity of system memory to be used by TCP
  connections. On some systems, each connection is subject to a
  memory limit that is applied to the total memory used for input
  data, output data, and controls. On other systems, there are
  separate limits for input and output buffer spaces per connection.
  Client/server IP hosts might be configured with Maximum TCP Buffer
  Space limits that are far too small for high-performance networks.

- Socket Buffer sizes. Most operating systems support separate
  per-connection send and receive buffer limits that can be adjusted
  as long as they stay within the maximum memory limits. These
  socket buffers MUST be large enough to hold a full BDP of TCP Bytes
  plus some overhead. There are several methods that can be used to
  adjust Socket Buffer sizes, but TCP Auto-Tuning automatically
  adjusts these as needed to optimally balance TCP performance and
  memory usage.

  It is important to note that Auto-Tuning is enabled by default in
  LINUX since kernel release 2.6.6 and in UNIX since FreeBSD 7.0. It
  is also enabled by default in Windows since Vista and in Mac since
  OS X version 10.5 (Leopard). Over-buffering can cause some
  applications to behave poorly, typically causing sluggish
  interactive response and introducing the risk of running the system
  out of memory. Large default socket buffers have to be considered
  carefully on multi-user systems.

- TCP Window Scale option [RFC1323]. This option enables TCP to
  support large BDP paths. It provides a scale factor that is
  required for TCP to support window sizes larger than 64 KB. Most
  systems automatically request WSCALE under some conditions, such as
  when the Receive Socket Buffer is larger than 64 KB or when the
  other end of the TCP connection requests it first. WSCALE can only
  be negotiated during the 3-way handshake. If either end fails to
  request WSCALE or requests an insufficient value, it cannot be
  renegotiated. Different systems use different algorithms to select
  WSCALE, but it is very important to have large enough buffer sizes.
  Note that under these constraints, a client application wishing to
  send data at high rates may need to set its own receive buffer to
  something larger than 64 KBytes before it opens the connection, to
  ensure that the server properly negotiates WSCALE. A system
  administrator might have to explicitly enable [RFC1323] extensions.
  Otherwise, the client/server IP host would not support TCP Window
  sizes (BDP) larger than 64 KB. Most of the time, performance gains
  will be obtained by enabling this option in LFNs.

- TCP Timestamps option [RFC1323]. This feature provides better
  measurements of the Round-Trip Time and protects TCP from data
  corruption that might occur if packets are delivered so late that
  the sequence numbers wrap before they are delivered. Wrapped
  sequence numbers do not pose a serious risk below 100 Mbps, but the
  risk increases at higher data rates. Most of the time, performance
  gains will be obtained by enabling this option in Gigabit-bandwidth
  networks.

- TCP Selective Acknowledgments (SACK) option [RFC2018]. This allows
  a TCP receiver to inform the sender about exactly which data
  segment is missing and needs to be retransmitted. Without SACK,
  TCP has to estimate which data segment is missing, which works just
  fine if all losses are isolated (i.e., only one loss in any given
  round trip). Without SACK, TCP takes a very long time to recover
  after multiple and consecutive losses. SACK is now supported by
  most operating systems, but it may have to be explicitly enabled by
  the system administrator. In networks with unknown load and error
  patterns, TCP SACK will improve throughput performance. On the
  other hand, security appliance vendors might have implemented TCP
  randomization without considering TCP SACK, and under such
  circumstances, SACK might need to be disabled in the client/server
  IP hosts until the vendor corrects the issue. Also, poorly
  implemented SACK algorithms might cause extreme CPU loads and might
  need to be disabled.

- Path MTU. The client/server IP host system SHOULD use the largest
  possible MTU for the path. This may require enabling Path MTU
  Discovery [RFC1191] and [RFC4821]. Since [RFC1191] is flawed, Path
  MTU Discovery is sometimes not enabled by default and may need to
  be explicitly enabled by the system administrator. [RFC4821]
  describes a new, more robust algorithm for MTU discovery and ICMP
  black hole recovery.

- TOE (TCP Offload Engine). Some recent Network Interface Cards
  (NICs) are equipped with drivers that can do part or all of the
  TCP/IP protocol processing. TOE implementations require additional
  work (i.e., hardware-specific socket manipulation) to set up and
  tear down connections. Because TOE NIC configuration parameters
  are vendor-specific and not necessarily RFC-compliant, they are
  poorly integrated with UNIX and LINUX. Occasionally, TOE might
  need to be disabled in a server because its NIC does not have
  enough memory resources to buffer thousands of connections.

Note that both ends of a TCP connection MUST be properly tuned.

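Two of the tuning points above lend themselves to quick arithmetic:
the Window Scale shift needed to cover a given window, and how fast
the 32-bit sequence space wraps at a given rate. The Python sketch
below is illustrative only. The function names are assumptions; the
65535-Byte unscaled window and the maximum shift of 14 are taken from
[RFC1323].

```python
# Illustrative sketch only; function names are hypothetical.

MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field
MAX_WSCALE_SHIFT = 14        # upper bound on the shift count in RFC 1323


def min_window_scale(window_bytes):
    """Smallest WSCALE shift whose scaled window can cover window_bytes."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < window_bytes:
        shift += 1
        if shift > MAX_WSCALE_SHIFT:
            raise ValueError("window exceeds what Window Scale can express")
    return shift


def sequence_wrap_time_s(rate_bps):
    """Seconds for a sender at rate_bps to cycle the 32-bit sequence space."""
    return (2 ** 32) * 8 / rate_bps


if __name__ == "__main__":
    bdp_bytes = 500e6 * 0.005 / 8  # 500 Mbps at 5 ms RTT, as in Section 5.1
    print("WSCALE shift for BDP:", min_window_scale(bdp_bytes))
    for rate in (100e6, 1e9, 10e9):
        print(f"{rate / 1e6:.0f} Mbps wraps in {sequence_wrap_time_s(rate):.1f} s")
```

A shift of 0 means the window fits without the Window Scale option;
shorter wrap times at higher rates are why the Timestamps option
matters more on Gigabit-bandwidth networks.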
6. Security Considerations

Measuring TCP network performance raises security concerns. Metrics
produced within this framework may create security issues.

6.1. Denial-of-Service Attacks

TCP network performance metrics, as defined in this document, attempt
to fill the NUT with a stateful connection. However, since the test
MAY use stateless IP streams as specified in Section 3.2.2, it might
appear to network operators to be a denial-of-service attack. Thus,
as mentioned at the beginning of Section 3, TCP Throughput testing
may require cooperation between the end-user customer and the network
provider.

6.2. User Data Confidentiality

Metrics within this framework generate packets from a sample, rather
than taking samples based on user data. Thus, our framework does not
threaten user data confidentiality.

6.3. Interference with Metrics

The security considerations that apply to any active measurement of
live networks are relevant here as well. See [RFC4656] and
[RFC5357].

7. IANA Considerations 7. Acknowledgments
This document does not REQUIRE an IANA registration for ports
dedicated to the TCP testing described in this document.
8. Acknowledgments
Thanks to Lars Eggert, Al Morton, Matt Mathis, Matt Zekauskas,
Yaakov Stein, and Loki Jorgenson for many good comments and for
pointing us to great sources of information pertaining to past works
in the TCP capacity area.
9. References
9.1 Normative References Thanks to Lars Eggert, Al Morton, Matt Mathis, Matt Zekauskas, Yaakov
Stein, and Loki Jorgenson for many good comments and for pointing us
to great sources of information pertaining to past works in the TCP
capacity area.
[RFC1191] Mogul, A., Deering, S., "Path MTU Discovery", 1990 8. Normative References
[RFC1323] Jacobson, V., Braden, R., Borman D., "TCP Extensions for [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
High Performance", May 1992 November 1990.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., Romanow, A., "TCP [RFC1323] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions
Selective Acknowledgment Options", 1996 for High Performance", RFC 1323, May 1992.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
Requirement Levels", BCP 14, RFC 2119, March 1997. Selective Acknowledgment Options", RFC 2018,
October 1996.
[RFC2544] Bradner, S., McQuaid, J., "Benchmarking Methodology for [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Network Interconnect Devices", RFC 2544, June 1999 Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
Zekauskas, "A One-way Active Measurement Protocol Network Interconnect Devices", RFC 2544, March 1999.
(OWAMP)", RFC 4656, September 2006.
[RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
Discovery", RFC 4821, June 2007 Zekauskas, "A One-way Active Measurement Protocol
(OWAMP)", RFC 4656, September 2006.
[RFC4898] Mathis, M., Heffner, J., Raghunarayan, R., "TCP Extended [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU
Statistics MIB", May 2007 Discovery", RFC 4821, March 2007.
[RFC5136] Chimento P., Ishac, J., "Defining Network Capacity", [RFC4898] Mathis, M., Heffner, J., and R. Raghunarayan, "TCP
February 2008 Extended Statistics MIB", RFC 4898, May 2007.
[RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., Babiarz, [RFC5136] Chimento, P. and J. Ishac, "Defining Network Capacity",
J., "A Two-Way Active Measurement Protocol (TWAMP)", RFC 5136, February 2008.
RFC 5357, October 2008
draft-ietf-ippm-btc-cap-00.txt Allman, M., "A Bulk [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
Transfer Capacity Methodology for Cooperating Hosts", Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
August 2001 RFC 5357, October 2008.
9.2. Informative References
Authors' Addresses Authors' Addresses
Barry Constantine Barry Constantine
JDSU, Test and Measurement Division JDSU, Test and Measurement Division
One Milesone Center Court One Milesone Center Court
Germantown, MD 20876-7100 Germantown, MD 20876-7100
USA USA
Phone: +1 240 404 2227 Phone: +1 240 404 2227
barry.constantine@jdsu.com EMail: barry.constantine@jdsu.com
Gilles Forget Gilles Forget
Independent Consultant to Bell Canada. Independent Consultant to Bell Canada
308, rue de Monaco, St-Eustache 308, rue de Monaco, St-Eustache
Qc. CANADA, Postal Code: J7P-4T5 Qc. J7P-4T5 CANADA
Phone: (514) 895-8212 Phone: (514) 895-8212
gilles.forget@sympatico.ca EMail: gilles.forget@sympatico.ca
Ruediger Geib Ruediger Geib
Heinrich-Hertz-Strasse (Number: 3-7) Heinrich-Hertz-Strasse 3-7
Darmstadt, Germany, 64295 Darmstadt, 64295 Germany
Phone: +49 6151 6282747 Phone: +49 6151 5812747
Ruediger.Geib@telekom.de EMail: Ruediger.Geib@telekom.de
Reinhard Schrage Reinhard Schrage
Osterende 7
Seelze, 30926
Germany
Schrage Consulting Schrage Consulting
Phone: +49 (0) 5137 909540 Phone: +49 (0) 5137 909540
reinhard@schrageconsult.com EMail: reinhard@schrageconsult.com
 End of changes. 231 change blocks. 
719 lines changed or deleted 756 lines changed or added
