Network Working Group                                    B. Constantine
Internet-Draft                                                      JDSU
Intended status: Informational                                 G. Forget
Expires: July 31, 2011                     Bell Canada (Ext. Consultant)
                                                            Rudiger Geib
                                                        Deutsche Telekom
                                                        Reinhard Schrage
                                                      Schrage Consulting
                                                        January 31, 2011

                  Framework for TCP Throughput Testing
                draft-ietf-ippm-tcp-throughput-tm-11.txt

Abstract

This framework describes a practical methodology for measuring end-
to-end TCP throughput in a managed IP network. The goal is to provide
a better indication of the user experience. In this framework, TCP
and IP parameters are specified and should be configured as
recommended.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 31, 2011.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.

Table of Contents

1. Introduction
   1.1 Terminology
   1.2 Test Set-up
2. Scope and Goals of this Methodology
   2.1 TCP Equilibrium
3. TCP Throughput Testing Methodology
   3.1 Determine Network Path MTU
   3.2 Baseline Round Trip Time and Bandwidth
       3.2.1 Techniques to Measure Round Trip Time
       3.2.2 Techniques to Measure end-to-end Bandwidth
   3.3 TCP Throughput Tests
       3.3.1 Calculate minimum required TCP RWND Size
       3.3.2 Metrics for TCP Throughput Tests
       3.3.3 Conducting the TCP Throughput Tests
       3.3.4 Single vs. Multiple TCP Connection Testing
       3.3.5 Interpretation of the TCP Throughput Results
       3.3.6 High Performance Network Options
   3.4 Traffic Management Tests
       3.4.1 Traffic Shaping Tests
             3.4.1.1 Interpretation of Traffic Shaping Test Results
       3.4.2 AQM Tests
             3.4.2.1 Interpretation of AQM Results
4. Security Considerations
5. IANA Considerations
6. Acknowledgments
7. References
   7.1 Normative References
   7.2 Informative References
Authors' Addresses

1. Introduction

The SLA (Service Level Agreement) provided to business class
customers is generally based upon Layer 2/3 criteria such as:
guaranteed bandwidth, maximum network latency, maximum packet loss
percentage and maximum delay variation (i.e. maximum jitter).

Network providers are coming to the realization that Layer 2/3
testing is not enough to adequately ensure end-user satisfaction. In
addition to Layer 2/3 performance, measuring TCP throughput provides
more meaningful results with respect to user experience.

Additionally, business class customers seek to conduct repeatable TCP
throughput tests between locations. Since these organizations rely on
the networks of the providers, a common test methodology with
predefined metrics would benefit both parties.

Note that the primary focus of this methodology is managed business
class IP networks, i.e. those Ethernet-terminated services for which
organizations are provided an SLA from the network provider. Because
of the SLA, the expectation is that the TCP Throughput should achieve
the guaranteed bandwidth. End-users with "best effort" access could
use this methodology, but this framework and its metrics are intended
to be used in a predictable managed IP network. No end-to-end
performance can be guaranteed when only the access portion is being
provisioned to a specific bandwidth capacity.

The intent behind this document is to define a methodology for
testing sustained TCP layer performance. In this document, the
achievable TCP Throughput is that amount of data per unit time that
TCP transports when in the TCP Equilibrium state (see section 2.1 for
the definition of TCP Equilibrium). Throughout this document, maximum
achievable throughput refers to the theoretical achievable throughput
when TCP is in the Equilibrium state.

TCP is connection oriented and at the transmitting side it uses a
congestion window (TCP CWND). At the receiving end, TCP uses a
receive window (TCP RWND) to inform the transmitting end of how many
Bytes it is capable of accepting at a given time.

Derived from Round Trip Time (RTT) and network path bandwidth, the
bandwidth delay product (BDP) determines the Send and Receive Socket
Buffer sizes required to achieve the maximum TCP throughput. Then,
with the help of the slow start and congestion avoidance algorithms,
a TCP CWND is determined based on the IP network path loss rate.
Finally, the minimum value between the calculated TCP CWND and the
TCP RWND advertised by the opposite end will determine how many Bytes
can actually be sent by the transmitting side at a given time.

Both TCP Window sizes (RWND and CWND) may vary during any given TCP
session, although up to bandwidth limits, larger RWND and larger CWND
will achieve higher throughputs by permitting more in-flight Bytes.

At both ends of the TCP connection and for each socket, there are
default buffer sizes. There are also kernel-enforced maximum buffer
sizes. These buffer sizes can be adjusted at both ends (transmitting
and receiving). Some TCP/IP stack implementations use Receive Window
Auto-Tuning, although in order to obtain the maximum throughput it is
critical to use large enough TCP Send and Receive Socket Buffer
sizes. In fact, they should be equal to or greater than the BDP.
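
As an illustration only (not part of the methodology itself), the
following Python sketch shows how a test host might size its Send and
Receive Socket Buffers to at least one BDP. The bandwidth, RTT and
kernel behavior are assumptions for the example.

  import socket

  def bdp_bytes(bottleneck_bw_bps, rtt_sec):
      """Bandwidth Delay Product expressed in Bytes."""
      return int(bottleneck_bw_bps * rtt_sec / 8)

  # Assumed path: 100 Mbps Bottleneck Bandwidth, 5 msec baseline RTT.
  bdp = bdp_bytes(100e6, 0.005)          # 62,500 Bytes (62.5 KBytes)

  sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  # Request Send and Receive Socket Buffers of at least one BDP. The
  # kernel may round or cap the requested values, so the effective
  # sizes should be read back and reported with the test results.
  sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp)
  sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp)
  print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))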

Many variables are involved in TCP throughput performance, but this
methodology focuses on:

- BB (Bottleneck Bandwidth)
- RTT (Round Trip Time)
- Send and Receive Socket Buffers
- Minimum TCP RWND
- Path MTU (Maximum Transmission Unit)
- Path MSS (Maximum Segment Size)

This methodology proposes TCP testing that should be performed in
addition to traditional Layer 2/3 type tests. In fact, Layer 2/3
tests are required to verify the integrity of the network before
conducting TCP tests. Examples include iperf (UDP mode) and manual
packet layer test techniques where packet throughput, loss, and delay
measurements are conducted. When available, standardized testing
similar to [RFC2544] but adapted for use in operational networks may
be used.

Note: RFC 2544 was never meant to be used outside a lab environment.

Sections 2 and 3 of this document provide a general overview of the
proposed methodology.

1.1 Terminology

The common definitions used in this methodology are:

- TCP Throughput Test Device (TCP TTD), refers to a compliant TCP
  host that generates traffic and measures metrics as defined in
  this methodology, e.g. a dedicated communications test instrument.
- Customer Provided Equipment (CPE), refers to customer owned
  equipment (routers, switches, computers, etc.).
- Customer Edge (CE), refers to the provider owned demarcation device.
- Provider Edge (PE), refers to the provider's distribution equipment.
- Bottleneck Bandwidth (BB), lowest bandwidth along the complete
  path. Bottleneck Bandwidth and Bandwidth are used synonymously
  in this document. Most of the time, the Bottleneck Bandwidth is
  in the access portion of the wide area network (CE - PE).
- Provider (P), refers to provider core network equipment.
- Network Under Test (NUT), refers to the tested IP network path.
- Round Trip Time (RTT), refers to the Layer 4 back and forth delay.

Figure 1.1 Devices, Links and Paths

+----+ +----+ +----+  +----+ +---+  +---+ +----+  +----+ +----+ +----+
| TCP|-| CPE|-| CE |--| PE |-| P |--| P |-| PE |--| CE |-| CPE|-| TCP|
| TTD| |    | |    |BB|    | |   |  |   | |    |BB|    | |    | | TTD|
+----+ +----+ +----+  +----+ +---+  +---+ +----+  +----+ +----+ +----+
       <------------------------ NUT ------------------------->

R >-----------------------------------------------------------|
T                                                              |
T <-----------------------------------------------------------|

Note that the NUT may be built with a variety of devices including,
but not limited to, load balancers, proxy servers or WAN acceleration
appliances. The detailed topology of the NUT should be well known
when conducting the TCP throughput tests, although this methodology
makes no attempt to characterize specific network architectures.

1.2 Test Set-up

This methodology is intended for operational and managed IP networks.
A multitude of network architectures and topologies can be tested.
The above set-up diagram is very general; it only illustrates the
typical segmentation within end-user and network provider domains.

2. Scope and Goals of this Methodology

Before defining the goals, it is important to clearly define the
areas that are out of scope.

- This methodology is not intended to predict the TCP throughput
  during the transient stages of a TCP connection, such as during the
  initial slow start phase.

- This methodology is not intended to definitively benchmark TCP
  implementations of one OS against another, although some users may
  find value in conducting qualitative experiments.

- This methodology is not intended to provide detailed diagnosis
  of problems within end-points or within the network itself as
  related to non-optimal TCP performance, although a results
  interpretation section for each test step may provide insights into
  potential issues.

- This methodology does not propose to operate permanently with high
  measurement loads. TCP performance and optimization within
  operational networks may be captured and evaluated by using data
  from the "TCP Extended Statistics MIB" [RFC4898].

- This methodology is not intended to measure TCP throughput as part
  of an SLA, or to compare the TCP performance between service
  providers or to compare between implementations of this methodology
  in dedicated communications test instruments.

In contrast to the above exclusions, the primary goal is to define a
method to conduct a practical end-to-end assessment of sustained TCP
performance within a managed business class IP network. Another key
goal is to establish a set of "best practices" that a non-TCP expert
should apply when validating the ability of a managed IP network to
carry end-user TCP applications.

Specific goals are to:

- Provide a practical test approach that specifies tunable parameters
  (such as MSS (Maximum Segment Size) and Socket Buffer sizes) and
  how these affect the outcome of TCP performance over an IP network.
  See section 3.3.3.

- Provide specific test conditions like link speed, RTT, MSS, Socket
  Buffer sizes and achievable TCP throughput when TCP is in the
  Equilibrium state. For guideline purposes, provide examples of test
  conditions and their maximum achievable TCP throughput. Section 2.1
  provides specific details concerning the definition of TCP
  Equilibrium within this methodology while section 3 provides
  specific test conditions with examples.

- Define three (3) basic metrics to compare the performance of TCP
  connections under various network conditions. See section 3.3.2.

- In test situations where the recommended procedure does not yield
  the maximum achievable TCP throughput, provide some possible areas
  within the end host or the network that should be considered for
  investigation. Note again that this methodology is not intended to
  provide a detailed diagnosis of these issues. See section 3.3.5.

2.1 TCP Equilibrium

TCP connections have three (3) fundamental congestion window phases:

1 - The Slow Start phase, which occurs at the beginning of a TCP
transmission or after a retransmission time-out.

2 - The Congestion Avoidance phase, during which TCP ramps up to
establish the maximum achievable throughput. It is important to note
that retransmissions are a natural by-product of the TCP congestion
avoidance algorithm as it seeks to achieve maximum throughput.

3 - The Loss Recovery phase, which could include Fast Retransmit
(Tahoe) or Fast Recovery (Reno & New Reno). When packet loss occurs,
the Congestion Avoidance phase transitions either to Fast Retransmit
or to Fast Recovery, depending upon the TCP implementation. If a
Time-Out occurs, TCP transitions back to the Slow Start phase.

The following diagram depicts these 3 phases.

Figure 2.1 TCP CWND Phases

 /\       |                                                     TCP
 /\       |                                                 Equilibrium
 /\       |High ssthresh      TCP CWND
 /\       |Loss Event  *       halving                3-Loss Recovery
 /\       |           * \       upon loss                 Adjusted
 /\       |          *   \     /\        Time-Out         ssthresh
 /\       |         *     \   /  \      +--------+           *
 /\       |        *       \ /    \     |Multiple|          *
 /\       |       *  2-Congestion  \    |  Loss  |         *
 /\       |      *     Avoidance    \   |  Event |        *
 TCP      |     *              Half  \  |        |       *
 Through- |    *           TCP CWND   \ +--------+      * 1-Slow Start
 put      |   * 1-Slow Start         Min TCP CWND after T-O
          +-----------------------------------------------------------
               Time > > > > > > > > > > > > > > > > > > > > > > > > > >

Note: ssthresh = Slow Start threshold.

A well-tuned and managed IP network with appropriate TCP adjustments
in the IP hosts and applications should perform very close to the BB
(Bottleneck Bandwidth) when TCP is in the Equilibrium state.

This TCP methodology provides guidelines to measure the maximum
achievable TCP throughput when TCP is in the Equilibrium state. All
maximum achievable TCP throughputs specified in section 3 are with
respect to this condition.

It is important to clarify the interaction between the sender's Send
Socket Buffer and the receiver's advertised TCP RWND Size. TCP test
programs such as iperf, ttcp, etc. allow the sender to control the
quantity of TCP Bytes transmitted and unacknowledged (in-flight),
commonly referred to as the Send Socket Buffer. This is done
independently of the TCP RWND Size advertised by the receiver.
Implications for the capabilities of the Throughput Test Device (TTD)
are covered at the end of section 3.

3. TCP Throughput Testing Methodology

As stated earlier in section 1, it is considered best practice to
verify the integrity of the network by conducting Layer 2/3 tests
such as [RFC2544] or other methods of network stress tests. However,
it is important to mention again that RFC 2544 was never meant to be
used outside a lab environment.

If the network is not performing properly in terms of packet loss,
jitter, etc. then the TCP layer testing will not be meaningful. A
dysfunctional network will not achieve optimal TCP throughput with
regard to the available bandwidth.

TCP Throughput testing may require cooperation between the end-user
customer and the network provider. As an example, in an MPLS (Multi-
Protocol Label Switching) network architecture, the testing should be
conducted either on the CPE or on the CE device and not on the PE
(Provider Edge) router.

The following represents the sequential order of steps for this
testing methodology:

1. Identify the Path MTU. Packetization Layer Path MTU Discovery
or PLPMTUD, [RFC4821], MUST be conducted to verify the network path
MTU. Conducting PLPMTUD establishes the upper limit for the MSS to
be used in subsequent steps.

2. Baseline Round Trip Time and Bandwidth. This step establishes the
inherent, non-congested Round Trip Time (RTT) and the Bottleneck
Bandwidth of the end-to-end network path. These measurements are
used to provide estimates of the TCP RWND and Send Socket Buffer
Sizes that SHOULD be used during subsequent test steps. These
measurements refer to [RFC2681] and [RFC4898] in order to measure
RTD and the associated RTT.

3. TCP Connection Throughput Tests. With baseline measurements of
Round Trip Time and Bottleneck Bandwidth, single and multiple TCP
connection throughput tests SHOULD be conducted to baseline network
performance.

4. Traffic Management Tests. Various traffic management and queuing
techniques can be tested in this step, using multiple TCP
connections. Multiple connections testing should verify that the
network is configured properly for traffic shaping versus policing
and that Active Queue Management (AQM) implementations are used.

Important to note are some of the key characteristics and
considerations for the TCP test instrument. The test host may be a
standard computer or a dedicated communications test instrument. In
both cases, it must be capable of emulating both a client and a
server.

The following criteria should be considered when selecting whether
the TCP test host can be a standard computer or has to be a dedicated
communications test instrument:

- TCP implementation used by the test host, OS version, e.g. a Linux
  OS kernel using TCP New Reno, TCP options supported, etc. These
  will obviously be more important when using dedicated
  communications test instruments where the TCP implementation may be
  customized or tuned to run in higher performance hardware. When a
  compliant TCP TTD is used, the TCP implementation MUST be
  identified in the test results. The compliant TCP TTD should be
  usable for complete end-to-end testing through network security
  elements and should also be usable for testing network sections.

- More importantly, the TCP test host MUST be capable of generating
  and receiving stateful TCP test traffic at the full link speed of
  the network under test. Stateful TCP test traffic means that the
  test host MUST fully implement a TCP/IP stack; this is generally a
  comment aimed at dedicated communications test equipment which
  sometimes "blasts" packets with TCP headers. As a general rule of
  thumb, testing TCP throughput at rates greater than 100 Mbit/sec
  MAY require high performance server hardware or dedicated
  hardware-based test tools.

- A compliant TCP Throughput Test Device MUST allow adjusting both
  Send and Receive Socket Buffer sizes. The Socket Buffers MUST be
  large enough to fill the BDP.

- Measuring RTT and retransmissions per connection will generally
  require a dedicated communications test instrument. In the absence
  of dedicated hardware-based test tools, these measurements may need
  to be conducted with packet capture tools, i.e. conduct TCP
  throughput tests and analyze RTT and retransmissions in packet
  captures. Another option may be to use the "TCP Extended Statistics
  MIB" per [RFC4898].

- The [RFC4821] PLPMTUD test SHOULD be conducted with a dedicated
  tester which exposes the ability to run the PLPMTUD algorithm
  independently from the OS stack.

3.1. Determine Network Path MTU

TCP implementations should use Path MTU Discovery techniques (PMTUD).
PMTUD relies on ICMP 'need to frag' messages to learn the path MTU.
When a device has a packet to send which has the Don't Fragment (DF)
bit in the IP header set and the packet is larger than the Maximum
Transmission Unit (MTU) of the next hop, the packet is dropped and
the device sends an ICMP 'need to frag' message back to the host that
originated the packet. The ICMP 'need to frag' message includes
the next hop MTU, which PMTUD uses to adjust itself. Unfortunately,
ICMP messages are often blocked in operational networks, so PMTUD is
not always reliable. For this reason, Packetization Layer Path MTU
Discovery or PLPMTUD, [RFC4821], MUST
be conducted to verify the network path MTU. PLPMTUD can be used
with or without ICMP. The following sections provide a summary of the
PLPMTUD approach and an example using TCP. [RFC4821] specifies a
search_high and a search_low parameter for the MTU. As specified in
[RFC4821], 1024 Bytes is a safe value for search_low in modern
networks.

It is important to determine the link overhead along the IP path,
and then to select a TCP MSS size corresponding to the Layer 3 MTU.
For example, if the MTU is 1024 Bytes and the TCP/IP headers are 40
Bytes (20 for IP + 20 for TCP), then the MSS would be 984 Bytes.

An example scenario is a network where the actual path MTU is 1240
Bytes. The TCP client probe MUST be capable of setting the MSS for
the probe packets and could start at MSS = 984 (which corresponds
to an MTU size of 1024 Bytes).

The TCP client probe would open a TCP connection and advertise the
MSS as 984. Note that the client probe MUST generate these packets
with the DF bit set. The TCP client probe then sends test traffic
using a small default Send Socket Buffer size of ~8 KBytes. It should
be kept small to minimize the possibility of congesting the network,
which may induce packet loss. The duration of the test should also
be short (10-30 seconds), again to minimize congestive effects
during the test.

In the example of a 1240 Bytes path MTU, probing with an MSS equal to
984 would yield a successful probe and the test client packets would
be successfully transferred to the test server.

Also note that the test client MUST verify that the advertised MSS
is indeed negotiated. Network devices with built-in Layer 4
capabilities can intercede during the connection establishment and
reduce the advertised MSS to avoid fragmentation. This is certainly
a desirable feature from a network perspective, but it can yield
erroneous test results if the client test probe does not confirm the
negotiated MSS.
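
As an illustration only, a test client running on a Linux host could
confirm the negotiated MSS by reading back the TCP_MAXSEG socket
option once the connection is established. The server address below
is a hypothetical placeholder, and TCP_MAXSEG availability is
OS-dependent.

  import socket

  # Hypothetical far-end TCP Throughput Test Device.
  TEST_SERVER = ("192.0.2.10", 5001)

  sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  # Request that an MSS of 984 be advertised in the SYN
  # (corresponding to a 1024 Bytes MTU).
  sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, 984)
  sock.connect(TEST_SERVER)

  # Read back the MSS actually in effect; a middlebox with Layer 4
  # capabilities may have clamped it to a lower value.
  negotiated_mss = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
  print("negotiated MSS:", negotiated_mss)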

The next test probe would use the search_high value and it would be
set to an MSS of 1460 in order to produce a 1500 Bytes MTU. In this
example, the test client will retransmit based upon time-outs, since
no ACKs will be received from the test server. This test probe is
marked as a conclusive failure if none of the test packets are
ACK'ed. If any of the test packets are ACK'ed, a congested network
may be the cause and the test probe is not conclusive. Re-testing
at another time is recommended to further isolate the issue.

The test is repeated until the desired granularity of the MTU is
discovered. The method can yield precise results at the expense of
probing time. One approach may be to reduce the probe size to half
between the unsuccessful search_high and successful search_low value
and raise it by half when seeking the upper limit.
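
The halving approach described above can be sketched as a simple
binary search. In the Python sketch below, probe_succeeds() is a
hypothetical helper that would open a connection advertising the
given MSS, send DF-marked test traffic for a short period, and return
True only if that traffic is ACK'ed.

  TCP_IP_HEADERS = 40          # 20 Bytes IP + 20 Bytes TCP

  def find_path_mtu(probe_succeeds, search_low=1024, search_high=1500):
      """Return the largest MTU (in Bytes) whose MSS probe is ACK'ed."""
      low, high = search_low, search_high
      if not probe_succeeds(low - TCP_IP_HEADERS):
          raise RuntimeError("search_low probe failed; inconclusive")
      while high - low > 1:             # probe until 1 Byte granularity
          mid = (low + high) // 2
          if probe_succeeds(mid - TCP_IP_HEADERS):
              low = mid                 # mid works; raise the floor
          else:
              high = mid                # mid fails; lower the ceiling
      return low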

3.2. Baseline Round Trip Time and Bandwidth

Before stateful TCP testing can begin, it is important to determine
the baseline Round Trip Time (i.e. the non-congested inherent delay)
and the Bottleneck Bandwidth of the end-to-end network to be tested.
These measurements are used to calculate the BDP and to provide
estimates of the TCP RWND and Send Socket Buffer Sizes that SHOULD be
used in subsequent test steps.

3.2.1 Techniques to Measure Round Trip Time

Following the definitions used in section 1.1, Round Trip Time (RTT)
is the elapsed time between the clocking in of the first bit of a
sent payload packet and the receipt of the last bit of the
corresponding Acknowledgment. Round Trip Delay (RTD) is used
synonymously to twice the Link Latency. RTT measurements SHOULD use
techniques defined in [RFC2681] or statistics available from MIBs
defined in [RFC4898].

The RTT SHOULD be baselined during off-peak hours in order to obtain
a reliable figure of the inherent network latency; otherwise,
additional delay caused by network buffering can occur. Also, when
sampling RTT values over a given test interval, the minimum measured
value SHOULD be used as the baseline RTT, since this will most
closely estimate the real inherent RTT. This value is also used to
determine the Buffer Delay Percentage metric defined in section
3.3.2.

The following list is not meant to be exhaustive, although it
summarizes some of the most common ways to determine Round Trip Time.
The desired measurement precision (i.e. msec versus usec) may dictate
whether the RTT measurement can be achieved with ICMP pings or by a
dedicated communications test instrument with precision timers. The
objective in this section is to list several techniques in order of
decreasing accuracy.

- Use test equipment on each end of the network, "looping" the
  far-end tester so that a packet stream can be measured back and
  forth from end-to-end. This RTT measurement may be compatible with
  delay measurement protocols specified in [RFC5357].

- Conduct packet captures of TCP test sessions using "iperf" or FTP,
  or other TCP test applications. By running multiple experiments,
  packet captures can then be analyzed to estimate RTT. It is
  important to note that results based upon the SYN -> SYN-ACK at the
  beginning of TCP sessions should be avoided since Firewalls might
  slow down 3-way handshakes. Also, at the sender's side, Ostermann's
  LINUX TCPTRACE utility with the -l -r arguments can be used to
  extract the RTT results directly from the packet captures.

- ICMP pings may also be adequate to provide Round Trip Time
  estimates, provided that the packet size is factored into the
  estimates (i.e. pings with different packet sizes might be
  required). Some limitations of ICMP Ping may include msec
  resolution and whether the network elements are responding to pings
  or not. Also, ICMP is often rate-limited or segregated into
  different buffer queues, and ICMP might not work if QoS (Quality of
  Service) reclassification is done at any hop. ICMP is not as
  reliable and accurate as in-band measurements.

3.2.2 Techniques to Measure end-to-end Bandwidth

Before any TCP Throughput test can be conducted, bandwidth
measurement tests MUST be run with stateless IP streams (i.e. not
stateful TCP) in order to determine the available path bandwidth.
These measurements SHOULD be conducted in both directions, especially
in asymmetrical access networks (e.g. ADSL access). These tests
should obviously be performed at various intervals throughout a
business day or even across a week. Ideally, the bandwidth tests
should produce logged outputs of the achieved bandwidths across the
complete test duration.

There are many well-established techniques available to provide
estimated measures of bandwidth over a network. It is a common
practice for network providers to conduct Layer 2/3 bandwidth
capacity tests using [RFC2544], although it is understood that
[RFC2544] was never meant to be used outside a lab environment.
Ideally, these bandwidth measurements SHOULD use network capacity
techniques as defined in [RFC5136].

The bandwidth results should be at least 90% of the business
customer's SLA bandwidth, or of the IP-type-P Available Path Capacity
defined in [RFC5136].
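
As a trivial illustration of this check (the SLA value below is a
hypothetical example):

  def meets_sla(measured_bps, sla_bps, threshold=0.90):
      """True if the stateless bandwidth result reaches 90% of the SLA."""
      return measured_bps >= threshold * sla_bps

  # 93 Mbps measured against a 100 Mbps SLA -> True
  print(meets_sla(93e6, 100e6))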

3.3. TCP Throughput Tests

This methodology specifically defines TCP throughput techniques to
verify maximum achievable TCP performance in a managed business
class IP network, as defined in section 2.1. This document defines
a method to conduct these maximum achievable TCP throughput tests
as well as guidelines on the predicted results.

With baseline measurements of Round Trip Time and bandwidth from
section 3.2, a series of single and multiple TCP connection
throughput tests SHOULD be conducted in order to measure network
performance against expectations. The number of trials and the type
of testing (i.e. single versus multiple connections) will vary
according to the intention of the test. One example would be a
single connection test in which the throughput achieved by large
Send and Receive Socket Buffer sizes (e.g. 256 KBytes) is to be
measured. It would be advisable to test at various times of the
business day.

It is RECOMMENDED to run the tests in each direction independently
first, then run both directions simultaneously. In each case, the
TCP Transfer Time, TCP Efficiency, and Buffer Delay Percentage
metrics MUST be measured in each direction. These metrics are
defined in section 3.3.2.
3.3.1 Calculate minimum required TCP RWND Size

The minimum required TCP RWND Size can be calculated from the
bandwidth delay product (BDP), which is:

   BDP (bits) = RTT (sec) x Bandwidth (bps)

Note that the RTT is being used as the "Delay" variable in the
BDP calculations.

Then, by dividing the BDP by 8, we obtain the minimum required TCP
RWND Size in Bytes. For optimal results, the Send Socket Buffer size
must be adjusted to the same value at the opposite end of the network
path.

   Minimum required TCP RWND = BDP / 8

An example would be a T3 link with 25 msec RTT. The BDP would equal
~1,105,000 bits and the minimum required TCP RWND would be ~138
KBytes.
Note that separate calculations are required on asymmetrical paths.
An asymmetrical path example would be a 90 msec RTT ADSL line with
5Mbps downstream and 640Kbps upstream. The downstream BDP would equal
~450,000 bits while the upstream one would be only ~57,600 bits.
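The BDP and minimum RWND arithmetic above is easily scripted. The
following Python sketch simply restates the formulas from this
section and reproduces the T3 and ADSL examples; it is illustrative
only and not part of the methodology.

   # Sketch: BDP and minimum required TCP RWND (formulas from 3.3.1)
   def bdp_bits(rtt_ms, bandwidth_bps):
       # BDP (bits) = RTT (sec) x Bandwidth (bps)
       return (rtt_ms / 1000.0) * bandwidth_bps

   def min_rwnd_bytes(rtt_ms, bandwidth_bps):
       # Minimum required TCP RWND = BDP / 8 (result in Bytes)
       return bdp_bits(rtt_ms, bandwidth_bps) / 8

   # T3 example: 25 ms RTT -> ~1,105,250 bits, ~138 KBytes
   print(bdp_bits(25, 44.21e6), min_rwnd_bytes(25, 44.21e6))

   # Asymmetrical ADSL example: 90 ms RTT, 5 Mbps down / 640 Kbps up
   print(bdp_bits(90, 5e6), bdp_bits(90, 640e3))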
The following table provides some representative network Link Speeds,
RTT, BDP, and their associated minimum required TCP RWND Sizes.

     Table 3.3.1: Link Speed, RTT, calculated BDP & minimum TCP RWND

   Link                                            Minimum required
   Speed*        RTT              BDP                  TCP RWND
   (Mbps)        (ms)            (bits)                (KBytes)
   ---------------------------------------------------------------------
    1.536         20             30,720                   3.84
    1.536         50             76,800                   9.60
    1.536        100            153,600                  19.20
   44.210         10            442,100                  55.26
   44.210         15            663,150                  82.89
   44.210         25          1,105,250                 138.16
      100          1            100,000                  12.50
      100          2            200,000                  25.00
      100          5            500,000                  62.50
    1,000          0.1          100,000                  12.50
    1,000          0.5          500,000                  62.50
    1,000          1          1,000,000                 125.00
   10,000          0.05         500,000                  62.50
   10,000          0.3        3,000,000                 375.00
   * Note that link speed is the Bottleneck Bandwidth (BB) for the NUT

   The following serial link speeds are used:
   - T1 = 1.536 Mbits/sec (for a B8ZS line encoding facility)
   - T3 = 44.21 Mbits/sec (for a C-Bit Framing facility)

The above table illustrates the minimum required TCP RWND. If a
smaller TCP RWND Size is used, then the TCP Throughput cannot be
optimal. To calculate the TCP Throughput, the following formula is
used:

   TCP Throughput = TCP RWND X 8 / RTT
An example could be a 100 Mbps IP path with 5 ms RTT and a TCP RWND
of 16KB, then:

   TCP Throughput = 16 KBytes X 8 bits / 5 ms.
   TCP Throughput = 128,000 bits / 0.005 sec.
   TCP Throughput = 25.6 Mbps.
Another example for a T3 using the same calculation formula is
illustrated below:

   TCP Throughput = 16 KBytes X 8 bits / 10 ms.
   TCP Throughput = 128,000 bits / 0.01 sec.
   TCP Throughput = 12.8 Mbps.
When the TCP RWND Size exceeds the BDP (T3 link and 64 KBytes TCP
RWND on a 10 ms RTT path), the maximum frames per second limit of
3664 is reached and then the formula is:

   TCP Throughput = Max FPS X MSS X 8.
   TCP Throughput = 3664 FPS X 1460 Bytes X 8 bits.
   TCP Throughput = 42.8 Mbps
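In other words, the achievable TCP throughput is the lower of the
window-limited rate (TCP RWND X 8 / RTT) and the FPS-limited rate
(Max FPS X MSS X 8). A small Python sketch of this comparison, using
the 1460-Byte MSS and the 3664 FPS T3 limit from the examples above
(illustrative only):

   # Sketch: achievable TCP throughput on a path with a known FPS limit
   def achievable_tcp_bps(rwnd_bytes, rtt_ms, max_fps, mss=1460):
       window_limited = rwnd_bytes * 8 / (rtt_ms / 1000.0)
       fps_limited = max_fps * mss * 8
       return min(window_limited, fps_limited)

   # 16 KB RWND, 10 ms RTT on a T3 -> ~12.8 Mbps (window limited)
   print(achievable_tcp_bps(16000, 10, 3664) / 1e6)
   # 64 KB RWND, 10 ms RTT on a T3 -> ~42.8 Mbps (FPS limited)
   print(achievable_tcp_bps(64000, 10, 3664) / 1e6)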
The following diagram compares achievable TCP throughputs on a T3
with Send Socket Buffer & TCP RWND Sizes of 16KB vs. 64KB.
        Figure 3.3.1a TCP Throughputs on a T3 at different RTTs

   [Figure omitted: bar chart of TCP Throughput in Mbps versus RTT in
   milliseconds. With 64KB windows: 42.8M at 10 ms, 34.1M at 15 ms and
   20.5M at 25 ms. With 16KB windows: 12.8M at 10 ms, 8.5M at 15 ms
   and 5.1M at 25 ms.]
The following diagram shows the achievable TCP throughput on a 25ms
T3 when Send Socket Buffer & TCP RWND Sizes are increased.
      Figure 3.3.1b TCP Throughputs on a T3 with different TCP RWND

   [Figure omitted: bar chart of TCP Throughput in Mbps versus TCP
   RWND Size in KBytes on a 25 ms T3: 5.1M with 16KB, 10.2M with 32KB,
   20.5M with 64KB and 40.9M with 128KB*.]

   * Note that 128KB requires [RFC1323] TCP Window scaling option.
3.3.2 Metrics for TCP Throughput Tests

This framework focuses on a TCP throughput methodology and also
provides several basic metrics to compare results between various
throughput tests. It is recognized that the complexity and
unpredictability of TCP makes it impossible to develop a complete
set of metrics that accounts for the myriad of variables (e.g. RTT
variation, loss conditions, TCP implementation, etc.). However,
these basic metrics will facilitate TCP throughput comparisons
under varying network conditions and between network traffic
management techniques.
The first metric is the TCP Transfer Time, which is simply the
measured time required to transfer a block of data across
simultaneous TCP connections. This concept is useful when
benchmarking traffic management techniques and when multiple
TCP connections are required.
The TCP Transfer Time may also be used to provide a normalized ratio
of the actual TCP Transfer Time versus the Ideal TCP Transfer Time.
This ratio is called the TCP Transfer Index and is defined as:

                     Actual TCP Transfer Time
                    --------------------------
                     Ideal TCP Transfer Time
The Ideal TCP Transfer Time is derived from the network path
Bottleneck Bandwidth and Layer 1/2/3/4 overheads associated with the
network path. Additionally, both the TCP RWND and the Send Socket
Buffer Sizes must be tuned to equal or exceed the bandwidth delay
product (BDP) as described in section 3.3.1.
The following table illustrates the Ideal TCP Transfer Time of a
single TCP connection when its TCP RWND and Send Socket Buffer Sizes
equal or exceed the BDP.
         Table 3.3.2: Link Speed, RTT, BDP, TCP Throughput, and
                Ideal TCP Transfer Time for a 100 MB File

   Link                             Maximum           Ideal TCP
   Speed              BDP        Achievable TCP     Transfer Time
   (Mbps)  RTT (ms) (KBytes)    Throughput (Mbps)     (seconds)
   --------------------------------------------------------------------
    1.536     50       9.6            1.4                571
   44.21      25     138.2           42.8                 18
For a 100MB file (100 x 8 = 800 Mbits), the Ideal TCP Transfer Time
is derived as follows:

                                       800 Mbits
   Ideal TCP Transfer Time = -----------------------------------
                              Maximum Achievable TCP Throughput
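As an informal check, the two rows of Table 3.3.2 follow directly
from this division; the Python sketch below uses the maximum
achievable TCP throughputs derived later in this section (1.4 Mbps
for a T1, 42.8 Mbps for a T3) and is illustrative only.

   # Sketch: Ideal TCP Transfer Time for a 100 MB file (800 Mbits)
   def ideal_transfer_time_sec(file_mbytes, max_tcp_throughput_mbps):
       return (file_mbytes * 8) / max_tcp_throughput_mbps

   print(ideal_transfer_time_sec(100, 1.4))    # T1 -> ~571 seconds
   print(ideal_transfer_time_sec(100, 42.8))   # T3 -> ~18 seconds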
The maximum achievable layer 2 throughput on T1 and T3 Interfaces
is based on the maximum frames per second (FPS) permitted by the
actual layer 1 speed with an MTU of 1500 Bytes.

The maximum FPS for a T1 is 127 and the calculation formula is:

   FPS = T1 Link Speed / ((MTU + PPP + Flags + CRC16) X 8)
   FPS = (1.536M / ((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8))
   FPS = (1.536M / (1508 Bytes X 8))
   FPS = 1.536 Mbps / 12064 bits
   FPS = 127

The maximum FPS for a T3 is 3664 and the calculation formula is:

   FPS = T3 Link Speed / ((MTU + PPP + Flags + CRC16) X 8)
   FPS = (44.21M / ((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8))
   FPS = (44.21M / (1508 Bytes X 8))
   FPS = 44.21 Mbps / 12064 bits
   FPS = 3664

The 1508 equates to:

   MTU + PPP + Flags + CRC16

Where the MTU is 1500 Bytes, PPP is 4 Bytes, the 2 Flags are 1 Byte
each and the CRC16 is 2 Bytes.
Then, to obtain the Maximum Achievable TCP Throughput (layer 4), we
simply use: MSS in Bytes X 8 bits X max FPS.

   For a T3, the maximum TCP Throughput = 1460 Bytes X 8 bits X 3664 FPS
              Maximum TCP Throughput = 11680 bits X 3664 FPS
              Maximum TCP Throughput = 42.8 Mbps.
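The serial link arithmetic above can be captured in a couple of
helper functions; the 1508-Byte PPP framing overhead is taken
directly from the text and the sketch is illustrative only.

   # Sketch: max FPS and layer 4 TCP throughput for PPP serial links
   def serial_max_fps(link_bps, mtu=1500, ppp_overhead=8):
       # ppp_overhead = PPP (4) + 2 Flags (1 each) + CRC16 (2) = 8 Bytes
       return int(link_bps / ((mtu + ppp_overhead) * 8))

   def serial_max_tcp_bps(link_bps, mss=1460):
       return serial_max_fps(link_bps) * mss * 8

   print(serial_max_fps(1.536e6))             # T1 -> 127 FPS
   print(serial_max_fps(44.21e6))             # T3 -> 3664 FPS
   print(serial_max_tcp_bps(44.21e6) / 1e6)   # T3 -> ~42.8 Mbps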
The maximum achievable layer 2 throughput on Ethernet Interfaces is
based on the maximum frames per second permitted by the IEEE802.3
standard when the MTU is 1500 Bytes.
The maximum FPS for 100M Ethernet is 8127 and the calculation formula
is:

   FPS = (100Mbps /(1538 Bytes X 8 bits))
The maximum FPS for GigE is 81274 and the calculation formula is:

   FPS = (1Gbps /(1538 Bytes X 8 bits))

The maximum FPS for 10GigE is 812743 and the calculation formula is:

   FPS = (10Gbps /(1538 Bytes X 8 bits))

The 1538 equates to:

   MTU + Eth + CRC32 + IFG + Preamble + SFD
   (IFG = Inter-Frame Gap and SFD = Start of Frame Delimiter)

Where MTU is 1500 Bytes, Ethernet is 14 Bytes, CRC32 is 4 Bytes,
IFG is 12 Bytes, Preamble is 7 Bytes and SFD is 1 Byte.

Note that better results could be obtained with jumbo frames on
GigE and 10 GigE.
Then, to obtain the Maximum Achievable TCP Throughput (layer 4), we
simply use: MSS in Bytes X 8 bits X max FPS.

   For a 100M, the maximum TCP Throughput = 1460 B X 8 bits X 8127 FPS
               Maximum TCP Throughput = 11680 bits X 8127 FPS
               Maximum TCP Throughput = 94.9 Mbps.
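A similar sketch applies to Ethernet, using the 1538-Byte framing
overhead listed above (again, illustrative only):

   # Sketch: max FPS and layer 4 TCP throughput for Ethernet, 1500 MTU
   def eth_max_fps(link_bps, frame_bytes=1538):
       return int(link_bps / (frame_bytes * 8))

   def eth_max_tcp_bps(link_bps, mss=1460):
       return eth_max_fps(link_bps) * mss * 8

   print(eth_max_fps(100e6), eth_max_tcp_bps(100e6) / 1e6)  # 8127, ~94.9
   print(eth_max_fps(1e9))                                  # GigE   -> 81274
   print(eth_max_fps(10e9))                                 # 10GigE -> 812743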
To illustrate the TCP Transfer Time Index, an example would be the
bulk transfer of 100 MB over 5 simultaneous TCP connections (each
connection transferring 100 MB). In this example, the Ethernet
service provides a Committed Access Rate (CAR) of 500 Mbit/s. Each
connection may achieve different throughputs during a test and the
overall throughput rate is not always easy to determine (especially
as the number of connections increases).
The ideal TCP Transfer Time would be ~8 seconds, but in this example,
the actual TCP Transfer Time was 12 seconds. The TCP Transfer Index
would then be 12/8 = 1.5, which indicates that the transfer across
all connections took 1.5 times longer than the ideal.
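Restated numerically (a sketch, assuming the 500 Mbit/s CAR is shared
evenly across the 5 connections):

   # Sketch: Ideal TCP Transfer Time and TCP Transfer Index
   payload_mbits = 100 * 8                  # 100 MB per connection
   connections = 5
   car_mbps = 500                           # Committed Access Rate
   ideal_sec = payload_mbits / (car_mbps / connections)   # ~8 seconds
   actual_sec = 12                          # measured in this example
   print(ideal_sec, actual_sec / ideal_sec)               # 8.0, 1.5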
The second metric is TCP Efficiency, which is the percentage of Bytes
that were not retransmitted and is defined as:

             Transmitted Bytes - Retransmitted Bytes
             ---------------------------------------  x 100
                       Transmitted Bytes
Transmitted Bytes are the total number of TCP Bytes to be
transmitted, including the original and the retransmitted Bytes.
This metric provides comparative results between various traffic
management and congestion avoidance mechanisms. Performance between
different TCP implementations (e.g. Reno, Vegas, etc.) could also be
compared.
As an example, if 100,000 Bytes were sent and 2,000 had to be
retransmitted, the TCP Efficiency should be calculated as:

                  102,000 - 2,000
                  ---------------  x 100 = 98.03%
                      102,000
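Or, as a one-line calculation with the numbers from this example:

   # Sketch: TCP Efficiency in percent
   transmitted_bytes = 102000     # includes the 2,000 retransmitted Bytes
   retransmitted_bytes = 2000
   tcp_efficiency = (transmitted_bytes - retransmitted_bytes) \
                    / transmitted_bytes * 100
   print(tcp_efficiency)          # ~98.03 %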
Note that the Retransmitted Bytes may have occurred more than once;
if so, these multiple retransmissions are added to both the
Retransmitted Bytes and the Transmitted Bytes counts.
The third metric is the Buffer Delay Percentage, which represents the
increase in RTT during a TCP throughput test versus the inherent or
baseline RTT. The baseline RTT is the Round Trip Time inherent to
the network path under non-congested conditions. (See 3.2.1 for
details concerning the baseline RTT measurements).

The Buffer Delay Percentage is defined as:

        Average RTT during Transfer - Baseline RTT
        ------------------------------------------ x 100
                       Baseline RTT

As an example, consider a network path with a baseline RTT of 25
msec. During the course of a TCP transfer, the average RTT across
the entire transfer increases to 32 msec. Then, the Buffer Delay
Percentage would be calculated as:
                       32 - 25
                       ------- x 100 = 28%
                         25
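Or, in the same sketch style as the previous metrics:

   # Sketch: Buffer Delay Percentage
   baseline_rtt_ms = 25
   avg_rtt_during_transfer_ms = 32
   buffer_delay_pct = (avg_rtt_during_transfer_ms - baseline_rtt_ms) \
                      / baseline_rtt_ms * 100
   print(buffer_delay_pct)        # 28.0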
Note that the TCP Transfer Time, TCP Efficiency, and Buffer Delay
Percentage MUST be measured during each throughput test. A poor TCP
Transfer Time Index (i.e. a TCP Transfer Time greater than the Ideal
TCP Transfer Time) may be diagnosed by correlating with sub-optimal
TCP Efficiency and/or Buffer Delay Percentage metrics.
3.3.3 Conducting the TCP Throughput Tests

Several TCP tools are currently used in the network world and one of
the most common is "iperf". With this tool, hosts are installed at
each end of the network path; one acts as a client and the other as
a server. The Send Socket Buffer and the TCP RWND Sizes of both
client and server can be manually set. The achieved throughput can
then be measured, either uni-directionally or bi-directionally. For
higher BDP situations in lossy networks (long fat networks, satellite
links, etc.), TCP options such as Selective Acknowledgment SHOULD be
considered and become part of the window size / throughput
characterization.
Host hardware performance must be well understood before conducting
the tests described in the following sections. A dedicated
communications test instrument will generally be required, especially
for line rates of GigE and 10 GigE. A compliant TCP TTD SHOULD
provide a warning message when the expected test throughput will
exceed 10% of the network bandwidth capacity. If the throughput test
is expected to exceed 10% of the provider bandwidth, then the test
should be coordinated with the network provider. This does not
include the customer premises bandwidth; the 10% refers directly to
the provider's bandwidth (Provider Edge to Provider router).
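The warning logic a TCP TTD might implement can be as simple as the
sketch below; the 10% threshold comes from the paragraph above, while
the function and parameter names are purely hypothetical.

   # Sketch: warn when the expected test load exceeds 10% of the
   # provider bandwidth (Provider Edge to Provider router)
   def check_test_load(expected_test_mbps, provider_bw_mbps,
                       threshold=0.10):
       if expected_test_mbps > threshold * provider_bw_mbps:
           print("Warning: expected test throughput exceeds 10% of "
                 "provider bandwidth; coordinate with the provider")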
The TCP throughput test should be run over a long enough duration
to properly exercise network buffers (i.e. greater than 30 seconds)
and should also characterize performance at different times of day.

3.3.4 Single vs. Multiple TCP Connection Testing
The decision whether to conduct single or multiple TCP connection
tests depends upon the size of the BDP in relation to the TCP RWND
configured in the end-user environment. For example, if the BDP for
a long fat network turns out to be 2MB, then it is probably more
realistic to test this network path with multiple connections.
Assuming typical host computer TCP RWND Sizes of 64 KB (e.g. Windows
XP), using 32 TCP connections would emulate a typical small office
scenario.
The following table is provided to illustrate the relationship
between the TCP RWND and the number of TCP connections required to
fill the available capacity of a given BDP. For this example, the
network bandwidth is 500 Mbps and the RTT is 5 ms; the BDP then
equates to 312.5 KBytes (a short calculation sketch follows the
table).
          Table 3.3.4 Number of TCP connections versus TCP RWND

                         Number of TCP Connections
          TCP RWND       to fill available bandwidth
          -------------------------------------
            16KB                  20
            32KB                  10
            64KB                   5
           128KB                   3
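The connection counts above follow directly from dividing the BDP by
the TCP RWND and rounding up, as in this minimal sketch:

   # Sketch: number of TCP connections needed to fill a given BDP
   import math

   def connections_needed(bdp_kbytes, rwnd_kbytes):
       return math.ceil(bdp_kbytes / rwnd_kbytes)

   bdp = 312.5                     # KBytes (500 Mbps x 5 ms / 8)
   for rwnd in (16, 32, 64, 128):
       print(rwnd, connections_needed(bdp, rwnd))   # 20, 10, 5, 3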
The TCP Transfer Time metric is useful for conducting multiple
connection tests. Each connection should be configured to transfer
payloads of the same size (e.g. 100 MB), and the TCP Transfer Time
provides a simple metric to verify the actual versus expected
results.
Note that the TCP Transfer Time is the time for all connections to
complete the transfer of the configured payload size. From the
previous table, the 64KB window is considered. Each of the 5
TCP connections would be configured to transfer 100MB, and each one
should obtain a maximum of 100 Mb/sec. So for this example, the
100MB payload should be transferred across the connections in
approximately 8 seconds (which would be the ideal TCP Transfer Time
under these conditions).

Additionally, the TCP Efficiency metric MUST be computed for each
connection as defined in section 3.3.2.
3.3.5 Interpretation of the TCP Throughput Results

At the end of this step, the user will document the theoretical BDP
and a set of Window size experiments with measured TCP throughput for
each TCP window size. For cases where the sustained TCP throughput
does not equal the ideal value, some possible causes are:
- Network congestion causing packet loss, which MAY be inferred from
  a poor TCP Efficiency % (higher TCP Efficiency % = less packet
  loss)
- Network congestion causing an increase in RTT, which MAY be
  inferred from the Buffer Delay Percentage (i.e., 0% = no increase
  in RTT over baseline)
- Intermediate network devices which actively regenerate the TCP
  connection and can alter TCP RWND Size, MSS, etc.
- Rate limiting (policing). More details on traffic management tests
  follow in section 3.4.
3.3.6 High Performance Network Options

For cases where the network outperforms the client/server IP hosts,
some possible causes are:
- Maximum TCP Buffer space. All operating systems have a global
mechanism to limit the quantity of system memory to be used by TCP
connections. On some systems, each connection is subject to a memory
limit that is applied to the total memory used for input data, output
data and controls. On other systems, there are separate limits for
input and output buffer spaces per connection. Client/server IP
hosts might be configured with Maximum Buffer Space limits that are
far too small for high performance networks.
- Socket Buffer Sizes. Most operating systems support separate per
connection send and receive buffer limits that can be adjusted as
long as they stay within the maximum memory limits. These socket
buffers must be large enough to hold a full BDP of TCP Bytes plus
some overhead. There are several methods that can be used to adjust
socket buffer sizes, but TCP Auto-Tuning automatically adjusts these
as needed to optimally balance TCP performance and memory usage.
It is important to note that Auto-Tuning is enabled by default in
LINUX since kernel release 2.6.6 and in UNIX since FreeBSD 7.0. It
is also enabled by default in Windows since Vista and in MAC since
OS X version 10.5 (Leopard). Over-buffering can cause some
applications to behave poorly, typically causing sluggish interactive
response and risking running the system out of memory. Large default
socket buffers have to be considered carefully on multi-user systems.
- TCP Window Scale Option, RFC1323. This option enables TCP to
support large BDP paths. It provides a scale factor which is
required for TCP to support window sizes larger than 64KB. Most
systems automatically request WSCALE under some conditions, such as
when the receive socket buffer is larger than 64KB or when the other
end of the TCP connection requests it first. WSCALE can only be
negotiated during the 3-way handshake. If either end fails to
request WSCALE or requests an insufficient value, it cannot be
renegotiated. Different systems use different algorithms to select
WSCALE, but it is very important to have large enough buffer sizes.
Note that under these constraints, a client application wishing to
send data at high rates may need to set its own receive buffer to
something larger than 64K Bytes before it opens the connection to
ensure that the server properly negotiates WSCALE. A system
administrator might have to explicitly enable RFC1323 extensions.
Otherwise, the client/server IP host would not support TCP window
sizes (BDP) larger than 64KB. Most of the time, performance gains
will be obtained by enabling this option in Long Fat Networks
(i.e., networks with large BDP; see Figure 3.3.1b).
- TCP Timestamps Option, RFC1323. This feature provides better
measurements of the Round Trip Time and protects TCP from data
corruption that might occur if packets are delivered so late that the
sequence numbers wrap before they are delivered. Wrapped sequence
numbers do not pose a serious risk below 100 Mbps, but the risk
increases at higher data rates. Most of the time, performance gains
will be obtained by enabling this option in Gigabit bandwidth
networks.
- TCP Selective Acknowledgments Option (SACK), RFC2018. This allows
a TCP receiver to inform the sender exactly which data segment is
missing and needs to be retransmitted. Without SACK, TCP has to
estimate which data segment is missing, which works just fine if all
losses are isolated (i.e. only one loss in any given round trip).
Without SACK, TCP takes a very long time to recover after multiple
and consecutive losses. SACK is now supported by most operating
systems, but it may have to be explicitly enabled by the system
administrator. In networks with unknown load and error patterns, TCP
SACK will improve throughput performance. On the other hand,
security appliance vendors might have implemented TCP randomization
without considering TCP SACK, and under such circumstances, SACK
might need to be disabled in the client/server IP hosts until the
vendor corrects the issue. Also, poorly implemented SACK algorithms
might cause extreme CPU loads and might need to be disabled.
- Path MTU. The client/server IP host system must use the largest
possible MTU for the path. This may require enabling Path MTU
Discovery (RFC1191 & RFC4821). Since RFC1191 is flawed, it is
sometimes not enabled by default and may need to be explicitly
enabled by the system administrator. RFC4821 describes a new, more
robust algorithm for MTU discovery and ICMP black hole recovery.
- TOE (TCP Offload Engine). Some recent Network Interface Cards
(NICs) are equipped with drivers that can do part or all of the
TCP/IP protocol processing. TOE implementations require additional
work (i.e. hardware-specific socket manipulation) to set up and tear
down connections. Because TOE NIC configuration parameters are
vendor-specific and not necessarily RFC-compliant, they are poorly
integrated with UNIX & LINUX. Occasionally, TOE might need to be
disabled in a server because its NIC does not have enough memory
resources to buffer thousands of connections.
Note that both ends of a TCP connection must be properly tuned.

3.4. Traffic Management Tests
In most cases, the bandwidth of the network connection between two
geographic locations (branch offices, etc.) is lower than the
bandwidth of the network connection to the host computers. An
example would be LAN connectivity of GigE and WAN connectivity of
100 Mbps. The WAN connectivity may be physically 100 Mbps or
logically 100 Mbps (over a GigE WAN connection). In the latter case,
rate limiting is used to provide the WAN bandwidth per the SLA.
Traffic management techniques might be employed and the most common
are:

- Traffic Policing and/or Shaping
- Priority queuing
- Active Queue Management (AQM)
Configuring the end-to-end network with these various traffic
management mechanisms is a complex undertaking. For traffic shaping
and AQM techniques, the end goal is to provide better performance to
bursty traffic.
This section of the methodology provides guidelines to test traffic
shaping and AQM implementations. As in section 3.3, host hardware
performance must be well understood before conducting the traffic
shaping and AQM tests. A dedicated communications test instrument
will generally be REQUIRED for line rates of GigE and 10 GigE. If
the throughput test is expected to exceed 10% of the provider
bandwidth, then the test should be coordinated with the network
provider. This does not include the customer premises bandwidth; the
10% refers to the provider's bandwidth (Provider Edge to Provider
router). Note that GigE and 10 GigE interfaces might benefit from
hold-queue adjustments in order to prevent the saw-tooth TCP traffic
pattern.
3.4.1 Traffic Shaping Tests
Simply stated, traffic policing marks and/or drops packets which
exceed the SLA bandwidth (in most cases, excess traffic is dropped).
Traffic shaping employs the use of queues to smooth the bursty
traffic and then send it out within the SLA bandwidth limit (without
dropping packets unless the traffic shaping queue is exhausted).
Traffic shaping is generally configured for TCP data services and
can provide improved TCP performance since the retransmissions are
reduced, which in turn optimizes TCP throughput for the available
bandwidth. Throughout this section, the rate-limited bandwidth shall
be referred to as the "Bottleneck Bandwidth".
Whether traffic shaping is properly configured is more easily
diagnosed when conducting a multiple TCP connections test: proper
shaping will provide a fair distribution of the available Bottleneck
Bandwidth, while traffic policing will not.
The traffic shaping tests are built upon the concepts of multiple
connections testing as defined in section 3.3.3. Calculating the BDP
for the Bottleneck Bandwidth is first required before selecting the
number of connections, the Send Socket Buffer and TCP RWND Sizes per
connection.
Similar to the example in section 3.3, a typical test scenario might
be: GigE LAN with a 500Mbps Bottleneck Bandwidth (rate limited
logical interface), and 5 msec RTT. This would require five (5) TCP
connections with 64 KB Send Socket Buffer and TCP RWND Sizes to
evenly fill the Bottleneck Bandwidth (~100 Mbps per connection).
The traffic shaping test should be run over a long enough duration to
properly exercise network buffers (i.e. greater than 30 seconds) and
should also characterize performance at different times of day. The
throughput of each connection MUST be logged during the entire test,
along with the TCP Transfer Time, TCP Efficiency, and Buffer Delay
Percentage.
3.4.1.1 Interpretation of Traffic Shaping Test Results

By plotting the throughput achieved by each TCP connection, we should
see fair sharing of the bandwidth when traffic shaping is properly
configured. For the previous example of 5 connections sharing 500
Mbps, each connection would consume ~100 Mbps with smooth variations.

If traffic shaping is not configured properly or if traffic policing
is present on the bottleneck interface, the bandwidth sharing may
not be fair. The resulting throughput plot may reveal "spikey"
throughput consumption of the competing TCP connections (due to the
high rate of TCP retransmissions).
3.4.2 AQM Tests

Active Queue Management techniques are specifically targeted to
provide congestion avoidance to TCP traffic. As an example, before
the network element queue "fills" and enters the tail drop state, an
AQM implementation like RED (Random Early Discard) drops packets at
pre-configurable queue depth thresholds. This action causes TCP
connections to back off, which helps prevent tail drops and in turn
helps avoid global TCP synchronization.
RED is just an example; other AQM implementations such as WRED
(Weighted Random Early Discard), REM (Random Exponential Marking)
or AREM (Adaptive Random Exponential Marking), to name a few, could
be used.

Again, rate limited interfaces may benefit greatly from AQM based
techniques. With a default FIFO queue, bloated buffering is
increasingly common and has dire effects on TCP connections: the
main effects are delayed congestion feedback (a poor TCP control
loop response) and enormous queuing delays imposed on all other
traffic flows.

In a FIFO based queue, the TCP traffic may not be able to achieve
the full throughput available on the Bottleneck Bandwidth link.
With an AQM implementation, on the other hand, TCP congestion
avoidance would throttle the connections on the higher speed
interface (i.e. LAN) and could help achieve the full throughput (up
to the Bottleneck Bandwidth). The bursty nature of TCP traffic is a
key factor in the overall effectiveness of AQM techniques; steady
state bulk transfer flows will generally not benefit from AQM
because, with bulk transfer flows, network device queues gracefully
throttle the effective throughput rates due to increased delays.
Proper AQM configuration is likewise more easily diagnosed when
conducting a multiple TCP connections test. Multiple TCP connections
provide the bursty sources that emulate the real-world conditions for
which AQM implementations are intended.
AQM testing also builds upon the concepts of multiple connections
testing as defined in section 3.3.3. Calculating the BDP for the
Bottleneck Bandwidth is first required before selecting the number
of connections, the Send Socket Buffer size and the TCP RWND Size
per connection.
For AQM testing, the desired effect is to cause the TCP connections
to burst beyond the Bottleneck Bandwidth so that queue drops will
occur. Using the same example from section 3.4.1 (traffic shaping),
the 500 Mbps Bottleneck Bandwidth requires 5 TCP connections (with a
window size of 64KB) to fill the capacity. Some experimentation is
required, but it is recommended to start with double the number of
connections in order to stress the network element buffers / queues
(10 connections for this example).
The TCP TTD must be configured to generate these connections as
shorter (bursty) flows versus bulk transfer type flows. These TCP
bursts should stress queue sizes in the 512KB range. Again,
experimentation will be required; the proper number of TCP
connections, the Send Socket Buffer and TCP RWND Sizes will be
dictated by the size of the network element queue.
3.4.2.1 Interpretation of AQM Results

The default queuing technique for most network devices is FIFO based.
Under heavy traffic conditions, FIFO based queue management may cause
enormous queuing delays plus delayed congestion feedback to all TCP
applications. This can cause excessive loss on all of the TCP
connections and, in the worst cases, global TCP synchronization.
An AQM implementation can be detected by plotting the individual and
aggregate throughput results achieved by multiple TCP connections on
the bottleneck interface. Proper AQM operation may be determined if
the available TCP throughput is fully utilized (up to the Bottleneck
Bandwidth) and fairly shared between the TCP connections. For the
previous example of 10 connections (window = 64 KB) sharing 500 Mbps,
each connection should consume ~50 Mbps. If AQM was not properly
enabled on the interface, then the TCP connections would retransmit
at higher rates and the net effect is that the Bottleneck Bandwidth
is not fully utilized.
Another means to compare non-AQM versus AQM implementations is to use
the Buffer Delay Percentage metric for all of the connections. The
Buffer Delay Percentage should be significantly lower with AQM
implementations than with default FIFO queuing.
Additionally, non-AQM implementations may exhibit a lower TCP
Efficiency.
4. Security Considerations

The security considerations that apply to any active measurement of
live networks are relevant here as well. See [RFC4656] and
[RFC5357].

5. IANA Considerations