draft-ietf-rddp-applicability-04.txt   draft-ietf-rddp-applicability-05.txt 
Remote Direct Data Placement C. Bestler Remote Direct Data Placement C. Bestler
Working group Broadcom Working group Broadcom
Internet-Draft L. Coene Internet-Draft L. Coene
Expires: April 14, 2006 Siemens Expires: June 8, 2006 Siemens
October 11, 2005 December 5, 2005
Applicability of Remote Direct Memory Access Protocol (RDMA) and Direct Applicability of Remote Direct Memory Access Protocol (RDMA) and Direct
Data Placement (DDP) Data Placement (DDP)
draft-ietf-rddp-applicability-04.txt draft-ietf-rddp-applicability-05.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 36 skipping to change at page 1, line 36
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 14, 2006. This Internet-Draft will expire on June 8, 2006.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). Copyright (C) The Internet Society (2005).
Abstract Abstract
This document describes the applicability of Remote Direct Memory This document describes the applicability of Remote Direct Memory
Access Protocol (RDMAP) and the Direct Data Placement Protocol (DDP). Access Protocol (RDMAP) and the Direct Data Placement Protocol (DDP).
It compares and contrasts the different transport options over IP It compares and contrasts the different transport options over IP
skipping to change at page 2, line 16 skipping to change at page 2, line 16
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Direct Placement . . . . . . . . . . . . . . . . . . . . . . . 6 3. Direct Placement . . . . . . . . . . . . . . . . . . . . . . . 6
3.1. Fewer Required ULP Interactions . . . . . . . . . . . . . 6 3.1. Fewer Required ULP Interactions . . . . . . . . . . . . . 6
3.2. Direct Placement using only the LLP . . . . . . . . . . . 6 3.2. Direct Placement using only the LLP . . . . . . . . . . . 6
4. Tagged Messages . . . . . . . . . . . . . . . . . . . . . . . 8 4. Tagged Messages . . . . . . . . . . . . . . . . . . . . . . . 8
4.1. Order Independent Reception . . . . . . . . . . . . . . . 8 4.1. Order Independent Reception . . . . . . . . . . . . . . . 8
4.2. Reduced ULP Notifications . . . . . . . . . . . . . . . . 8 4.2. Reduced ULP Notifications . . . . . . . . . . . . . . . . 9
4.3. Simplified ULP Exchanges . . . . . . . . . . . . . . . . . 9 4.3. Simplified ULP Exchanges . . . . . . . . . . . . . . . . . 9
4.4. Order Independent Sending . . . . . . . . . . . . . . . . 10 4.4. Order Independent Sending . . . . . . . . . . . . . . . . 11
4.5. Tagged Buffers as ULP Credits . . . . . . . . . . . . . . 11 4.5. Untagged Messages and Tagged Buffers as ULP Credits . . . 12
5. RDMA Read . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5. RDMA Read . . . . . . . . . . . . . . . . . . . . . . . . . . 14
6. LLP Comparisons . . . . . . . . . . . . . . . . . . . . . . . 14 6. LLP Comparisons . . . . . . . . . . . . . . . . . . . . . . . 15
6.1. Multistreaming Implications . . . . . . . . . . . . . . . 14 6.1. Multistreaming Implications . . . . . . . . . . . . . . . 15
6.2. Out of Order Reception Implications . . . . . . . . . . . 14 6.2. Out of Order Reception Implications . . . . . . . . . . . 15
6.3. Header and Marker Overhead . . . . . . . . . . . . . . . . 14 6.3. Header and Marker Overhead . . . . . . . . . . . . . . . . 15
6.4. Middlebox Support . . . . . . . . . . . . . . . . . . . . 15 6.4. Middlebox Support . . . . . . . . . . . . . . . . . . . . 15
6.5. Processing Overhead . . . . . . . . . . . . . . . . . . . 15 6.5. Processing Overhead . . . . . . . . . . . . . . . . . . . 16
6.6. Data Integrity Implications . . . . . . . . . . . . . . . 15 6.6. Data Integrity Implications . . . . . . . . . . . . . . . 16
6.6.1. MPA/TCP Specifics . . . . . . . . . . . . . . . . . . 15 6.6.1. MPA/TCP Specifics . . . . . . . . . . . . . . . . . . 16
6.6.2. SCTP Specifics . . . . . . . . . . . . . . . . . . . . 16 6.6.2. SCTP Specifics . . . . . . . . . . . . . . . . . . . . 17
6.7. Non-IP Transports . . . . . . . . . . . . . . . . . . . . 16 6.7. Non-IP Transports . . . . . . . . . . . . . . . . . . . . 17
6.7.1. No RDMA Layer Ack . . . . . . . . . . . . . . . . . . 16 6.7.1. No RDMA Layer Ack . . . . . . . . . . . . . . . . . . 17
6.8. Other IP Transports . . . . . . . . . . . . . . . . . . . 17 6.8. Other IP Transports . . . . . . . . . . . . . . . . . . . 18
6.9. LLP Independent Session Establishment . . . . . . . . . . 17 6.9. LLP Independent Session Establishment . . . . . . . . . . 18
6.9.1. RDMA-only Session Establishment . . . . . . . . . . . 18 6.9.1. RDMA-only Session Establishment . . . . . . . . . . . 19
6.9.2. RDMA-Conditional Session Establishment . . . . . . . . 18 6.9.2. RDMA-Conditional Session Establishment . . . . . . . . 19
7. Local Interface Implications . . . . . . . . . . . . . . . . . 20 7. Local Interface Implications . . . . . . . . . . . . . . . . . 21
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
9. Security considerations . . . . . . . . . . . . . . . . . . . 22 9. Security considerations . . . . . . . . . . . . . . . . . . . 23
9.1. Connection/Association Setup . . . . . . . . . . . . . . . 22 9.1. Connection/Association Setup . . . . . . . . . . . . . . . 23
9.2. Tagged Buffer Exposure . . . . . . . . . . . . . . . . . . 22 9.2. Tagged Buffer Exposure . . . . . . . . . . . . . . . . . . 23
9.3. Impact of Encrypted Transports . . . . . . . . . . . . . . 22 9.3. Impact of Encrypted Transports . . . . . . . . . . . . . . 23
10. Normative references . . . . . . . . . . . . . . . . . . . . . 23 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24 10.1. Normative references . . . . . . . . . . . . . . . . . . . 24
Intellectual Property and Copyright Statements . . . . . . . . . . 25 10.2. Informative References . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25
Intellectual Property and Copyright Statements . . . . . . . . . . 26
1. Introduction 1. Introduction
Remote Direct Memory Access Protocol (RDMAP) and Direct Data Remote Direct Memory Access Protocol (RDMAP) and Direct Data
Placement (DDP) work together to provide application independent Placement (DDP) work together to provide application independent
efficient placement of application payload directly into buffers efficient placement of application payload directly into buffers
specified by the Upper Layer Protocol (ULP). specified by the Upper Layer Protocol (ULP).
The DDP protocol is responsible for direct placement of received The DDP protocol is responsible for direct placement of received
payload into ULP specified buffers. The RDMAP protocol provides payload into ULP specified buffers. The RDMAP protocol provides
skipping to change at page 3, line 38 skipping to change at page 3, line 38
o The existence of an application independent protocol allows common o The existence of an application independent protocol allows common
solutions to be implemented in hardware and/or the kernel. This solutions to be implemented in hardware and/or the kernel. This
document will discuss when common data placement procedures are of document will discuss when common data placement procedures are of
the greatest benefit to applications as contrasted with the greatest benefit to applications as contrasted with
application specific solutions built on top of direct use of the application specific solutions built on top of direct use of the
underlying transport. underlying transport.
o DDP supports both untagged and tagged buffers. Tagged buffers o DDP supports both untagged and tagged buffers. Tagged buffers
allow the Data Sink ULP to be indifferent to what order (or in allow the Data Sink ULP to be indifferent to what order (or in
what packets) the Data Source sent the data, or what order they what messages) the Data Source sent the data, or what order
are received in. This document will discuss when Data Source packets are received in. Typically tagged data can be used for
flexibility is of benefit to applications. payload transfer, while untagged is best used for control
messages. However each upper layer protocol can determine the
optimal use of tagged and untagged messages for itself. This
document will discuss when Data Source flexibility is of benefit
to applications.
o RDMAP consolidates ULP notifications, thereby minimizing the o RDMAP consolidates ULP notifications, thereby minimizing the
number of required ULP interactions. number of required ULP interactions.
o RDMAP defines RDMA Reads, which allow remote access to advertised o RDMAP defines RDMA Reads, which allow remote access to advertised
buffers. This document will review the advantages of using RDMA buffers. This document will review the advantages of using RDMA
Reads as contrasted to alternate solutions. Reads as contrasted to alternate solutions.
Some non-IP transports, such as InfiniBand, directly integrate RDMA Some non-IP transports, such as InfiniBand, directly integrate RDMA
features. This document will review the applicability of providing features. This document will review the applicability of providing
skipping to change at page 5, line 35 skipping to change at page 5, line 35
Lower Layer Protocol (LLP) The transport protocol that provides Lower Layer Protocol (LLP) The transport protocol that provides
services to DDP. This is an IP transport with any required services to DDP. This is an IP transport with any required
adaptation layer. Adaptation layers are defined for SCTP and TCP. adaptation layer. Adaptation layers are defined for SCTP and TCP.
Steering Tag (STag) An identifier of a Tagged Buffer on a Node, valid Steering Tag (STag) An identifier of a Tagged Buffer on a Node, valid
as defined within a protocol specification. as defined within a protocol specification.
Tagged Message A DDP message that is directed to a ULP specified Tagged Message A DDP message that is directed to a ULP specified
buffer based upon imbedded addressing information. In the buffer based upon imbedded addressing information. In the
immediate sense, the destination buffer is specified by the immediate sense, the destination buffer is specified by the
message sender. message sender. The message receiver is given no independent
indication that a tagged message has been received.
Untagged Message A DDP message that is directed to a ULP specified Untagged Message A DDP message that is directed to a ULP specified
buffer based upon a Message Sequence Number being matched with a buffer based upon a Message Sequence Number being matched with a
receiver supplied buffer. The destination buffer is specified by receiver supplied buffer. The destination buffer is specified by
the message receiver. the message receiver. The message receiver is notified by some
mechanism that an untagged message has been received.
Upper Layer Protocol (ULP) The direct user of RDMAP/DDP services. Upper Layer Protocol (ULP) The direct user of RDMAP/DDP services. In
This may be an application, or a middleware layer such as Sockets addition to protocols such as iSER [10] and NFSv4 over RDMA [11],
Direct Protocol (SDP) or Remote Procedure Calls (RPC). the ULP may be embedded in an application, or a middleware layer
as is often the case for the Sockets Direct Protocol (SDP) and
Remote Procedure Call (RPC) protocols.
3. Direct Placement 3. Direct Placement
Direct Data Placement optimizes the placement of ULP payload into the Direct Data Placement optimizes the placement of ULP payload into the
correct destination buffers, typically eliminating intermediate correct destination buffers, typically eliminating intermediate
copying. Placement is enabled without regard to order of arrival, copying. Placement is enabled without regard to order of arrival,
order of transmission or requiring per-placement interaction with the order of transmission or requiring per-placement interaction with the
ULP. ULP.
RDMAP minimizes the required ULP interactions. This capability is RDMAP minimizes the required ULP interactions. This capability is
skipping to change at page 7, line 5 skipping to change at page 7, line 5
nothing more than direct placement into buffers might be able to do nothing more than direct placement into buffers might be able to do
so with a properly designed local interface to SCTP or TCP. Doing so so with a properly designed local interface to SCTP or TCP. Doing so
for TCP requires making predictions at a byte level rather than a for TCP requires making predictions at a byte level rather than a
message level. message level.
The main benefit of DDP for such an application would be that pre- The main benefit of DDP for such an application would be that pre-
posting of receive buffers is a mandated local interface capability, posting of receive buffers is a mandated local interface capability,
and that predictions can be made on a per-message basis (not per and that predictions can be made on a per-message basis (not per
byte). byte).
The LLP can also be used directly if ULP specific knowledge is built The Lower Layer Protocol, LLP, can also be used directly if ULP
into the protocol stack to allow "parse and place" handling of specific knowledge is built into the protocol stack to allow "parse
received packets. Such a solution either requires interaction with and place" handling of received packets. Such a solution either
the ULP, or that the protocol stack have knowledge of ULP specific requires interaction with the ULP, or that the protocol stack have
syntax rules. knowledge of ULP specific syntax rules.
DDP achieves the benefits of directly placing incoming payload DDP achieves the benefits of directly placing incoming payload
without requiring tight coupling between the ULP and the protocol without requiring tight coupling between the ULP and the protocol
stack. However, "parse and place" capabilities can certainly provide stack. However, "parse and place" capabilities can certainly provide
equivalent services to a limited number of ULPs. equivalent services to a limited number of ULPs.
4. Tagged Messages 4. Tagged Messages
This section covers the major benefits from the use of Tagged This section covers the major benefits from the use of Tagged
Messages. Messages.
A more critical advantage of DDP is the ability of the Data Source to A more critical advantage of DDP is the ability of the Data Source to
use tagged buffers. Tagging messages allows the Data Source to use tagged buffers. Tagging messages allows the Data Source to
choose the ordering and packetization of its payload deliveries. choose the ordering and packetization of its payload deliveries.
With direct data placement based solely upon pre-posted receives, the With direct data placement based solely upon pre-posted receives, the
packetization and delivery of payload must be agreed by the ULP peers packetization and delivery of payload must be agreed by the ULP peers
in advance. Even if there is an encoding of what is being in advance.
transferred, as is common with middleware solutions, this information
is not understood at the application independent layers. The The Upper Layer Protocol can allocate content between untagged and/or
directions on where to place the incoming data cannot be accessed tagged messages to maximize the potential optimizations. Placing
without switching to the ULP first. DDP provides a standardized content within an untagged message can deliver the content in the
'packing list' which can be interpreted without requiring ULP same packet that signals completion to the receiver. This can
interaction. Indeed, it is designed to be implementable in hardware. improve latency. It can even eliminate round trips. But it requires
making larger anonymous buffers available.
Examples of data that typically belong in the untagged message
include short fixed-size control data that is inherently part of the
control message, relatively short payload that is almost always
needed (especially when including it would eliminate a round trip to
fetch the data, such as the initial data on a write request), and
advertisements of tagged buffers that specify the location of data
not in the untagged message.
Tagged messages standardize direct placement of data without per-packet
interaction with the upper layers. Even if there is an upper
layer protocol encoding of what is being transferred, as is common
with middleware solutions, this information is not understood at the
application independent layers. The directions on where to place the
incoming data cannot be accessed without switching to the ULP first.
DDP provides a standardized 'packing list' which can be interpreted
without requiring ULP interaction. Indeed, it is designed to be
implementable in hardware.
4.1. Order Independent Reception 4.1. Order Independent Reception
Tagged messages are directed to a buffer based on an included Tagged messages are directed to a buffer based on an included
Steering Tag. Additionally, no notice is provided to the ULP for each Steering Tag. Additionally, no notice is provided to the ULP for each
individual Tagged Message's arrival. Together these allow tagged individual Tagged Message's arrival. Together these allow tagged
messages received out-of-order to be processed without intermediate messages received out-of-order to be processed without intermediate
buffering or additional notifications to the ULP. buffering or additional notifications to the ULP.
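The placement rule described above can be illustrated with a minimal sketch. It is not taken from the draft or from any RDMA library: the class name TaggedBufferTable, the function place_in_arrival_order, and the (stag, offset, payload) tuple layout are all hypothetical, chosen only to show that each tagged segment carries enough addressing information to be placed on arrival, in any order, with no per-segment notice to the ULP.

```python
# Hypothetical model of tagged placement. None of these names come from
# the draft or from a real RDMA API; they exist only for illustration.

class TaggedBufferTable:
    """Maps a Steering Tag (STag) to a ULP-advertised buffer."""

    def __init__(self):
        self._buffers = {}  # STag -> bytearray

    def advertise(self, stag, length):
        # The ULP registers a buffer once, before any data arrives.
        self._buffers[stag] = bytearray(length)

    def place(self, stag, offset, payload):
        # Placement needs only what the segment itself carries.
        buf = self._buffers[stag]
        if offset < 0 or offset + len(payload) > len(buf):
            raise ValueError("placement outside advertised buffer")
        buf[offset:offset + len(payload)] = payload

    def contents(self, stag):
        return bytes(self._buffers[stag])


def place_in_arrival_order(table, segments):
    """Place each (stag, offset, payload) segment as it arrives.

    No reordering queue is kept and the ULP is not called per segment.
    """
    for stag, offset, payload in segments:
        table.place(stag, offset, payload)
```

In this sketch a segment for offset 5 may be placed before the segment for offset 0, and the buffer contents are the same as for in-order arrival. How the ULP learns that the transfer is complete is deliberately left out, since that is the role of a later untagged message.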
4.2. Reduced ULP Notifications 4.2. Reduced ULP Notifications
RDMAP offers both tagged and untagged messages. No receiving side
ULP interactions are required for tagged messages. By optimally
dividing traffic between tagged and untagged messages the ULP can
limit the number of events that must be dealt with at the ULP layer.
This typically reduces the number of context switches required and
improves performance.
RDMAP further reduces required ULP interactions by consolidating RDMAP further reduces required ULP interactions by consolidating
completion notifications of tagged messages with the completion completion notifications of tagged messages with the completion
notification of a trailing untagged message. For most ULPs this notification of a trailing untagged message. For most ULPs this
radically reduces the number of ULP required interactions even radically reduces the number of ULP required interactions even
further. further.
While RDMAP consolidation of notices is beneficial to most While RDMAP consolidation of notices is beneficial to most
applications, it may be detrimental to some applications that benefit applications, it may be detrimental to some applications that benefit
from streamed delivery to enable ULP processing of received data as from streamed delivery to enable ULP processing of received data as
promptly as possible. A ULP that uses RDMAP cannot begin processing promptly as possible. A ULP that uses RDMAP cannot begin processing
skipping to change at page 9, line 37 skipping to change at page 10, line 18
A ULP where all exchanges would naturally be only the untagged A ULP where all exchanges would naturally be only the untagged
message would derive virtually no benefit from the use of RDMAP/DDP message would derive virtually no benefit from the use of RDMAP/DDP
as opposed to SCTP. But while tagged buffers are the justification as opposed to SCTP. But while tagged buffers are the justification
for RDMAP/DDP, untagged buffers are still necessary. Without for RDMAP/DDP, untagged buffers are still necessary. Without
untagged buffers the only method to exchange buffer advertisements untagged buffers the only method to exchange buffer advertisements
would involve out-of-band communications and/or sharing of compile would involve out-of-band communications and/or sharing of compile
time constants. Most RDMA-aware ULPs use untagged buffers for time constants. Most RDMA-aware ULPs use untagged buffers for
requests and responses. Buffer advertisements are typically done requests and responses. Buffer advertisements are typically done
within these untagged messages. within these untagged messages.
More importantly there would be no reliable method for the upper
layer peers to synchronize. The absence of any guarantees about
ordering within or between tagged messages is fundamental to allowing
the DDP layer to optimize transfer of tagged payload.
So no ULP can be defined entirely in terms of tagged messages.
Eventually a notification that confirms delivery must be generated
from the RDMAP/DDP layer.
Limiting use of untagged buffers to requests and responses by moving Limiting use of untagged buffers to requests and responses by moving
all bulk data using tagged transfers can greatly simplify the amount all bulk data using tagged transfers can greatly simplify the amount
of prediction that the Data Sink must perform in pre-posting receive of prediction that the Data Sink must perform in pre-posting receive
buffers. For example, a typical RDMA enabled interaction would buffers. For example, a typical RDMA enabled interaction would
consist of the following: consist of the following:
Client sends transaction request to the server as an untagged Client sends transaction request to the server as an untagged
message. message.
This message includes buffer advertisements for the buffers where This message includes buffer advertisements for the buffers where
skipping to change at page 11, line 6 skipping to change at page 11, line 44
copied in an end-to-end transfer. copied in an end-to-end transfer.
There are numerous reasons why the Data Sink would not know the true There are numerous reasons why the Data Sink would not know the true
order or location of the requested data. It could be different for order or location of the requested data. It could be different for
each client, different records selected and/or different sort orders, each client, different records selected and/or different sort orders,
RAID striping, file fragmentation, volume fragmentation, volume RAID striping, file fragmentation, volume fragmentation, volume
mirroring and server-side dynamic compositing of content (such as mirroring and server-side dynamic compositing of content (such as
server side includes for HTTP). server side includes for HTTP).
In all of these cases the Data Source is free to assemble the desired In all of these cases the Data Source is free to assemble the desired
data in the Data Sinks buffer in whatever order the component data data in the Data Sink's buffer in whatever order the component data
becomes available to it. It is not constrained on ordering. It does becomes available to it. It is not constrained on ordering. It does
not have to assemble an image in its own memory before creating it in not have to assemble an image in its own memory before creating it in
the Data Sink's buffers. the Data Sink's buffers.
Note that while DDP enables use of tagged messages for bulk transfer, Note that while DDP enables use of tagged messages for bulk transfer,
there are some application scenarios where untagged messages would there are some application scenarios where untagged messages would
still be used for bulk transfer. For example, under the Direct still be used for bulk transfer. For example, a file server may not
Access File Server (DAFS) protocol the file server does not expose expose its own memory to its clients. A client wishing to write may
its own memory to its clients. A client wishing to write may
advertise a buffer which the server will issue RDMA Reads upon. advertise a buffer which the server will issue RDMA Reads upon.
However, when performing a small write it may be preferable to However, when performing a small write it may be preferable to
include the data in the untagged message rather than incurring an include the data in the untagged message rather than incurring an
additional round trip with the RDMA Read and its response. additional round trip with the RDMA Read and its response.
4.5. Tagged Buffers as ULP Credits Generally, the best use of an untagged message is to synchronize and
to deliver data that is naturally tied to the same message as the
synchronization. For initial data transfers this has the additional
benefit of avoiding the need to advertise specific tagged buffers for
indefinite time periods. Instead anonymous buffers can be used for
initial data reception. Because anonymous buffers do not need to be
tied to specific messages in advance this can be a major benefit.
4.5. Untagged Messages and Tagged Buffers as ULP Credits
The handling of end-to-end buffer credits differs considerably with The handling of end-to-end buffer credits differs considerably with
DDP than when the ULP directly uses either TCP or SCTP. DDP than when the ULP directly uses either TCP or SCTP.
With both TCP and SCTP buffer credits are based upon the receiver With both TCP and SCTP buffer credits are based upon the receiver
granting transmit permission based on the total number of bytes. granting transmit permission based on the total number of bytes.
These credits reflect system buffering resources and/or simple flow These credits reflect system buffering resources and/or simple flow
control. They do not represent ULP resources. control. They do not represent ULP resources.
DDP defines no standard flow control, but presumes the existence of a DDP defines no standard flow control, but presumes the existence of a
ULP mechanism. The presumed mechanism is that the Data Sink ULP has ULP mechanism. The presumed mechanism is that the Data Sink ULP has
issued credits to the Data Source allowing the Data Source to send a issued credits to the Data Source allowing the Data Source to send a
specific number of untagged messages. specific number of untagged messages.
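A minimal sketch of such a ULP credit mechanism follows. DDP defines none of this: the class UntaggedCreditSender, its methods, and the fixed max_message_size parameter are assumptions made for illustration, with credits counted in untagged messages rather than in bytes.

```python
# Hypothetical ULP-level credit accounting for untagged messages.
# All names and parameters are illustrative assumptions, not part of DDP.

class UntaggedCreditSender:
    """Tracks how many untagged messages the Data Sink has authorized."""

    def __init__(self, initial_credits, max_message_size):
        self.credits = initial_credits
        self.max_message_size = max_message_size
        self.sent = []  # stands in for handing messages to DDP

    def grant(self, count):
        # Called when the Data Sink ULP advertises newly posted buffers.
        self.credits += count

    def try_send(self, message):
        # Each untagged message consumes one pre-posted receive buffer,
        # so it must fit the agreed size and must have a credit available.
        if len(message) > self.max_message_size:
            raise ValueError("message exceeds agreed untagged buffer size")
        if self.credits == 0:
            return False  # caller must wait for a grant
        self.credits -= 1
        self.sent.append(message)
        return True
```

A sender holding two credits can send two untagged messages and must then wait for a grant before sending a third. The size check reflects the requirement that the sender know the maximum size of the target buffer.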
The ULP peers must ensure that the sender is aware of the maximum The ULP peers must ensure that the sender is aware of the maximum
size that can be sent to any specific target buffer. One method of size that can be sent to any specific target buffer. One method of
doing so is to use a standard size for all untagged buffers within a doing so is to use a standard size for all untagged buffers within a
given connection. For example, DAFS specifies an initial size given connection. For example, a ULP may specify an initial untagged
requirement for session establishment, during which the untagged buffer size to be used immediately after session establishment, and
buffer size for the remainder of the session is negotiated. then optionally specify mechanisms for negotiating changes.
Tagged buffers are ULP resources advertised directly from ULP to ULP. Tagged buffers are ULP resources advertised directly from ULP to ULP.
A DDP put to a known tagged buffer is constrained only by transport A DDP put to a known tagged buffer is constrained only by transport
level flow control, not by available system buffering. level flow control, not by available system buffering.
Either tagged or untagged buffers allow bypassing of system buffer Either tagged or untagged buffers allow bypassing of system buffer
resources. Use of tagged buffers additionally allows the Data Source resources. Use of tagged buffers additionally allows the Data Source
to choose what order to exercise the credits in. to choose what order to exercise the credits in.
To the extent allowed by the ULP, tagged buffers are also divisible To the extent allowed by the ULP, tagged buffers are also divisible
skipping to change at page 13, line 45 skipping to change at page 14, line 45
This is applicable for many applications that publish semi-volatile This is applicable for many applications that publish semi-volatile
data that does not require transactional validity checking (i.e., data that does not require transactional validity checking (i.e.,
authorized users have read access to the entire set of data). It is authorized users have read access to the entire set of data). It is
less applicable when there are ULP consistency checks that must be less applicable when there are ULP consistency checks that must be
performed upon the data. Such applications would be better served by performed upon the data. Such applications would be better served by
having the client send a request, and having the server use RDMA having the client send a request, and having the server use RDMA
Writes to publish the requested data. Neither RDMAP nor DDP provides Writes to publish the requested data. Neither RDMAP nor DDP provides
mechanisms for bundling multiple disjoint updates into an atomic mechanisms for bundling multiple disjoint updates into an atomic
operation. Therefore use of an advertised buffer as a data resource operation. Therefore use of an advertised buffer as a data resource
is subject to the same caveats as any randomly updated data resource, is subject to the same caveats as any randomly updated data resource,
such as flat files, that do not enforce their own cosnsistency. such as flat files, that do not enforce their own consistency.
6. LLP Comparisons 6. LLP Comparisons
Normally the choice of underlying IP transport is irrelevant to the Normally the choice of underlying IP transport is irrelevant to the
ULP. RDMAP and DDP provide the same services over either. There ULP. RDMAP and DDP provide the same services over either. There
may be performance impacts of the choice, however. It is the may be performance impacts of the choice, however. It is the
responsibility of the ULP to determine which IP transport is best responsibility of the ULP to determine which IP transport is best
suited to its needs. suited to its needs.
SCTP provides for preservation of message boundaries. Each DDP SCTP provides for preservation of message boundaries. Each DDP
segment will be delivered within a single SCTP packet. The segment will be delivered within a single SCTP packet. The
equivalent services are only available with TCP through the use of equivalent services are only available with TCP through the use of
the MPA adaptation layer. the MPA (Marker PDU Alignment) adaptation layer.
6.1. Multistreaming Implications 6.1. Multistreaming Implications
SCTP also provides multi-streaming. When the same pair of hosts have SCTP also provides multi-streaming. When the same pair of hosts have
need for multiple DDP streams this can be a major advantage. A need for multiple DDP streams this can be a major advantage. A
single SCTP association carries multiple DDP streams, consolidating single SCTP association carries multiple DDP streams, consolidating
connection setup, congestion control and acknowledgements. connection setup, congestion control and acknowledgements.
Completions are controlled by the DDP Source Sequence Number (DDP- Completions are controlled by the DDP Source Sequence Number (DDP-
SSN) on a per stream basis. Therefore combining multiple DDP Streams SSN) on a per stream basis. Therefore combining multiple DDP Streams
skipping to change at page 14, line 47 skipping to change at page 15, line 47
guaranteed, but certainly allowed. The ability of the MPA receiver guaranteed, but certainly allowed. The ability of the MPA receiver
to process out-of-order DDP Segments may be impaired when alignment to process out-of-order DDP Segments may be impaired when alignment
of TCP segments and MPA FPDUs is lost. Using SCTP, each DDP Segment of TCP segments and MPA FPDUs is lost. Using SCTP, each DDP Segment
is encoded in a single Data Chunk and never spread over multiple IP is encoded in a single Data Chunk and never spread over multiple IP
datagrams. datagrams.
6.3. Header and Marker Overhead 6.3. Header and Marker Overhead
MPA and TCP headers together are smaller than the headers used by MPA and TCP headers together are smaller than the headers used by
SCTP and its adaptation layer. However, this advantage can be SCTP and its adaptation layer. However, this advantage can be
considerably reduced by the insertion of MPA markers. In any event reduced by the insertion of MPA markers. The difference in ULP
the difference in ULP payload per IP Datagram is not likely to be a payload per IP Datagram is not likely to be a significant factor.
significant factor.
6.4. Middlebox Support 6.4. Middlebox Support
Even with the MPA adaptation layer, DDP traffic carried over MPA/TCP Even with the MPA adaptation layer, DDP traffic carried over MPA/TCP
will appear to all network middleboxes as a normal TCP connection. will appear to all network middleboxes as a normal TCP connection.
In many environments there may be a requirement to use only TCP In many environments there may be a requirement to use only TCP
connections to satisfy existing network elements and/or to facilitate connections to satisfy existing network elements and/or to facilitate
monitoring and control of connections. While SCTP is certainly just monitoring and control of connections. While SCTP is certainly just
as monitorable and controllable as TCP, there is no guarantee that as monitorable and controllable as TCP, there is no guarantee that
the network management infrastructure has the required support for the network management infrastructure has the required support for
both. both.
6.5. Processing Overhead 6.5. Processing Overhead
A DDP stream delivered via MPA/TCP will required more processing A DDP stream delivered via MPA/TCP will require more processing
effort than one delivered over SCTP. However this extra work may be effort than one delivered over SCTP. However this extra work may be
justified for many deployments where full SCTP support is unavailable justified for many deployments where full SCTP support is unavailable
in the endpoints of the network, or where middleboxes impair the in the endpoints of the network, or where middleboxes impair the
usability of SCTP. usability of SCTP.
6.6. Data Integrity Implications 6.6. Data Integrity Implications
Both the SCTP and MPA/TCP adaptations provide end-to-end CRC32c Both the SCTP and MPA/TCP adaptations provide end-to-end CRC32c
protection against data corruption, or its equivalent. protection against data corruption, or its equivalent.
skipping to change at page 18, line 18 skipping to change at page 19, line 16
It is also possible to allow for transport neutral establishment of It is also possible to allow for transport neutral establishment of
RDMAP/DDP sessions between endpoints. Combined, these two features RDMAP/DDP sessions between endpoints. Combined, these two features
would allow most applications to be unconcerned as to which LLP was would allow most applications to be unconcerned as to which LLP was
actually in use. actually in use.
Specifically, the procedures for DDP Stream Session establishment Specifically, the procedures for DDP Stream Session establishment
discussed in section 3 of the SCTP mapping, and section 13.3 of the discussed in section 3 of the SCTP mapping, and section 13.3 of the
MPA/TCP mapping, both allow for the exchange of ULP specific data MPA/TCP mapping, both allow for the exchange of ULP specific data
("Private Data") before enabling the exchange of DDP Segments. This ("Private Data") before enabling the exchange of DDP Segments. This
delays can allow for proper selection and/or configuration of the delay can allow for proper selection and/or configuration of the
endpoints based upon the exchanged data. For example, each DDP endpoints based upon the exchanged data. For example, each DDP
Stream Session associated with a single client session might be Stream Session associated with a single client session might be
assigned to the same DDP Protection Domain. assigned to the same DDP Protection Domain.
To be transport neutral, the applications should exchange Private To be transport neutral, the applications should exchange Private
Data as part of session establishment messages to determine how the Data as part of session establishment messages to determine how the
RDMA endpoints are to be configured. One side must be the Initiator, RDMA endpoints are to be configured. One side must be the Initiator,
and the other the Responder. and the other the Responder.
With SCTP, a pair of SCTP streams can be used for sequential With SCTP, a pair of SCTP streams can be used for sequential
skipping to change at page 23, line 5 skipping to change at page 24, line 5
placement purposes. IPsec tunnel mode encrypts entire IP Datagrams. placement purposes. IPsec tunnel mode encrypts entire IP Datagrams.
IPsec transport mode encrypts TCP Segments or SCTP packets. In IPsec transport mode encrypts TCP Segments or SCTP packets. In
neither case should IPsec preclude providing out-of-order DDP neither case should IPsec preclude providing out-of-order DDP
Segments to the DDP layer for placement. Segments to the DDP layer for placement.
Note that end-to-end use of IPsec cryptographic integrity protection Note that end-to-end use of IPsec cryptographic integrity protection
may allow suppression of MPA CRC generation and checking under may allow suppression of MPA CRC generation and checking under
certain circumstances. This is one example where the LLP may be certain circumstances. This is one example where the LLP may be
judged to have "or equivalent" protection to an end-to-end CRC32c. judged to have "or equivalent" protection to an end-to-end CRC32c.
10. Normative references 10. References
10.1. Normative references
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997. Levels", BCP 14, RFC 2119, March 1997.
[2] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", [2] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0",
RFC 2246, January 1999. RFC 2246, January 1999.
[3] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload [3] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload
(ESP)", RFC 2406, November 1998. (ESP)", RFC 2406, November 1998.
skipping to change at page 24, line 5 skipping to change at page 24, line 39
draft-ietf-rddp-ddp-05 (work in progress), July 2005. draft-ietf-rddp-ddp-05 (work in progress), July 2005.
[8] Stewart, R., "Stream Control Transmission Protocol (SCTP) Remote [8] Stewart, R., "Stream Control Transmission Protocol (SCTP) Remote
Direct Memory Access (RDMA) Direct Data Placement (DDP) Direct Memory Access (RDMA) Direct Data Placement (DDP)
Adaptation", draft-ietf-rddp-sctp-02 (work in progress), Adaptation", draft-ietf-rddp-sctp-02 (work in progress),
August 2005. August 2005.
[9] Culley, P., "Marker PDU Aligned Framing for TCP Specification", [9] Culley, P., "Marker PDU Aligned Framing for TCP Specification",
draft-ietf-rddp-mpa-02 (work in progress), February 2005. draft-ietf-rddp-mpa-02 (work in progress), February 2005.
10.2. Informative References
[10] Ko, M., "iSCSI Extensions for RDMA Specification",
October 2005.
[11] Callaghan, B. and T. Talpey, "NFS Direct Data Placement",
draft-ietf-nfsv4-nfsdirect-02 (work in progress), October 2005.
Authors' Addresses Authors' Addresses
Caitlin Bestler Caitlin Bestler
Broadcom Broadcom
49 Discovery 49 Discovery
Irvine, CA 92618 Irvine, CA 92618
USA USA
Phone: 949-926-6383 Phone: 949-926-6383
Email: caitlinb@broadcom.com Email: caitlinb@broadcom.com
 End of changes. 25 change blocks. 
67 lines changed or deleted 129 lines changed or added
