draft-ietf-rddp-arch-06.txt   draft-ietf-rddp-arch-07.txt 
Internet-Draft Stephen Bailey (Sandburst) Internet-Draft Stephen Bailey (Sandburst)
Expires: April 2005 Tom Talpey (NetApp) Expires: August 2005 Tom Talpey (NetApp)
The Architecture of Direct Data Placement (DDP) The Architecture of Direct Data Placement (DDP)
and Remote Direct Memory Access (RDMA) and Remote Direct Memory Access (RDMA)
on Internet Protocols on Internet Protocols
draft-ietf-rddp-arch-06 draft-ietf-rddp-arch-07
Status of this Memo Status of this Memo
By submitting this Internet-Draft, I certify that any applicable By submitting this Internet-Draft, I certify that any applicable
patent or other IPR claims of which I am aware have been disclosed, patent or other IPR claims of which I am aware have been disclosed,
or will be disclosed, and any of which I become aware will be or will be disclosed, and any of which I become aware will be
disclosed, in accordance with RFC 3668. disclosed, in accordance with RFC 3668.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 37 skipping to change at page 1, line 37
progress." progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2004). All Rights Reserved. Copyright (C) The Internet Society (2005). All Rights Reserved.
Abstract Abstract
This document defines an abstract architecture for Direct Data This document defines an abstract architecture for Direct Data
Placement (DDP) and Remote Direct Memory Access (RDMA) protocols to Placement (DDP) and Remote Direct Memory Access (RDMA) protocols to
run on Internet Protocol-suite transports. This architecture does run on Internet Protocol-suite transports. This architecture does
not necessarily reflect the proper way to implement such protocols, not necessarily reflect the proper way to implement such protocols,
but is, rather, a descriptive tool for defining and understanding but is, rather, a descriptive tool for defining and understanding
the protocols. DDP allows the efficient placement of data into the protocols. DDP allows the efficient placement of data into
buffers designated by Upper Layer Protocols (e.g. RDMA). RDMA buffers designated by Upper Layer Protocols (e.g. RDMA). RDMA
skipping to change at page 2, line 16 skipping to change at page 2, line 16
1. Introduction . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . 2 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . 2
1.2. DDP and RDMA Protocols . . . . . . . . . . . . . . . . . 3 1.2. DDP and RDMA Protocols . . . . . . . . . . . . . . . . . 3
2. Architecture . . . . . . . . . . . . . . . . . . . . . . 4 2. Architecture . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Direct Data Placement (DDP) Protocol Architecture . . . 4 2.1. Direct Data Placement (DDP) Protocol Architecture . . . 4
2.1.1. Transport Operations . . . . . . . . . . . . . . . . . . 6 2.1.1. Transport Operations . . . . . . . . . . . . . . . . . . 6
2.1.2. DDP Operations . . . . . . . . . . . . . . . . . . . . . 7 2.1.2. DDP Operations . . . . . . . . . . . . . . . . . . . . . 7
2.1.3. Transport Characteristics in DDP . . . . . . . . . . . . 10 2.1.3. Transport Characteristics in DDP . . . . . . . . . . . . 10
2.2. Remote Direct Memory Access Protocol Architecture . . . 12 2.2. Remote Direct Memory Access Protocol Architecture . . . 12
2.2.1. RDMA Operations . . . . . . . . . . . . . . . . . . . . 13 2.2.1. RDMA Operations . . . . . . . . . . . . . . . . . . . . 14
2.2.2. Transport Characteristics in RDMA . . . . . . . . . . . 16 2.2.2. Transport Characteristics in RDMA . . . . . . . . . . . 16
3. Security Considerations . . . . . . . . . . . . . . . . 17 3. Security Considerations . . . . . . . . . . . . . . . . 17
3.1. Security Services . . . . . . . . . . . . . . . . . . . 18
3.2. Error Considerations . . . . . . . . . . . . . . . . . . 19
4. IANA Considerations . . . . . . . . . . . . . . . . . . 19 4. IANA Considerations . . . . . . . . . . . . . . . . . . 19
5. Acknowledgements . . . . . . . . . . . . . . . . . . . . 19 5. Acknowledgements . . . . . . . . . . . . . . . . . . . . 20
Informative References . . . . . . . . . . . . . . . . . 19 Informative References . . . . . . . . . . . . . . . . . 20
Authors' Addresses . . . . . . . . . . . . . . . . . . . 20 Authors' Addresses . . . . . . . . . . . . . . . . . . . 21
Full Copyright Statement . . . . . . . . . . . . . . . . 20 Full Copyright Statement . . . . . . . . . . . . . . . . 21
1. Introduction 1. Introduction
This document defines an abstract architecture for Direct Data This document defines an abstract architecture for Direct Data
Placement (DDP) and Remote Direct Memory Access (RDMA) protocols to Placement (DDP) and Remote Direct Memory Access (RDMA) protocols to
run on Internet Protocol-suite transports. This architecture does run on Internet Protocol-suite transports. This architecture does
not necessarily reflect the proper way to implement such protocols, not necessarily reflect the proper way to implement such protocols,
but is, rather, a descriptive tool for defining and understanding but is, rather, a descriptive tool for defining and understanding
the protocols. This document uses C language notation as a the protocols. This document uses C language notation as a
shorthand to describe the architectural elements of DDP and RDMA shorthand to describe the architectural elements of DDP and RDMA
skipping to change at page 3, line 20 skipping to change at page 3, line 24
o Completion - informing any Upper Layer or application that a o Completion - informing any Upper Layer or application that a
particular operation has finished. A completion, for particular operation has finished. A completion, for
instance, may require the delivery of several messages, or it instance, may require the delivery of several messages, or it
may also reflect that some local processing has finished. may also reflect that some local processing has finished.
o Data Sink - the peer on which any placement occurs. o Data Sink - the peer on which any placement occurs.
o Data Source - the peer from which the placed data originates. o Data Source - the peer from which the placed data originates.
o Steering Tag - a "handle" used to identify memory which is the o Steering Tag - a "handle" used to identify the buffer which is
target of placement. A "tagged" message is one which the target of placement. A "tagged" message is one which
references such a handle. references such a handle.
o RDMA Write - an Operation which places data from a local data o RDMA Write - an Operation which places data from a local data
buffer to a remote data buffer specified by a Steering Tag. buffer to a remote data buffer specified by a Steering Tag.
o RDMA Read - an Operation which places data to a local data o RDMA Read - an Operation which places data to a local data
buffer specified by a Steering Tag from a remote data buffer buffer specified by a Steering Tag from a remote data buffer
specified by another Steering Tag. specified by another Steering Tag.
o Send - an Operation which places data from a local data buffer o Send - an Operation which places data from a local data buffer
skipping to change at page 11, line 42 skipping to change at page 11, line 42
o tagged message reception indications. o tagged message reception indications.
These relationships depend upon the characteristics of the These relationships depend upon the characteristics of the
underlying transport in a way which is defined by the DDP protocol. underlying transport in a way which is defined by the DDP protocol.
For example, if the transport is unreliable and unordered, the DDP For example, if the transport is unreliable and unordered, the DDP
protocol might specify that the client protocol is subject to the protocol might specify that the client protocol is subject to the
consequences of transport messages being lost or duplicated, rather consequences of transport messages being lost or duplicated, rather
than requiring different characteristics be presented to the client than requiring different characteristics be presented to the client
protocol. protocol.
Multidestination data delivery is the other transport Buffer access must be implemented consistently across endpoint IP
characteristic which may require specific consideration in a DDP addresses on transports allowing multiple IP addresses per
protocol. As mentioned above, the basic DDP model assumes that endpoint, for example, SCTP. In particular, the Steering Tag must
buffer address values returned by ddp_register() are opaque to the be consistently scoped and must address the same buffer across all
client protocol, and can be implementation dependent. The most IP address associations belonging to the endpoint. Additionally,
natural way to map DDP to a multidestination transport is to operation ordering relationships across IP addresses within an
require all receivers produce the same buffer address when association (set(), get(), etc.) depend on the underlying
registering a multidestination destination buffer. Restriction of transport. If the above consistency relationships cannot be
the DDP model to accommodate multiple destinations involves maintained by a transport endpoint, then the endpoint is unsuitable
engineering tradeoffs comparable to those of providing non-DDP for a DDP connection.
multidestination transport capability.
Multidestination data delivery is a transport characteristic which
may require specific consideration in a DDP protocol. As mentioned
above, the basic DDP model assumes that buffer address values
returned by ddp_register() are opaque to the client protocol, and
can be implementation dependent. The most natural way to map DDP
to a multidestination transport is to require all receivers produce
the same buffer address when registering a multidestination
destination buffer. Restriction of the DDP model to accommodate
multiple destinations involves engineering tradeoffs comparable to
those of providing non-DDP multidestination transport capability.
A registered buffer is identified within DDP by its stag_t, which A registered buffer is identified within DDP by its stag_t, which
in turn is associated with a socket. This registration therefore in turn is associated with a socket. This registration therefore
grants a capability to the DDP peer, and the socket (using the grants a capability to the DDP peer, and the socket (using the
underlying properties of its chosen transport and possible underlying properties of its chosen transport and possible
security) identifies the peer and authenticates the stag_t. security) identifies the peer and authenticates the stag_t.
The same buffer may be enabled by ddp_post_recv() on multiple The same buffer may be enabled by ddp_post_recv() on multiple
sockets. In this case any ddp_recv() untagged message reception sockets. In this case any ddp_recv() untagged message reception
indication may be provided on a different socket from that on which indication may be provided on a different socket from that on which
skipping to change at page 17, line 10 skipping to change at page 17, line 21
These difficulties can be overcome by placing restrictions on the These difficulties can be overcome by placing restrictions on the
service provided by RDMA. However, many RDMA clients, especially service provided by RDMA. However, many RDMA clients, especially
those that separate data transfer and application logic concerns, those that separate data transfer and application logic concerns,
are likely to depend upon capabilities only provided by RDMA on a are likely to depend upon capabilities only provided by RDMA on a
point-to-point, reliable transport. In other words, many potential point-to-point, reliable transport. In other words, many potential
Upper Layers which might avail themselves of RDMA services are Upper Layers which might avail themselves of RDMA services are
naturally already biased toward these transport classes. naturally already biased toward these transport classes.
3. Security Considerations 3. Security Considerations
Fundamentally, the DDP and RDMA protocols should not introduce Fundamentally, the DDP and RDMA protocols themselves should not
additional vulnerabilities. They are intermediate protocols and so introduce additional vulnerabilities. They are intermediate
should not perform or require functions such as authorization, protocols and so should not perform or require functions such as
which are the domain of Upper Layers. However, the DDP and RDMA authorization, which are the domain of Upper Layers. However, the
protocols should allow mapping by strict Upper Layers which are not DDP and RDMA protocols should allow mapping by strict Upper Layers
permissive of new vulnerabilities -- DDP and RDMAP implementations which are not permissive of new vulnerabilities -- DDP and RDMAP
should be prohibited from `cutting corners' that create new implementations should be prohibited from `cutting corners' that
vulnerabilities. Implementations must ensure that only `supplied' create new vulnerabilities. Implementations must ensure that only
resources (i.e. buffers) can be manipulated by DDP or RDMAP `supplied' resources (i.e. buffers) can be manipulated by DDP or
messages. RDMAP messages.
System integrity must be maintained in any RDMA solution. System integrity must be maintained in any RDMA solution.
Mechanisms must be specified to prevent RDMA or DDP operations from Mechanisms must be specified to prevent RDMA or DDP operations from
impairing system integrity. For example, threats can include impairing system integrity. For example, threats can include
potential buffer reuse or buffer overflow, and are not merely a potential buffer reuse or buffer overflow, and are not merely a
security issue. Even trusted peers must not be allowed to damage security issue. Even trusted peers must not be allowed to damage
local integrity. Any DDP and RDMA protocol must address the issue local integrity. Any DDP and RDMA protocol must address the issue
of giving end-systems and applications the capabilities to offer of giving end-systems and applications the capabilities to offer
protection from such compromises. protection from such compromises.
Because a Steering Tag exports access to a memory region, one Because a Steering Tag exports access to a buffer, one critical
critical aspect of security is the scope of this access. It must aspect of security is the scope of this access. It must be
be possible to individually control specific attributes of the possible to individually control specific attributes of the access
access provided by a Steering Tag on the endpoint (socket) on which provided by a Steering Tag on the endpoint (socket) on which it was
it was registered, including remote read access, remote write registered, including remote read access, remote write access, and
access, and others that might be identified. DDP and RDMA others that might be identified. DDP and RDMA specifications must
specifications must provide both implementation requirements provide both implementation requirements relevant to this issue,
relevant to this issue, and guidelines to assist implementors in and guidelines to assist implementors in making the appropriate
making the appropriate design decisions. design decisions.
For example, it must not be possible for DDP to enable evasion of For example, it must not be possible for DDP to enable evasion of
memory consistency checks at the recipient. The DDP and RDMA buffer consistency checks at the recipient. The DDP and RDMA
specifications must allow the recipient to rely on its consistent specifications must allow the recipient to rely on its consistent
memory contents by explicitly controlling peer access to memory buffer contents by explicitly controlling peer access to buffer
regions at appropriate times. regions at appropriate times.
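The access scoping described above can be sketched in the draft's C
notation. This is only an illustration under stated assumptions: the
struct stag_entry record, the STAG_REMOTE_* flags and the function
stag_access_ok() are hypothetical names and are not defined by the
draft. The sketch checks that a tagged access arrives on the endpoint
(socket) where the buffer was registered, requests only the rights
that were granted, and stays within the registered buffer.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical access attributes; names are illustrative only. */
enum { STAG_REMOTE_READ = 1u << 0, STAG_REMOTE_WRITE = 1u << 1 };

/* Hypothetical registration record; not the draft's stag_t definition. */
struct stag_entry {
    int      socket_id;  /* endpoint (socket) the buffer was registered on */
    unsigned rights;     /* bitwise OR of STAG_REMOTE_* flags */
    size_t   length;     /* size of the registered buffer in bytes */
    bool     valid;      /* cleared when the application revokes access */
};

/* Return true only if the access arrives on the registering socket,
 * requests only granted rights, and lies inside the buffer. */
static bool stag_access_ok(const struct stag_entry *e, int socket_id,
                           unsigned wanted, size_t offset, size_t len)
{
    if (!e->valid || e->socket_id != socket_id)
        return false;
    if ((e->rights & wanted) != wanted)
        return false;
    /* Overflow-safe bounds check: offset + len must not exceed length. */
    if (offset > e->length || len > e->length - offset)
        return false;
    return true;
}
```

Clearing the valid flag models the recipient withdrawing peer access
at a time of its choosing; a real implementation would also need to
handle accesses already in progress, which this sketch does not.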
Peer connections which do not pass authentication and authorization
checks by upper layers must not be permitted to begin processing in
RDMA mode with an inappropriate endpoint. Once associated, peer
accesses to memory regions must be authenticated and made subject
to authorization checks in the context of the association and
endpoint (socket) on which they are to be performed, prior to any
transfer operation or data being accessed. The RDMA protocols must
ensure that these region protections be under strict application
control.
The use of DDP and RDMA on a transport connection may interact with The use of DDP and RDMA on a transport connection may interact with
any security mechanism, and vice-versa. For example, if the any security mechanism, and vice-versa. For example, if the
security mechanism is implemented above the transport layer, the security mechanism is implemented above the transport layer, the
DDP and RDMA headers may not be protected. Such a layering may DDP and RDMA headers may not be protected. Such a layering may
therefore be inappropriate, depending on requirements. therefore be inappropriate, depending on requirements.
3.1. Security Services
The following end-to-end security services protect DDP and RDMAP
operation streams:
o Authentication of the data source, to protect against peer
impersonation, stream hijacking, and man-in-the-middle attacks
exploiting capabilities offered by the RDMA implementation.
Peer connections which do not pass authentication and
authorization checks must not be permitted to begin processing
in RDMA mode with an inappropriate endpoint. Once associated,
peer accesses to buffer regions must be authenticated and made
subject to authorization checks in the context of the
association and endpoint (socket) on which they are to be
performed, prior to any transfer operation or data being
accessed. The RDMA protocols must ensure that these region
protections be under strict application control.
o Integrity, to protect against modification of the control
content and buffer content.
While integrity is of concern to any transport, it is
important for the DDP and RDMAP protocols that the RDMA
control information carried in each operation be protected, in
order to direct the payloads appropriately.
o Sequencing, to protect against replay attacks (a special case
of the above modifications).
o Confidentiality, to protect the stream from eavesdropping.
IPsec, operating to secure the connection on a packet-by-packet IPsec, operating to secure the connection on a packet-by-packet
basis, seems to be a natural fit to securing RDMA placement, which basis, is a natural fit to securing RDMA placement, which operates
operates in conjunction with transport. Because RDMA enables an in conjunction with transport. Because RDMA enables an
implementation to avoid buffering, it is preferable to perform all implementation to avoid buffering, it is preferable to perform all
applicable security protection prior to processing of each segment applicable security protection prior to processing of each segment
by the transport and RDMA layers. Such a layering enables the most by the transport and RDMA layers. Such a layering enables the most
efficient secure RDMA implementation. efficient secure RDMA implementation.
The TLS record protocol, on the other hand, is layered on top of The TLS record protocol, on the other hand, is layered on top of
reliable transports and cannot provide such security assurance reliable transports and cannot provide such security assurance
until an entire record is available, which may require the until an entire record is available, which may require the
buffering and/or assembly of several distinct messages prior to TLS buffering and/or assembly of several distinct messages prior to TLS
processing. This defers RDMA processing and introduces overheads processing. This defers RDMA processing and introduces overheads
that RDMA is designed to avoid. TLS therefore is viewed as that RDMA is designed to avoid. In addition, TLS length
potentially a less natural fit for protecting the RDMA protocols. restrictions on records themselves impose additional buffering and
processing, for long operations which must span multiple records.
TLS therefore is viewed as potentially a less natural fit for
protecting the RDMA protocols.
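The effect of the TLS record length restriction can be illustrated
with a little arithmetic. The sketch below assumes the TLS limit of
2^14 bytes of plaintext per record and ignores any DDP or RDMA header
overhead. The function name tls_records_needed() is hypothetical and
is not part of the draft.

```c
#include <stddef.h>

/* TLS limits each record's plaintext fragment to 2^14 bytes. */
#define TLS_MAX_FRAGMENT ((size_t)16384)

/* Number of TLS records needed to carry an operation of op_len bytes.
 * Header overhead is deliberately omitted from this sketch. */
static size_t tls_records_needed(size_t op_len)
{
    if (op_len == 0)
        return 0;
    /* Ceiling division written to avoid overflow for large op_len. */
    return (op_len - 1) / TLS_MAX_FRAGMENT + 1;
}
```

Under these assumptions a single 1 MiB operation spans 64 records,
each of which must be fully received and verified before its contents
can be placed.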
Any DDP and RDMAP specification must provide the means to satisfy
the above security service requirements.
IPsec is sufficient to provide the required security services to
the DDP and RDMAP protocols, while enabling efficient
implementations.
3.2. Error Considerations
Resource issues leading to denial-of-service attacks, overwrites Resource issues leading to denial-of-service attacks, overwrites
and other concurrent operations, the ordering of completions as and other concurrent operations, the ordering of completions as
required by the RDMA protocol, and the granularity of transfer are required by the RDMA protocol, and the granularity of transfer are
all within the required scope of any security analysis of RDMA and all within the required scope of any security analysis of RDMA and
DDP. DDP.
The RDMA operations require checking of what is essentially user The RDMA operations require checking of what is essentially user
information, explicitly including addressing information and information, explicitly including addressing information and
operation type (read or write), and implicitly including protection operation type (read or write), and implicitly including protection
skipping to change at page 19, line 39 skipping to change at page 20, line 34
Specification Volumes 1 and 2", Release 1.1, November 2002, Specification Volumes 1 and 2", Release 1.1, November 2002,
available from http://www.infinibandta.org/specs available from http://www.infinibandta.org/specs
[MYR] [MYR]
VMEbus International Trade Association, "Myrinet on VME VMEbus International Trade Association, "Myrinet on VME
Protocol Specification", ANSI/VITA 26-1998, August 1998, Protocol Specification", ANSI/VITA 26-1998, August 1998,
available from http://www.myri.com/open-specs available from http://www.myri.com/open-specs
[ROM] [ROM]
A. Romanow, J. Mogul, T. Talpey and S. Bailey, "RDMA over IP A. Romanow, J. Mogul, T. Talpey and S. Bailey, "RDMA over IP
Problem Statement", draft-ietf-rddp-problem-statement-05, Work Problem Statement", draft-ietf-rddp-problem-statement,
in Progress, October 2004 Internet Draft Work in Progress
[SCTP] [SCTP]
R. Stewart et al., "Stream Control Transmission Protocol", RFC R. Stewart et al., "Stream Control Transmission Protocol", RFC
2960, Standards Track 2960, Standards Track
[SDP] [SDP]
InfiniBand Trade Association, "Sockets Direct Protocol v1.0", InfiniBand Trade Association, "Sockets Direct Protocol v1.0",
Annex A of InfiniBand Architecture Specification Volume 1, Annex A of InfiniBand Architecture Specification Volume 1,
Release 1.1, November 2002, available from Release 1.1, November 2002, available from
http://www.infinibandta.org/specs http://www.infinibandta.org/specs
skipping to change at page 20, line 34 skipping to change at page 21, line 30
Tom Talpey Tom Talpey
Network Appliance Network Appliance
375 Totten Pond Road 375 Totten Pond Road
Waltham, MA 02451 USA Waltham, MA 02451 USA
Phone: +1 781 768 5329 Phone: +1 781 768 5329
Email: thomas.talpey@netapp.com Email: thomas.talpey@netapp.com
Full Copyright Statement Full Copyright Statement
Copyright (C) The Internet Society (2004). This document is Copyright (C) The Internet Society (2005). This document is
subject to the rights, licenses and restrictions contained in BCP subject to the rights, licenses and restrictions contained in BCP
78 and except as set forth therein, the authors retain all their 78 and except as set forth therein, the authors retain all their
rights. rights.
This document and the information contained herein are provided on This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
 End of changes. 
