draft-ietf-nfsv4-rpcrdma-cm-pvt-data-02.txt   draft-ietf-nfsv4-rpcrdma-cm-pvt-data-03.txt 
Network File System Version 4 C. Lever Network File System Version 4 C. Lever
Internet-Draft Oracle Internet-Draft Oracle
Intended status: Informational May 5, 2019 Intended status: Informational June 13, 2019
Expires: November 6, 2019 Expires: December 15, 2019
RDMA Connection Manager Private Data For RPC-Over-RDMA Version 1 RDMA Connection Manager Private Data For RPC-Over-RDMA Version 1
draft-ietf-nfsv4-rpcrdma-cm-pvt-data-02 draft-ietf-nfsv4-rpcrdma-cm-pvt-data-03
Abstract Abstract
This document specifies the format of RDMA-CM Private Data exchanged This document specifies the format of RDMA-CM Private Data exchanged
between RPC-over-RDMA version 1 peers as part of establishing a between RPC-over-RDMA version 1 peers as part of establishing a
connection. Such private data is used to indicate peer support for connection. Such private data is used to indicate peer support for
remote invalidation and larger-than-default inline thresholds. remote invalidation and larger-than-default inline thresholds. The
addition of the private data payload specified in this document is an
OPTIONAL extension. The RPC-over-RDMA version 1 protocol does not
require the payload to be present.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 6, 2019. This Internet-Draft will expire on December 15, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 12 skipping to change at page 3, line 15
relieve these constraints for existing RPC-over-RDMA version 1 relieve these constraints for existing RPC-over-RDMA version 1
implementations. implementations.
This document specifies a simple, non-XDR-based message format This document specifies a simple, non-XDR-based message format
designed to be passed between RPC-over-RDMA version 1 peers at the designed to be passed between RPC-over-RDMA version 1 peers at the
time each RDMA transport connection is first established. The time each RDMA transport connection is first established. The
purpose of such a message exchange is to enable the connecting peers purpose of such a message exchange is to enable the connecting peers
to indicate support for transport properties that are not defined in to indicate support for transport properties that are not defined in
the base RPC-over-RDMA version 1 protocol defined in [RFC8166]. the base RPC-over-RDMA version 1 protocol defined in [RFC8166].
The message format can be extended as needed. In addition, The message format is intended to be further extensible within the
interoperation between implementations of RPC-over-RDMA version 1 normal scope of such IETF work (see Section 5 for further details).
that present this message format to peers and those that do not Section 6 of the current document defines an IANA registry for this
recognize this message format is guaranteed. purpose. In addition, interoperation between implementations of RPC-
over-RDMA version 1 that present this message format to peers and
those that do not recognize this message format is guaranteed.
2. Requirements Language 2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
3. Advertised Transport Properties 3. Advertised Transport Properties
skipping to change at page 4, line 11 skipping to change at page 4, line 15
LOOKUP and GETATTR. The use of RPCSEC_GSS security also increases LOOKUP and GETATTR. The use of RPCSEC_GSS security also increases
the average size of RPC messages, due to the larger size of the average size of RPC messages, due to the larger size of
RPCSEC_GSS credential material included in RPC headers [RFC7861]. RPCSEC_GSS credential material included in RPC headers [RFC7861].
If a sender and receiver could somehow agree on larger inline If a sender and receiver could somehow agree on larger inline
thresholds, frequently-used RPC transactions avoid the cost of thresholds, frequently-used RPC transactions avoid the cost of
explicit RDMA operations. explicit RDMA operations.
3.2. Remote Invalidation 3.2. Remote Invalidation
After an RDMA data transfer operation completes, an RDMA peer can use After an RDMA data transfer operation completes, an RDMA consumer can
remote invalidation to request that the remote peer RNIC invalidate use remote invalidation to request that the remote peer RNIC
an STag associated with the data transfer [RFC5042]. invalidate an STag associated with the data transfer [RFC5042].
An RDMA consumer requests remote invalidation by posting an RDMA Send An RDMA consumer requests remote invalidation by posting an RDMA Send
With Invalidate Work Request in place of an RDMA Send Work Request. With Invalidate Work Request in place of an RDMA Send Work Request.
Each RDMA Send With Invalidate carries one STag to invalidate. The Each RDMA Send With Invalidate carries one STag to invalidate. The
receiver of an RDMA Send With Invalidate performs the requested receiver of an RDMA Send With Invalidate performs the requested
invalidation and then reports that invalidation as part of the invalidation and then reports that invalidation as part of the
completion of a waiting Receive Work Request. completion of a waiting Receive Work Request.
An RPC-over-RDMA responder can use remote invalidation when replying If both peers support remote invalidation, an RPC-over-RDMA responder
to an RPC request that provided Read or Write chunks. The requester might use remote invalidation when replying to an RPC request that
thus avoids dispatching an extra Work Request, the resulting context provided chunks. Because one of the chunks has already been
switch, and the invalidation completion interrupt as part of invalidated, finalizing the results of the RPC is made simpler and
completing an RPC transaction that uses chunks. The upshot is faster faster.
completion of RPC transactions that involve RDMA data transfer.
There are some important caveats which contraindicate the blanket use However, there are some important caveats which contraindicate the
of remote invalidation: blanket use of remote invalidation:
o Remote invalidation is not supported by all RNICs. o Remote invalidation is not supported by all RNICs.
o Not all RPC-over-RDMA responder implementations can generate RDMA o Not all RPC-over-RDMA responder implementations can generate RDMA
Send With Invalidate Work Requests. Send With Invalidate Work Requests.
o Not all RPC-over-RDMA requester implementations can recognize when o Not all RPC-over-RDMA requester implementations can recognize when
remote invalidation has occurred. remote invalidation has occurred.
o On one connection in different RPC-over-RDMA transactions, or in a o On one connection in different RPC-over-RDMA transactions, or in a
skipping to change at page 5, line 5 skipping to change at page 5, line 8
some that must not be. No indication is provided at the RDMA some that must not be. No indication is provided at the RDMA
layer as to which is which. layer as to which is which.
A responder therefore must not employ remote invalidation unless it A responder therefore must not employ remote invalidation unless it
is aware of support for it in its own RDMA stack, and on the is aware of support for it in its own RDMA stack, and on the
requester. And, without altering the XDR structure of RPC-over-RDMA requester. And, without altering the XDR structure of RPC-over-RDMA
version 1 messages, it is not possible to support remote invalidation version 1 messages, it is not possible to support remote invalidation
with requesters that mix STags that may and must not be invalidated with requesters that mix STags that may and must not be invalidated
remotely in a single RPC or on the same connection. remotely in a single RPC or on the same connection.
However, it is possible to provide a simple signaling mechanism for a There are some NFS/RDMA client implementations whose STags are always
requester to indicate it can deal with remote invalidation of any safe to invalidate remotely. For such clients, indicating to the
STag it has presented to a responder. There are some NFS/RDMA client responder that remote invalidation is always safe can allow such
implementations that can successfully make use of such a signaling invalidation without the need for additional protocol to be defined.
mechanism.
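
The responder-side rule described in Section 3.2 can be sketched in C as follows. This is a minimal illustration, not text from either draft revision: the structure and function names are hypothetical, and the sketch assumes the responder has already recorded, at connect time, whether its own RDMA stack can post Send With Invalidate and whether the requester indicated that any STag it presents is safe to invalidate remotely.

```c
#include <stdbool.h>

/* Hypothetical per-connection record; the field names are illustrative
 * assumptions and do not appear in the draft. */
struct rpcrdma_conn_props {
    /* Local RDMA stack can post RDMA Send With Invalidate. */
    bool local_can_send_with_inv;
    /* Requester signaled that any STag it presents may be
     * invalidated remotely. */
    bool peer_accepts_remote_inv;
};

/* A responder must not employ remote invalidation unless both its own
 * stack and the requester support it. */
bool responder_may_use_remote_inv(const struct rpcrdma_conn_props *p)
{
    return p->local_can_send_with_inv && p->peer_accepts_remote_inv;
}
```

A responder for which this check fails would reply with an ordinary RDMA Send, leaving invalidation of the STags to the requester.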
4. Private Data Message Format 4. Private Data Message Format
With an InfiniBand lower layer, for example, RDMA connection setup With an InfiniBand lower layer, for example, RDMA connection setup
uses a Connection Manager when establishing a Reliable Connection uses a Connection Manager when establishing a Reliable Connection
[IBARCH]. When an RPC-over-RDMA version 1 transport connection is [IBARCH]. When an RPC-over-RDMA version 1 transport connection is
established, the client (which actively establishes connections) and established, the client (which actively establishes connections) and
the server (which passively accepts connections) populate the CM the server (which passively accepts connections) populate the CM
Private Data field exchanged as part of CM connection establishment. Private Data field exchanged as part of CM connection establishment.
The transport properties exchanged via this mechanism are fixed for The transport properties exchanged via this mechanism are fixed for
the life of the connection. Each new connection presents an the life of the connection. Each new connection presents an
opportunity for a fresh exchange. opportunity for a fresh exchange. An implementation of the extension
described in this document MUST be prepared for the settings to
change upon a reconnection.
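
One way an implementation might honor the per-connection scope of these settings is to rebuild its view of the peer on every connect instead of carrying values over. The following C sketch is illustrative only: the type and function names are hypothetical, the advertised values are assumed to have been decoded from the Private Data elsewhere, and the fallback threshold of 1024 octets is the RPC-over-RDMA version 1 default given in [RFC8166].

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Default inline threshold for RPC-over-RDMA version 1 [RFC8166]. */
#define DEFAULT_INLINE_THRESHOLD 1024u

/* Hypothetical per-connection view of the peer's properties. */
struct conn_props {
    bool     remote_inv_ok;     /* peer accepts remote invalidation */
    uint32_t inline_threshold;  /* octets */
};

/* Called each time a connection is established. "advertised" is NULL
 * when the peer supplied no recognizable private data. Nothing from a
 * previous connection is retained. */
void conn_props_on_connect(struct conn_props *props,
                           const struct conn_props *advertised)
{
    if (advertised == NULL) {
        /* Peer did not use the extension: base protocol behavior. */
        props->remote_inv_ok = false;
        props->inline_threshold = DEFAULT_INLINE_THRESHOLD;
        return;
    }
    *props = *advertised;
}
```

Because the record is overwritten unconditionally, a peer that advertised remote invalidation on one connection and omits the payload on the next is treated as a base-protocol peer after the reconnect.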
For RPC-over-RDMA version 1, the CM Private Data field is formatted For RPC-over-RDMA version 1, the CM Private Data field is formatted
as described in the following subsection. RPC clients and servers as described in the following subsection. RPC clients and servers
use the same format. If the capacity of the Private Data field is use the same format. If the capacity of the Private Data field is
too small to contain this message format, the underlying RDMA too small to contain this message format, the underlying RDMA
transport is not managed by a Connection Manager, or the underlying transport is not managed by a Connection Manager, or the underlying
RDMA transport uses Private Data for its own purposes, the CM Private RDMA transport uses Private Data for its own purposes, the CM Private
Data field cannot be used on behalf of RPC-over-RDMA version 1. Data field cannot be used on behalf of RPC-over-RDMA version 1.
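
The three conditions in the preceding paragraph can be expressed as a single predicate. The C sketch below is a hedged illustration: the function and parameter names are hypothetical, and the message length is passed in by the caller because the full size of the message format is not restated here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the CM Private Data field can carry the
 * RPC-over-RDMA version 1 message. All names are illustrative. */
bool can_use_cm_private_data(size_t field_capacity,   /* octets available */
                             size_t message_len,      /* octets required */
                             bool managed_by_cm,      /* transport uses a Connection Manager */
                             bool transport_uses_pd)  /* transport claims Private Data itself */
{
    if (!managed_by_cm)
        return false;
    if (transport_uses_pd)
        return false;
    return field_capacity >= message_len;
}
```

When the predicate is false, an implementation would simply omit the payload and operate with base-protocol defaults, which is consistent with the extension being OPTIONAL.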
The first 8 octets of the CM Private Data field is to be formatted as The first 8 octets of the CM Private Data field is to be formatted as
skipping to change at page 9, line 7 skipping to change at page 9, line 7
Table 1: RDMA-CM Private Data Identifier Registry Table 1: RDMA-CM Private Data Identifier Registry
The Expert Review policy, as defined in Section 4.5 of [RFC8126] is The Expert Review policy, as defined in Section 4.5 of [RFC8126] is
to be used to handle requests to add new entries to the "File to be used to handle requests to add new entries to the "File
Provenance Information Registry". New protocol numbers can be Provenance Information Registry". New protocol numbers can be
assigned at random as long as they do not conflict with existing assigned at random as long as they do not conflict with existing
entries in this registry. entries in this registry.
7. Security Considerations 7. Security Considerations
RDMA-CM Private Data typically traverses the link layer in the clear. The private data extension specified in this document inherits the
A man-in-the-middle attack could alter the settings exchanged at security considerations of the link layer protocols it extends; e.g.,
connect time such that one or both peers might perform operations the MPA protocol, as specified in [RFC5044] and extended in
that result in premature termination of the connection. [RFC6581]. Additional relevant analysis of RDMA security appears in
the Security Considerations section of [RFC5042].
8. References 8. References
8.1. Normative References 8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
skipping to change at page 10, line 10 skipping to change at page 10, line 10
[IBARCH] InfiniBand Trade Association, "InfiniBand Architecture [IBARCH] InfiniBand Trade Association, "InfiniBand Architecture
Specification Volume 1", Release 1.3, March 2015, Specification Volume 1", Release 1.3, March 2015,
<http://www.infinibandta.org/content/ <http://www.infinibandta.org/content/
pages.php?pg=technology_download>. pages.php?pg=technology_download>.
[RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS [RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
Version 3 Protocol Specification", RFC 1813, Version 3 Protocol Specification", RFC 1813,
DOI 10.17487/RFC1813, June 1995, DOI 10.17487/RFC1813, June 1995,
<https://www.rfc-editor.org/info/rfc1813>. <https://www.rfc-editor.org/info/rfc1813>.
[RFC5044] Culley, P., Elzur, U., Recio, R., Bailey, S., and J.
Carrier, "Marker PDU Aligned Framing for TCP
Specification", RFC 5044, DOI 10.17487/RFC5044, October
2007, <https://www.rfc-editor.org/info/rfc5044>.
[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol [RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol
Specification Version 2", RFC 5531, DOI 10.17487/RFC5531, Specification Version 2", RFC 5531, DOI 10.17487/RFC5531,
May 2009, <https://www.rfc-editor.org/info/rfc5531>. May 2009, <https://www.rfc-editor.org/info/rfc5531>.
[RFC5666] Talpey, T. and B. Callaghan, "Remote Direct Memory Access [RFC5666] Talpey, T. and B. Callaghan, "Remote Direct Memory Access
Transport for Remote Procedure Call", RFC 5666, Transport for Remote Procedure Call", RFC 5666,
DOI 10.17487/RFC5666, January 2010, DOI 10.17487/RFC5666, January 2010,
<https://www.rfc-editor.org/info/rfc5666>. <https://www.rfc-editor.org/info/rfc5666>.
[RFC6581] Kanevsky, A., Ed., Bestler, C., Ed., Sharp, R., and S.
Wise, "Enhanced Remote Direct Memory Access (RDMA)
Connection Establishment", RFC 6581, DOI 10.17487/RFC6581,
April 2012, <https://www.rfc-editor.org/info/rfc6581>.
[RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System [RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System
(NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
March 2015, <https://www.rfc-editor.org/info/rfc7530>. March 2015, <https://www.rfc-editor.org/info/rfc7530>.
[RFC7861] Adamson, A. and N. Williams, "Remote Procedure Call (RPC) [RFC7861] Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", RFC 7861, DOI 10.17487/RFC7861, Security Version 3", RFC 7861, DOI 10.17487/RFC7861,
November 2016, <https://www.rfc-editor.org/info/rfc7861>. November 2016, <https://www.rfc-editor.org/info/rfc7861>.
Acknowledgments Acknowledgments
 End of changes. 13 change blocks. 
30 lines changed or deleted 46 lines changed or added
