Network File System Version 4                              C. Lever, Ed.
Internet-Draft                                                    Oracle
Obsoletes: 5667 (if approved)                              June 30, 2016
Intended status: Standards Track
Expires: January 1, 2017

     Network File System (NFS) Upper Layer Binding To RPC-Over-RDMA
                     draft-ietf-nfsv4-rfc5667bis-01

Abstract

   This document specifies Upper Layer Bindings of Network File System
   (NFS) protocol versions to RPC-over-RDMA transports.  Such Upper
   Layer Bindings are required to enable RPC-based protocols to use
   direct data placement when conveying large data payloads on
   RPC-over-RDMA transports.  This document obsoletes RFC 5667.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 1, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Changes Since RFC 5667
     1.3.  Planned Changes To This Document
   2.  Conveying NFS Operations On RPC-Over-RDMA Transports
     2.1.  Use Of The Read List
     2.2.  Use Of The Write List
     2.3.  Construction Of Individual Chunks
     2.4.  Use Of Long Calls And Replies
   3.  NFS Versions 2 And 3 Upper Layer Binding
   4.  NFS Version 4 Upper Layer Binding
     4.1.  NFS Version 4 COMPOUND Considerations
     4.2.  NFS Version 4 Callbacks
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   Remote Direct Memory Access Transport for Remote Procedure Call,
   Version One [I-D.ietf-nfsv4-rfc5666bis] (RPC-over-RDMA) enables the
   use of direct data placement to accelerate the transmission of large
   data payloads associated with RPC transactions.

   Each RPC-over-RDMA transport header can convey lists of memory
   locations involved in direct transfers of data payloads.  These
   memory locations correspond to XDR data items defined in an Upper
   Layer Protocol (such as NFS).

   To facilitate interoperation, RPC client and server implementations
   must agree on what XDR data items in which RPC procedures are
   eligible for direct data placement (DDP).

   This document specifies the set of XDR data items in each version of
   the NFS protocol that are eligible for DDP.  It also contains
   additional material required of Upper Layer Bindings as specified in
   [I-D.ietf-nfsv4-rfc5666bis].  The following versions of the NFS
   protocol are covered:

   o  NFS Version 2 [RFC1094]

   o  NFS Version 3 [RFC1813]

   o  NFS Version 4.0 [RFC7530]

   o  NFS Version 4.1 [RFC5661]

   o  NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2]

   The Upper Layer Binding specified in this document can be extended to
   cover the addition of new DDP-eligible XDR data items defined by
   versions of the NFS version 4 protocol specified after this document
   has been ratified.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.2.  Changes Since RFC 5667

   Corrections and updates made necessary by new language in
   [I-D.ietf-nfsv4-rfc5666bis] have been introduced.  For example,
   references to deprecated features of RPC-over-RDMA Version One, such
   as RDMA_MSGP, and the use of the Read list for handling RPC replies,
   have been removed.  The term "mapping" has been replaced with the
   term "binding" or "Upper Layer Binding" throughout the document.
   Material that duplicates what is in [I-D.ietf-nfsv4-rfc5666bis] has
   been deleted.

   Material required by [I-D.ietf-nfsv4-rfc5666bis] for Upper Layer
   Bindings that was not present in [RFC5667] has been added, including
   discussion of how each NFS version properly estimates the maximum
   size of RPC replies.

   The following changes have been made, relative to [RFC5667]:

   o  Ambiguous or erroneous uses of RFC2119 terms have been corrected.

   o  References to specific data movement mechanisms have been made
      generic or removed.

   o  References to obsolete RFCs have been replaced.

   o  Technical corrections have been made.  For example, the mention of
      12KB and 36KB inline thresholds has been removed.  The reference
      to a nonexistent NFS version 4 SYMLINK operation has been
      replaced with NFS version 4 CREATE(NF4LNK).

   o  An IANA Considerations Section has replaced the "Port Usage
      Considerations" Section.

   o  Code excerpts have been removed, and figures have been modernized.

   o  Language inconsistent with or contradictory to
      [I-D.ietf-nfsv4-rfc5666bis] has been removed from Sections 2 and
      3, and both Sections have been combined into Section 2 in the
      present document.

   o  An explicit discussion of NFSv4.0 and NFSv4.1 backchannel
      operation has replaced the previous treatment of callback
      operations.  No NFSv4.x callback operation is DDP-eligible.

   o  The binding for NFSv4.1 has been completed.  No additional DDP-
      eligible operations exist in NFSv4.1.

   o  A binding for NFSv4.2 has been added that includes discussion of
      new data-bearing operations like READ_PLUS.

1.3.  Planned Changes To This Document

   The following changes are planned, relative to [RFC5667]:

   o  The discussion of NFS version 4 COMPOUND handling will be
      completed.

   o  Remarks about handling DDP-eligibility violations will be
      introduced.

   o  A discussion of how the NFS binding to RPC-over-RDMA is extended
      by standards action will be added.

2.  Conveying NFS Operations On RPC-Over-RDMA Transports

   Definitions of terminology and a general discussion of how
   RPC-over-RDMA is used to convey RPC transactions can be found in
   [I-D.ietf-nfsv4-rfc5666bis].  In this section, these general
   principles are applied to the specifics of the NFS protocol.

2.1.  Use Of The Read List

   The Read list in each RPC-over-RDMA transport header represents a set
   of memory regions containing DDP-eligible NFS argument data.  Large
   data items, such as the data payload of an NFS WRITE request, are
   referenced by the Read list and placed directly into server memory.

   XDR unmarshaling code on the NFS server identifies the correspondence
   between Read chunks and particular NFS arguments via the chunk
   Position value encoded in each Read chunk.
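
   As an illustration only, the lookup an NFS server performs when
   matching a Read chunk to an NFS argument might be sketched in C as
   follows.  The structure layout, the name find_read_chunk, and the
   assumption that segments sharing a Position value are adjacent in
   the list are hypothetical; none of them is defined by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one Read list entry. */
struct read_segment {
    uint32_t position;  /* XDR Position of the data item */
    uint32_t handle;    /* registered memory handle */
    uint32_t length;    /* segment length in bytes */
    uint64_t offset;    /* virtual address or offset */
};

/*
 * Return the first segment of the Read chunk at the given Position and
 * store the number of segments in that chunk in *nsegs.  Returns NULL
 * (and sets *nsegs to 0) when no chunk has that Position.
 */
const struct read_segment *
find_read_chunk(const struct read_segment *list, size_t n,
                uint32_t position, size_t *nsegs)
{
    assert(nsegs != NULL);
    *nsegs = 0;
    for (size_t i = 0; i < n; i++) {
        if (list[i].position != position)
            continue;
        size_t j = i;
        while (j < n && list[j].position == position)
            j++;
        *nsegs = j - i;
        return &list[i];
    }
    return NULL;
}
```

   A server could call find_read_chunk() with the XDR offset at which
   it expects the opaque data of an NFS WRITE, and decode the argument
   inline when NULL is returned.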

2.2.  Use Of The Write List

   The Write list in each RPC-over-RDMA transport header represents a
   set of memory regions that can receive DDP-eligible NFS result data.
   Large data items such as the payload of an NFS READ request are
   referenced by the Write list and placed directly into client memory.

   Each Write chunk corresponds to a specific XDR data item in an NFS
   reply.  This document specifies how NFS client and server
   implementations identify the correspondence between Write chunks and
   each XDR result.

2.3.  Construction Of Individual Chunks

   Each Read chunk is represented as a list of segments at the same XDR
   Position, and each Write chunk is represented as an array of
   segments.  An NFS client thus has the flexibility to advertise a set
   of discontiguous memory regions in which to send or receive a single
   DDP-eligible data item.
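
   A minimal sketch of a Write chunk built from discontiguous segments
   follows.  It assumes each segment is described by a handle, a
   length, and an offset, as in RPC-over-RDMA Version One; the type
   names and the function write_chunk_capacity are hypothetical and
   serve only to show that the capacity of a chunk is the sum of its
   segment lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One memory segment: handle, length, and offset. */
struct rdma_segment {
    uint32_t handle;   /* registered memory handle */
    uint32_t length;   /* segment length in bytes */
    uint64_t offset;   /* virtual address or offset */
};

/* Hypothetical in-memory form of a Write chunk: an array of segments. */
struct write_chunk {
    const struct rdma_segment *segs;
    size_t nsegs;
};

/* Total number of bytes the chunk can receive. */
uint64_t write_chunk_capacity(const struct write_chunk *chunk)
{
    uint64_t total = 0;

    assert(chunk != NULL);
    for (size_t i = 0; i < chunk->nsegs; i++)
        total += chunk->segs[i].length;
    return total;
}
```

   A client that advertises two segments of 4096 and 8192 bytes for one
   Write chunk is thereby offering room for a result of up to 12288
   bytes.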

2.4.  Use Of Long Calls And Replies

   Small RPC messages are conveyed using RDMA Send operations, which are
   of limited size.  If an NFS request is too large to be conveyed via
   an RDMA Send, and there are no DDP-eligible data items that can be
   removed, an NFS client must send the request using a Long Call.  The
   entire NFS request is sent in a special Read chunk.

   If a client expects that an NFS reply will be too large to be
   conveyed via an RDMA Send, it provides a Reply chunk in the
   RPC-over-RDMA transport header conveying the NFS request.  The server
   can place the entire NFS reply in the Reply chunk.

   These are described in more detail in [I-D.ietf-nfsv4-rfc5666bis].
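
   The choice between an ordinary call, a call with DDP-eligible items
   moved into Read chunks, and a Long Call can be sketched as below.
   This is a simplified illustration, not a definitive algorithm: the
   names are hypothetical, the inline threshold is assumed to be known
   to the client, and XDR padding is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum call_form { CALL_INLINE, CALL_REDUCED, CALL_LONG };

/*
 * Decide how to convey a request.
 *   msg_len:    XDR-encoded size of the whole request
 *   ddp_len:    bytes belonging to DDP-eligible data items (0 if none)
 *   inline_max: largest message that fits in one RDMA Send
 */
enum call_form choose_call_form(size_t msg_len, size_t ddp_len,
                                size_t inline_max)
{
    assert(ddp_len <= msg_len);
    if (msg_len <= inline_max)
        return CALL_INLINE;
    if (ddp_len > 0 && msg_len - ddp_len <= inline_max)
        return CALL_REDUCED;  /* move DDP-eligible items to Read chunks */
    return CALL_LONG;         /* whole request goes in a special Read chunk */
}

/* A Reply chunk is needed when the largest possible reply cannot be
 * conveyed via an RDMA Send. */
bool need_reply_chunk(size_t max_reply_len, size_t inline_max)
{
    return max_reply_len > inline_max;
}
```

   With a 1024-byte threshold, a 9000-byte request carrying 8192 bytes
   of DDP-eligible payload can be reduced, while a 9000-byte request
   with no DDP-eligible items requires a Long Call.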

3.  NFS Versions 2 And 3 Upper Layer Binding

   An NFS client MAY send a single Read chunk to supply opaque file data
   for an NFS WRITE procedure, or the pathname for an NFS SYMLINK
   procedure.  For all other NFS procedures, the server MUST ignore Read
   chunks that have a non-zero value in their Position fields, and Read
   chunks beyond the first in the Read list.

   Similarly, an NFS client MAY provide a single Write chunk to receive
   either opaque file data from an NFS READ procedure, or the pathname
   from an NFS READLINK procedure.  The server MUST ignore the Write
   list for any other NFS procedure, and any Write chunks beyond the
   first in the Write list.

   There are no NFS version 2 or 3 procedures that have DDP-eligible
   data items in both their Call and Reply.  However, if an NFS client
   is sending a Long Call or Reply, it MAY provide a combination of Read
   list, Write list, and/or a Reply chunk in the same transaction.

   NFS clients already successfully estimate the maximum reply size of
   each operation in order to provide an adequate set of buffers to
   receive each NFS reply.  An NFS client provides a Reply chunk when
   the maximum possible reply size is larger than the client's responder
   inline threshold.

   How does the server respond if the client has not provided enough
   Write list resources to handle an NFS READ or READLINK reply?  How
   does the server respond if the client has not provided enough Reply
   chunk resources to handle an NFS reply?

4.  NFS Version 4 Upper Layer Binding

   This specification applies to NFS Version 4.0 [RFC7530], NFS Version
   4.1 [RFC5661], and NFS Version 4.2 [I-D.ietf-nfsv4-minorversion2].
   It also applies to the callback protocols associated with each of
   these minor versions.

   An NFS client MAY send a Read chunk to supply opaque file data for a
   WRITE operation or the pathname for a CREATE(NF4LNK) operation in an
   NFS version 4 COMPOUND procedure.  An NFS client MUST NOT send a Read
   chunk that corresponds with any other XDR data item in any other NFS
   version 4 operation.

   Similarly, an NFS client MAY provide a Write chunk to receive either
   opaque file data from a READ operation, NFS4_CONTENT_DATA from a
   READ_PLUS operation, or the pathname from a READLINK operation in an
   NFS version 4 COMPOUND procedure.  An NFS client MUST NOT provide a
   Write chunk that corresponds with any other XDR data item in any
   other NFS version 4 operation.
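
   The eligibility rules above amount to a small lookup table.  The
   sketch below is illustrative only; the enumerators are hypothetical
   labels for the operations named in this section and are not the
   operation numbers defined by the NFS version 4 protocols.

```c
#include <assert.h>

/* Hypothetical labels for the operations named in this section. */
enum nfs4_op_kind {
    OP_READ, OP_READ_PLUS, OP_READLINK, OP_WRITE, OP_CREATE_LNK,
    OP_OTHER, OP_KIND_COUNT
};

/* Which kind of chunk, if any, may carry a data item of an operation. */
enum chunk_use { CHUNK_NONE, CHUNK_READ, CHUNK_WRITE };

static const enum chunk_use ddp_table[OP_KIND_COUNT] = {
    [OP_READ]       = CHUNK_WRITE,  /* opaque file data */
    [OP_READ_PLUS]  = CHUNK_WRITE,  /* NFS4_CONTENT_DATA */
    [OP_READLINK]   = CHUNK_WRITE,  /* pathname */
    [OP_WRITE]      = CHUNK_READ,   /* opaque file data */
    [OP_CREATE_LNK] = CHUNK_READ,   /* CREATE(NF4LNK) pathname */
    [OP_OTHER]      = CHUNK_NONE,
};

enum chunk_use nfs4_ddp_chunk(enum nfs4_op_kind op)
{
    assert((unsigned)op < OP_KIND_COUNT);
    return ddp_table[op];
}
```

   Any operation that maps to CHUNK_NONE has no DDP-eligible data item,
   so a client following this binding would not build a Read or Write
   chunk for it.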

   There is no prohibition against an NFS version 4 COMPOUND procedure
   constructed with both a READ and WRITE operation, say.  Thus it is
   possible for NFS version 4 COMPOUND procedures to use both the Read
   list and Write list simultaneously.  An NFS client MAY provide a Read
   list and a Write list in the same transaction if it is sending a Long
   Call or Reply.

   Some remarks need to be made about how NFS version 4 clients estimate
   reply size, and how DDP-eligibility violations are reported.

4.1.  NFS Version 4 COMPOUND Considerations

   An NFS version 4 COMPOUND procedure supplies arguments for a sequence
   of operations, and returns results from that sequence.  A client MAY
   construct an NFS version 4 COMPOUND procedure that uses more than one
   chunk in either the Read list or Write list.  The NFS client provides
   XDR Position values in each Read chunk to disambiguate which chunk is
   associated with which XDR data item.

   However, NFS server and client implementations must agree in advance
   on how to pair Write chunks with returned result data items.  The
   mechanism specified in [I-D.ietf-nfsv4-rfc5666bis] is applied here:

   o  The first chunk in the Write list MUST be used by the first READ
      or READLINK operation in an NFS version 4 COMPOUND procedure.  The
      next Write chunk is used by the next READ or READLINK, and so on.

   o  If there are more READ or READLINK operations than Write chunks,
      then any remaining operations MUST return their results inline.

   o  If an NFS client presents a Write chunk, then the corresponding
      READ or READLINK operation MUST return its data by placing data
      into that chunk.

   o  If the Write chunk has zero RDMA segments, or if the total size of
      the segments is zero, then the corresponding READ or READLINK
      operation MUST return its result inline.

   The following example shows a Write list with three Write chunks, A,
   B, and C.  The server consumes the provided Write chunks by writing
   the results of the designated operations in the compound request,
   READ and READLINK, back to each chunk.

      Write list:

         A --> B --> C

      NFS version 4 COMPOUND request:

         PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ
                       |                   |                   |
                       v                   v                   v
                       A                   B                   C

   If the client does not want to have the READLINK result returned
   directly, then it provides a zero-length array of segment triplets
   for chunk B or sets the values in the segment triplet for chunk B to
   zeros to indicate that the READLINK result must be returned inline.
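
   The pairing of Write chunks with READ and READLINK results in a
   COMPOUND can be sketched as follows.  This is an illustration under
   stated assumptions rather than a normative algorithm: the operation
   labels, the RESULT_INLINE marker, and the representation of each
   Write chunk by its total segment length are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the operations used in the example. */
enum op_kind { OP_PUTFH, OP_LOOKUP, OP_READ, OP_READLINK };

#define RESULT_INLINE (-1)

/*
 * For each operation, record in assigned[] the index of the Write
 * chunk that receives its result, or RESULT_INLINE.  chunk_len[i] is
 * the total segment length of Write chunk i; zero means the client
 * wants that result returned inline.
 */
void pair_write_chunks(const enum op_kind *ops, size_t nops,
                       const uint64_t *chunk_len, size_t nchunks,
                       int *assigned)
{
    size_t next = 0;

    assert(ops != NULL && assigned != NULL);
    for (size_t i = 0; i < nops; i++) {
        assigned[i] = RESULT_INLINE;
        if (ops[i] != OP_READ && ops[i] != OP_READLINK)
            continue;           /* no Write chunk for this operation */
        if (next >= nchunks)
            continue;           /* more operations than Write chunks */
        if (chunk_len[next] > 0)
            assigned[i] = (int)next;
        next++;                 /* an empty chunk is still consumed */
    }
}
```

   Applied to the COMPOUND shown above with chunk B empty, the first
   READ is paired with chunk A, the READLINK result is returned inline,
   and the final READ is paired with chunk C.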

   Unlike NFS versions 2 and 3, the maximum size of an NFS version 4
   COMPOUND is not bounded.  However, typical NFS version 4 clients
   rarely issue such problematic requests.  In practice, NFS version 4
   clients behave in much more predictable ways.  Rsize and wsize apply
   to COMPOUND operations by capping the total amount of data payload
   allowed in each COMPOUND.  An extension to NFS version 4 supporting a
   more comprehensive exchange of upper-layer message size parameters is
   part of [RFC5661].

4.2.  NFS Version 4 Callbacks

   The NFS version 4 protocols support server-initiated callbacks to
   notify clients of events such as recalled delegations.  There are no
   DDP-eligible data items in callback protocols associated with
   NFSv4.0, NFSv4.1, or NFSv4.2.

   In NFS version 4.1 and 4.2, callback operations may appear on the
   same connection as one used for NFS version 4 client requests.  To
   operate on RPC-over-RDMA transports, NFS version 4 clients and
   servers MUST use the mechanism described in
   [I-D.ietf-nfsv4-rpcrdma-bidirection].

5.  IANA Considerations

   NFS use of direct data placement introduces a need for an additional
   NFS port number assignment for networks that share traditional UDP
   and TCP port spaces with RDMA services.  The iWARP [RFC5041]
   [RFC5040] protocol is such an example (InfiniBand is not).

   NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally
   listen for clients on UDP and TCP port 2049, and additionally, they
   register these with the portmapper and/or rpcbind [RFC1833] service.
   However, [RFC7530] requires NFS servers for version 4 to listen on
   TCP port 2049, and they are not required to register.

   An NFS version 2 or version 3 server supporting RPC-over-RDMA on such
   a network and registering itself with the RPC portmapper MAY choose
   an arbitrary port, or MAY use the alternative well-known port number
   for its RPC-over-RDMA service.  The chosen port MAY be registered
   with the RPC portmapper under the netid assigned by the requirement
   in [I-D.ietf-nfsv4-rfc5666bis].

   An NFS version 4 server supporting RPC-over-RDMA on such a network
   MUST use the alternative well-known port number for its RPC-over-RDMA
   service.  Clients SHOULD connect to this well-known port without
   consulting the RPC portmapper (as for NFSv4/TCP).

   The port number assigned to an NFS service over an RPC-over-RDMA
   transport is available from the IANA port registry [RFC3232].

6.  Security Considerations

   The RDMA transport for RPC [I-D.ietf-nfsv4-rfc5666bis] supports all
   RPC [RFC5531] security models, including RPCSEC_GSS [RFC2203]
   security and transport-level security.  The choice of RDMA Read and
   RDMA Write to convey RPC arguments and results does not affect this,
   since it only changes the method of data transfer.  Specifically, the
   requirements of [I-D.ietf-nfsv4-rfc5666bis] ensure that this choice
   does not introduce new vulnerabilities.

   Because this document defines only the binding of the NFS protocols
   atop [I-D.ietf-nfsv4-rfc5666bis], all relevant security
   considerations are therefore to be described at that layer.

7.  Acknowledgments

   The author gratefully acknowledges the work of Brent Callaghan and
   Tom Talpey on the original NFS Direct Data Placement specification
   [RFC5667].  The author also wishes to thank Bill Baker and Greg
   Marsden for their support of this work.

   Dave Noveck provided excellent review, constructive suggestions, and
   consistent navigational guidance throughout the process of drafting
   this document.

   Special thanks go to nfsv4 Working Group Chair Spencer Shepler and
   nfsv4 Working Group Secretary Thomas Haynes for their support.

8.  References

8.1.  Normative References

   [I-D.ietf-nfsv4-minorversion2]
              Haynes, T., "NFS Version 4 Minor Version 2", draft-ietf-
              nfsv4-minorversion2-41 (work in progress), January 2016.

   [I-D.ietf-nfsv4-rfc5666bis]
              Lever, C., Simpson, W., and T. Talpey, "Remote Direct
              Memory Access Transport for Remote Procedure Call, Version
              One", draft-ietf-nfsv4-rfc5666bis-07 (work in progress),
              May 2016.

   [I-D.ietf-nfsv4-rpcrdma-bidirection]
              Lever, C., "Bi-directional Remote Procedure Call On RPC-
              over-RDMA Transports", draft-ietf-nfsv4-rpcrdma-
              bidirection-05 (work in progress), June 2016.

   [RFC1833]  Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
              RFC 1833, DOI 10.17487/RFC1833, August 1995,
              <http://www.rfc-editor.org/info/rfc1833>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC2203]  Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol
              Specification", RFC 2203, DOI 10.17487/RFC2203, September
              1997, <http://www.rfc-editor.org/info/rfc2203>.

   [RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
              Specification Version 2", RFC 5531, DOI 10.17487/RFC5531,
              May 2009, <http://www.rfc-editor.org/info/rfc5531>.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
              <http://www.rfc-editor.org/info/rfc5661>.

   [RFC7530]  Haynes, T., Ed. and D. Noveck, Ed., "Network File System
              (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
              March 2015, <http://www.rfc-editor.org/info/rfc7530>.

8.2.  Informative References

   [RFC1094]  Nowicki, B., "NFS: Network File System Protocol
              specification", RFC 1094, DOI 10.17487/RFC1094, March
              1989, <http://www.rfc-editor.org/info/rfc1094>.

   [RFC1813]  Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
              Version 3 Protocol Specification", RFC 1813,
              DOI 10.17487/RFC1813, June 1995,
              <http://www.rfc-editor.org/info/rfc1813>.

   [RFC3232]  Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced
              by an On-line Database", RFC 3232, DOI 10.17487/RFC3232,
              January 2002, <http://www.rfc-editor.org/info/rfc3232>.

   [RFC5040]  Recio, R., Metzler, B., Culley, P., Hilland, J., and D.
              Garcia, "A Remote Direct Memory Access Protocol
              Specification", RFC 5040, DOI 10.17487/RFC5040, October
              2007, <http://www.rfc-editor.org/info/rfc5040>.

   [RFC5041]  Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct
              Data Placement over Reliable Transports", RFC 5041,
              DOI 10.17487/RFC5041, October 2007,
              <http://www.rfc-editor.org/info/rfc5041>.

   [RFC5666]  Talpey, T. and B. Callaghan, "Remote Direct Memory Access
              Transport for Remote Procedure Call", RFC 5666, DOI
              10.17487/RFC5666, January 2010,
              <http://www.rfc-editor.org/info/rfc5666>.

   [RFC5667]  Talpey, T. and B. Callaghan, "Network File System (NFS)
              Direct Data Placement", RFC 5667, DOI 10.17487/RFC5667,
              January 2010, <http://www.rfc-editor.org/info/rfc5667>.

Author's Address

   Charles Lever (editor)
   Oracle Corporation
   1015 Granger Avenue
   Ann Arbor, MI  48104
   USA

   Phone: +1 734 274 2396
   Email: chuck.lever@oracle.com