draft-ietf-nfsv4-nfsdirect-03.txt | draft-ietf-nfsv4-nfsdirect-04.txt | |||
---|---|---|---|---|
Internet-Draft Tom Talpey | Internet-Draft Tom Talpey | |||
Expires: December 2006 Brent Callaghan | Expires: April 2007 Brent Callaghan | |||
Document: draft-ietf-nfsv4-nfsdirect-03 June, 2006 | Document: draft-ietf-nfsv4-nfsdirect-04 October, 2007 | |||
NFS Direct Data Placement | NFS Direct Data Placement | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
skipping to change at page 2, line 9 | skipping to change at page 2, line 9 | |||
movement over the network to be implemented in RDMA hardware. This | movement over the network to be implemented in RDMA hardware. This | |||
draft describes the use of direct data placement by means of server- | draft describes the use of direct data placement by means of server- | |||
initiated RDMA operations into client-supplied buffers in a Chunk | initiated RDMA operations into client-supplied buffers in a Chunk | |||
list for implementations of NFS versions 2, 3, and 4 over an RDMA | list for implementations of NFS versions 2, 3, and 4 over an RDMA | |||
transport. | transport. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Transfers from NFS Client to NFS Server . . . . . . . . . . 2 | 2. Transfers from NFS Client to NFS Server . . . . . . . . . . 2 | |||
3. Transfers from NFS Server to NFS Client . . . . . . . . . . 2 | 3. Transfers from NFS Server to NFS Client . . . . . . . . . . 3 | |||
4. NFS Versions 2 and 3 Mapping . . . . . . . . . . . . . . . . 4 | 4. NFS Versions 2 and 3 Mapping . . . . . . . . . . . . . . . . 4 | |||
5. NFS Version 4 Mapping . . . . . . . . . . . . . . . . . . . 5 | 5. NFS Version 4 Mapping . . . . . . . . . . . . . . . . . . . 5 | |||
6. Security . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 6. Security . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . 7 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . 7 | |||
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | |||
9. Normative References . . . . . . . . . . . . . . . . . . . . 8 | 9. Normative References . . . . . . . . . . . . . . . . . . . . 8 | |||
10. Informative References . . . . . . . . . . . . . . . . . . 8 | 10. Informative References . . . . . . . . . . . . . . . . . . 9 | |||
11. Authors' Addresses . . . . . . . . . . . . . . . . . . . . 9 | 11. Authors' Addresses . . . . . . . . . . . . . . . . . . . . 9 | |||
12. Intellectual Property and Copyright Statements . . . . . . 9 | 12. Intellectual Property and Copyright Statements . . . . . 10 | |||
Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 10 | Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 10 | |||
Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
document are to be interpreted as described in [RFC2119]. | ||||
1. Introduction | 1. Introduction | |||
The RDMA Transport for ONC RPC [RPCRDMA] allows an RPC client | The RDMA Transport for ONC RPC [RPCRDMA] allows an RPC client | |||
application to post buffers in a Chunk list for specific arguments | application to post buffers in a Chunk list for specific arguments | |||
and results from an RPC call. The RDMA transport header conveys this | and results from an RPC call. The RDMA transport header conveys this | |||
list of client buffer addresses to the server where the application | list of client buffer addresses to the server where the application | |||
can associate them with client data and use RDMA operations to | can associate them with client data and use RDMA operations to | |||
transfer the results directly to and from the posted buffers on the | transfer the results directly to and from the posted buffers on the | |||
client. The client and server must agree on a consistent mapping of | client. The client and server must agree on a consistent mapping of | |||
posted buffers to RPC. This document details the mapping for each | posted buffers to RPC. This document details the mapping for each | |||
version of the NFS protocol [RFC1831] [RFC1832] [RFC1094] [RFC1813] | version of the NFS protocol [RFC1831] [RFC1832] [RFC1094] [RFC1813] | |||
[RFC3530] [NFSv4.1]. | [RFC3530] [NFSv4.1]. | |||
2. Transfers from NFS Client to NFS Server | 2. Transfers from NFS Client to NFS Server | |||
The RDMA Read list, in the RDMA transport header, allows an RPC | The RDMA Read list, in the RDMA transport header, allows an RPC | |||
client to marshal RPC call data selectively. Large chunks of data, | client to marshal RPC call data selectively. Large chunks of data, | |||
such as the file data of an NFS WRITE request, may be referenced by | such as the file data of an NFS WRITE request, MAY be referenced by | |||
an RDMA Read list and be moved efficiently and directly-placed by an | an RDMA Read list and be moved efficiently and directly-placed by an | |||
RDMA READ operation initiated by the server. | RDMA READ operation initiated by the server. | |||
The process of identifying these chunks for the RDMA Read list can be | The process of identifying these chunks for the RDMA Read list can be | |||
implemented entirely within the RPC layer. It is transparent to the | implemented entirely within the RPC layer. It is transparent to the | |||
upper-level protocol, such as NFS. For instance, the file data | upper-level protocol, such as NFS. For instance, the file data | |||
portion of an NFS WRITE request can be selected as an RDMA "chunk" | portion of an NFS WRITE request can be selected as an RDMA "chunk" | |||
within the XDR marshalling code of RPC based on a size criterion, | within the XDR marshaling code of RPC based on a size criterion, | |||
independently of the NFS protocol layer. The XDR unmarshalling on the | independently of the NFS protocol layer. The XDR unmarshaling on the | |||
receiving system can identify the correspondence between Read chunks | receiving system can identify the correspondence between Read chunks | |||
and protocol elements via the XDR position value encoded in the Read | and protocol elements via the XDR position value encoded in the Read | |||
chunk entry. | chunk entry. | |||
RPC RDMA Read chunks are employed by this NFS mapping to convey | RPC RDMA Read chunks are employed by this NFS mapping to convey | |||
specific NFS data to the server in a manner which may be directly | specific NFS data to the server in a manner which may be directly | |||
placed. The following sections describe this mapping for versions of | placed. The following sections describe this mapping for versions of | |||
the NFS protocol. | the NFS protocol. | |||
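
The size-based chunk selection described in Section 2 can be sketched roughly as follows. This is an illustrative sketch only, not the marshaling code of any real RPC implementation: the function names, the `threshold` parameter, and the (position, data) tuple layout are assumptions made for this example. As a simplification, the position recorded here is the offset in the reduced inline stream at which the data was removed; [RPCRDMA] defines the actual XDR position semantics.

```python
import struct

def marshal_opaques(items, threshold):
    """Encode byte strings as XDR variable-length opaques.

    Items of at least `threshold` bytes are left out of the inline stream
    and recorded as hypothetical Read chunks of the form (position, data).
    """
    inline = bytearray()
    read_chunks = []
    for data in items:
        inline += struct.pack(">I", len(data))      # length word stays inline
        if len(data) >= threshold:
            read_chunks.append((len(inline), data)) # position where data was removed
        else:
            pad = (-len(data)) % 4                  # XDR pads opaques to 4 bytes
            inline += data + b"\x00" * pad
    return bytes(inline), read_chunks

def unmarshal_opaques(inline, read_chunks, count):
    """Rebuild the items, matching Read chunks to elements by position."""
    by_position = dict(read_chunks)
    items, offset = [], 0
    for _ in range(count):
        (length,) = struct.unpack_from(">I", inline, offset)
        offset += 4
        if offset in by_position:
            items.append(by_position[offset])       # data arrived via RDMA Read
        else:
            items.append(inline[offset:offset + length])
            offset += length + (-length) % 4
    return items
```

The upper-level protocol never sees the chunk decision here; only the two marshaling helpers do, which mirrors the transparency the text describes.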
3. Transfers from NFS Server to NFS Client | 3. Transfers from NFS Server to NFS Client | |||
skipping to change at page 3, line 42 | skipping to change at page 3, line 47 | |||
struct xdr_write_chunk { | struct xdr_write_chunk { | |||
struct xdr_rdma_segment target<>; | struct xdr_rdma_segment target<>; | |||
}; | }; | |||
struct xdr_write_list { | struct xdr_write_list { | |||
struct xdr_write_chunk entry; | struct xdr_write_chunk entry; | |||
struct xdr_write_list *next; | struct xdr_write_list *next; | |||
}; | }; | |||
The sum of the segment lengths yields the total size of the buffer, | The sum of the segment lengths yields the total size of the buffer, | |||
which must be large enough to accept the result. If the buffer is | which MUST be large enough to accept the result. If the buffer is | |||
too small, the server must return an XDR encode error. The server | too small, the server MUST return an XDR encode error. The server | |||
must return the result data for a posted buffer by progressively | MUST return the result data for a posted buffer by progressively | |||
filling its segments, perhaps leaving some trailing segments unfilled | filling its segments, perhaps leaving some trailing segments unfilled | |||
or partially full if the size of the result is less than the total | or partially full if the size of the result is less than the total | |||
size of the buffer segments. | size of the buffer segments. | |||
The server returns the RDMA Write list to the client with the segment | The server returns the RDMA Write list to the client with the segment | |||
length fields overwritten to indicate the amount of data RDMA Written | length fields overwritten to indicate the amount of data RDMA Written | |||
to each segment. Results returned by direct placement must not be | to each segment. Results returned by direct placement MUST NOT be | |||
returned by other methods, e.g. by read chunk list or inline. If no | returned by other methods, e.g. by read chunk list or inline. If no | |||
result data at all is returned for the element, the server places no | result data at all is returned for the element, the server places no | |||
data in the buffer(s), but does return zeroes in the segment length | data in the buffer(s), but does return zeroes in the segment length | |||
fields corresponding to the result. | fields corresponding to the result. | |||
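
The progressive fill of a posted buffer and the rewritten segment lengths can be illustrated with a minimal sketch. The names `fill_write_chunk` and `XdrEncodeError` are hypothetical, and segments are modeled only by their lengths; no actual RDMA Write is performed.

```python
class XdrEncodeError(Exception):
    """Stand-in for the XDR encode error returned when the buffer is too small."""

def fill_write_chunk(segment_lengths, result_len):
    """Return the per-segment byte counts the server would report back.

    Segments are filled in order; trailing segments may end up partially
    filled or empty. A result of zero bytes yields all-zero lengths.
    """
    if result_len > sum(segment_lengths):
        raise XdrEncodeError("posted buffer too small for result")
    written = []
    remaining = result_len
    for capacity in segment_lengths:
        n = min(capacity, remaining)
        written.append(n)
        remaining -= n
    return written
```

For example, a 5000-byte result placed into three 4096-byte segments fills the first, partially fills the second, and leaves the third at zero.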
The RDMA Write list allows the client to provide multiple result | The RDMA Write list allows the client to provide multiple result | |||
buffers - each buffer must map to a specific result in the reply. The | buffers - each buffer maps to a specific result in the reply. The NFS | |||
NFS client and server implementations must agree on the mapping of | client and server implementations agree by specifying the mapping of | |||
results to buffers for each RPC procedure. The following sections | results to buffers for each RPC procedure. The following sections | |||
describe this mapping for versions of the NFS protocol. | describe this mapping for versions of the NFS protocol. | |||
Through the use of RDMA Write lists in NFS requests, it is not | Through the use of RDMA Write lists in NFS requests, it is not | |||
necessary to employ the RDMA Read lists in the NFS replies, as | necessary to employ the RDMA Read lists in the NFS replies, as | |||
described in the RPC/RDMA protocol. This enables more efficient | described in the RPC/RDMA protocol. This enables more efficient | |||
operation, by avoiding the need for the server to expose buffers for | operation, by avoiding the need for the server to expose buffers for | |||
RDMA, and also avoiding "RDMA_DONE" exchanges. Clients may | RDMA, and also avoiding "RDMA_DONE" exchanges. Clients MAY | |||
additionally employ RDMA Reply chunks to receive entire messages, as | additionally employ RDMA Reply chunks to receive entire messages, as | |||
described in [RPCRDMA]. | described in [RPCRDMA]. | |||
4. NFS Versions 2 and 3 Mapping | 4. NFS Versions 2 and 3 Mapping | |||
A single RDMA Write list entry may be posted by the client to receive | A single RDMA Write list entry MAY be posted by the client to receive | |||
either the opaque file data from a READ request or the pathname from | either the opaque file data from a READ request or the pathname from | |||
a READLINK request. The server will ignore a Write list for any | a READLINK request. The server MUST ignore a Write list for any | |||
other NFS procedure, as well as any Write list entries beyond the | other NFS procedure, as well as any Write list entries beyond the | |||
first in the list. | first in the list. | |||
Similarly, a single RDMA Read list entry may be posted by the client | Similarly, a single RDMA Read list entry MAY be posted by the client | |||
to supply the opaque file data for a WRITE request or the pathname | to supply the opaque file data for a WRITE request or the pathname | |||
for a SYMLINK request. The server will ignore any Read list for | for a SYMLINK request. The server MUST ignore any Read list for | |||
other NFS procedures, as well as additional Read list entries beyond | other NFS procedures, as well as additional Read list entries beyond | |||
the first in the list. | the first in the list. | |||
Because there are no NFS version 2 or 3 requests that transfer bulk | Because there are no NFS version 2 or 3 requests that transfer bulk | |||
data in both directions, it is not necessary to post requests | data in both directions, it is not necessary to post requests | |||
containing both Write and Read lists. Any unneeded Read or Write | containing both Write and Read lists. Any unneeded Read or Write | |||
lists are ignored by the server. | lists are ignored by the server. | |||
In the case where the outgoing request or expected incoming reply is | In the case where the outgoing request or expected incoming reply is | |||
larger than the maximum size supported on the connection, it is | larger than the maximum size supported on the connection, it is | |||
possible for the RPC layer to post the entire message or result in a | possible for the RPC layer to post the entire message or result in a | |||
special "RDMA_NOMSG" message type which is transferred entirely by | special "RDMA_NOMSG" message type which is transferred entirely by | |||
RDMA. This is implemented in RPC, below NFS and therefore has no | RDMA. This is implemented in RPC, below NFS and therefore has no | |||
effect on the message contents. | effect on the message contents. | |||
Non-RDMA (inline) WRITE transfers may optionally employ the | Non-RDMA (inline) WRITE transfers MAY employ the | |||
"RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | |||
appropriate value for the server is known to the client. Padding | appropriate value for the server is known to the client. Padding | |||
allows the opaque file data to arrive at the server in an aligned | allows the opaque file data to arrive at the server in an aligned | |||
fashion, which may improve server performance. | fashion, which may improve server performance. | |||
The NFS version 2 and 3 protocols are frequently limited in practice | The NFS version 2 and 3 protocols are frequently limited in practice | |||
to requests containing less than or equal to 8 kilobytes and 32 | to requests containing less than or equal to 8 kilobytes and 32 | |||
kilobytes of data, respectively. In these cases, it is often | kilobytes of data, respectively. In these cases, it is often | |||
practical to support basic operation without employing a | practical to support basic operation without employing a | |||
configuration exchange as discussed in [RPCRDMA]. The server can | configuration exchange as discussed in [RPCRDMA]. The server MUST | |||
post buffers large enough to receive the largest possible incoming | post buffers large enough to receive the largest possible incoming | |||
message (approximately 12KB/36KB would be vastly sufficient in the | message (approximately 12KB for NFS version 2, or 36KB for NFS | |||
above cases), and the client can post buffers large enough to receive | version 3, would be vastly sufficient), and the client can post | |||
replies based on the "rsize" it is using to the server. Because the | buffers large enough to receive replies based on the "rsize" it is | |||
server will never return data in excess of this size, the client can | using to the server, plus a fixed overhead for the RPC and NFS | |||
be assured of the adequacy of its posted buffer sizes. | headers. Because the server MUST NOT return data in excess of this | |||
size, the client can be assured of the adequacy of its posted buffer | ||||
sizes. | ||||
Flow control is handled dynamically by the RPC RDMA protocol, and | Flow control is handled dynamically by the RPC RDMA protocol, and | |||
write padding is optional and therefore may remain unused. | write padding is OPTIONAL and therefore MAY remain unused. | |||
Alternatively, if the server is administratively configured to values | Alternatively, if the server is administratively configured to values | |||
appropriate for all its clients, the same assurance of | appropriate for all its clients, the same assurance of | |||
interoperability within the domain can be made. | interoperability within the domain can be made. | |||
The use of a configuration protocol with NFS v2 and v3 is therefore | The use of a configuration protocol with NFS v2 and v3 is therefore | |||
optional. Employing a configuration exchange may allow some advantage | OPTIONAL. Employing a configuration exchange may allow some advantage | |||
to server resource management through accurately sizing buffers, | to server resource management through accurately sizing buffers, | |||
enabling the server to know exactly how many RDMA Reads may be in | enabling the server to know exactly how many RDMA Reads may be in | |||
progress at once on the client connection, and enabling client write | progress at once on the client connection, and enabling client write | |||
padding which may be desirable for certain servers when RDMA Read is | padding which may be desirable for certain servers when RDMA Read is | |||
impractical. | impractical. | |||
5. NFS Version 4 Mapping | 5. NFS Version 4 Mapping | |||
This specification applies to the first minor version of NFS version | This specification applies to the first minor version of NFS version | |||
4 (NFSv4.0) and any subsequent minor versions that do not override | 4 (NFSv4.0) and any subsequent minor versions that do not override | |||
this mapping. | this mapping. | |||
The Write list will be considered only for the COMPOUND procedure. | The Write list MUST be considered only for the COMPOUND procedure. | |||
This procedure returns results from a sequence of operations. Only | This procedure returns results from a sequence of operations. Only | |||
the opaque file data from an NFS READ operation, and the pathname | the opaque file data from an NFS READ operation, and the pathname | |||
from a READLINK operation will utilize entries from the Write list. | from a READLINK operation MUST utilize entries from the Write list. | |||
If there is no Write list, i.e. the list is null, then any READ or | If there is no Write list, i.e. the list is null, then any READ or | |||
READLINK operations in the COMPOUND must return their data inline. | READLINK operations in the COMPOUND MUST return their data inline. | |||
The NFSv4.0 client must ensure that any result of its READ and | The NFSv4.0 client MUST ensure that any result of its READ and | |||
READLINK requests must fit within its receive buffers, or an RDMA | READLINK requests fits within its receive buffers, lest an RDMA | |||
transport error may occur. | transport error result upon transfer. | |||
The first entry in the Write list must be used by the first READ or | The first entry in the Write list MUST be used by the first READ or | |||
READLINK in the COMPOUND request. The next Write list entry is used | READLINK in the COMPOUND request. The next Write list entry is used | |||
by the next READ or READLINK, and so on. If there are more READ or | by the next READ or READLINK, and so on. If there are more READ or | |||
READLINK operations than Write list entries, then any remaining | READLINK operations than Write list entries, then any remaining | |||
operations must return their results inline. | operations MUST return their results inline. | |||
If a Write list entry is presented, then the corresponding READ or | If a Write list entry is presented, then the corresponding READ or | |||
READLINK must return its data via an RDMA WRITE to the buffer | READLINK MUST return its data via an RDMA WRITE to the buffer | |||
indicated by the Write list entry. If the Write list entry has zero | indicated by the Write list entry. If the Write list entry has zero | |||
RDMA segments, or if the total size of the segments is zero, then the | RDMA segments, or if the total size of the segments is zero, then the | |||
corresponding READ or READLINK operation must return its result | corresponding READ or READLINK operation MUST return its result | |||
inline. | inline. | |||
The following example shows an RDMA Write list with three posted | The following example shows an RDMA Write list with three posted | |||
buffers A, B, and C. The designated operations in the compound | buffers A, B, and C. The designated operations in the compound | |||
request, READ and READLINK, consume the posted buffers by writing | request, READ and READLINK, consume the posted buffers by writing | |||
their results back to each buffer. | their results back to each buffer. | |||
RDMA Write list: | RDMA Write list: | |||
A --> B --> C | A --> B --> C | |||
skipping to change at page 6, line 37 | skipping to change at page 6, line 45 | |||
Compound request: | Compound request: | |||
PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ | PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ | |||
| | | | | | | | |||
v v v | v v v | |||
A B C | A B C | |||
If the client does not want to have the READLINK result returned | If the client does not want to have the READLINK result returned | |||
directly, then it provides a zero length array of segment triplets | directly, then it provides a zero length array of segment triplets | |||
for buffer B or sets the values in the segment triplet for buffer B | for buffer B or sets the values in the segment triplet for buffer B | |||
to zeros so that the READLINK result will be returned inline. | to zeros so that the READLINK result MUST be returned inline. | |||
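
The assignment of Write list entries to READ and READLINK operations in a COMPOUND can be sketched as below. This is a hedged illustration of the rules stated in Section 5, not server code: `map_write_list` is a hypothetical name, operations are modeled as plain strings, and each Write list entry is modeled as a list of segment lengths.

```python
def map_write_list(ops, write_list):
    """Map each READ/READLINK in `ops` to a Write list entry index.

    Returns {operation index: entry index}, with None meaning the result
    is returned inline. An entry with no segments, or with a total size
    of zero, is still consumed by its operation but forces an inline reply.
    """
    placement = {}
    next_entry = 0
    for i, op in enumerate(ops):
        if op not in ("READ", "READLINK"):
            continue                      # other operations never use the Write list
        if next_entry < len(write_list):
            entry = write_list[next_entry]
            placement[i] = next_entry if sum(entry) > 0 else None
            next_entry += 1
        else:
            placement[i] = None           # more operations than entries: inline
    return placement
```

With the compound request from the example and buffer B given as an empty segment array, the first READ uses A, the READLINK is returned inline, and the second READ still uses C.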
The situation is similar for RDMA Read lists sent by the client and | The situation is similar for RDMA Read lists sent by the client and | |||
applies to the NFSv4.0 WRITE and SYMLINK procedures as for v3. | applies to the NFSv4.0 WRITE and SYMLINK procedures as for v3. | |||
Additionally, inline segments too large to fit in posted buffers may | Additionally, inline segments too large to fit in posted buffers MAY | |||
be transferred in special "RDMA_NOMSG" messages. | be transferred in special "RDMA_NOMSG" messages. | |||
Non-RDMA (inline) WRITE transfers may optionally employ the | Non-RDMA (inline) WRITE transfers MAY employ the | |||
"RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the | |||
appropriate value for the server is known to the client. Padding | appropriate value for the server is known to the client. Padding | |||
allows the opaque file data to arrive at the server in an aligned | allows the opaque file data to arrive at the server in an aligned | |||
fashion, which may improve server performance. In order to ensure | fashion, which may improve server performance. In order to ensure | |||
accurate alignment for all data, it is likely that the client will | accurate alignment for all data, it is likely that the client will | |||
restrict its use of optional padding to COMPOUND requests containing | restrict its use of OPTIONAL padding to COMPOUND requests containing | |||
only a single WRITE operation. | only a single WRITE operation. | |||
Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 | |||
COMPOUND is unbounded, even when RDMA chunks are in use. While it | COMPOUND is unbounded, even when RDMA chunks are in use. While it | |||
might appear that a configuration protocol exchange (such as the one | might appear that a configuration protocol exchange (such as the one | |||
described in [RPCRDMA]) would help, in fact the layering issues | described in [RPCRDMA]) would help, in fact the layering issues | |||
involved in building COMPOUNDs by NFS make such a mechanism | involved in building COMPOUNDs by NFS make such a mechanism | |||
unworkable. Instead, an extension to NFS version 4 supporting a more | unworkable. | |||
comprehensive exchange of upper layer (NFSv4) parameters is proposed | ||||
in [NFSv4.1]. This proposal also addresses other use of the sizes, | However, typical NFS version 4 clients rarely issue such problematic | |||
such as in the server's response cache. | requests. In practice, they behave in much more predictable ways; in | |||
fact most still support the traditional rsize/wsize mount parameters. | ||||
Therefore, most NFS version 4 clients function over RPC/RDMA in the | ||||
same way as NFS versions 2 and 3, operationally. | ||||
There are however advantages to allowing both client and server to | ||||
operate with prearranged size constraints, for example use of the | ||||
sizes to better manage the server's response cache. An extension to | ||||
NFS version 4 supporting a more comprehensive exchange of upper layer | ||||
parameters is part of [NFSv4.1]. | ||||
6. Security | 6. Security | |||
The RDMA transport for ONC RPC supports RPCSEC_GSS security as well | The RDMA transport for ONC RPC supports RPCSEC_GSS security as well | |||
as link-level security. The use of RDMA Write to return RPC results | as link-level security. The use of RDMA Write to return RPC results | |||
does not affect ONC RPC security. | does not affect ONC RPC security. | |||
7. IANA Considerations | 7. IANA Considerations | |||
NFS use of direct data placement may introduce a need for an | NFS use of direct data placement introduces a need for an additional | |||
additional NFS port number assignment for networks which share | NFS port number assignment for networks which share traditional UDP | |||
traditional UDP and TCP port spaces with RDMA services. The iWARP | and TCP port spaces with RDMA services. The iWARP [DDP] [RDMAP] | |||
[DDP] [RDMAP] protocol is such an example (Infiniband is not). | protocol is such an example (Infiniband is not). | |||
NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally | |||
listen for clients on UDP and TCP port 2049, and additionally, they | listen for clients on UDP and TCP port 2049, and additionally, they | |||
register these with the portmapper. NFS servers for version 4 | register these with the portmapper and/or rpcbind [RFC1833] service. | |||
[RFC3050] are required to listen on TCP port 2049, and are not | However, NFS servers for version 4 [RFC3530] are required by that | |||
required to register. | specification to listen on TCP port 2049, and are not required to | |||
register. | ||||
An NFS version 2 or version 3 server supporting RPC/RDMA on such a | An NFS version 2 or version 3 server supporting RPC/RDMA on such a | |||
network and registering itself with the RPC portmapper may choose an | network and registering itself with the RPC portmapper MAY choose an | |||
arbitrary port, or may be assigned an alternative well-known port | arbitrary port, or MAY use the alternative well-known port number for | |||
number for its RPC/RDMA service by IANA. The chosen port must be | its RPC/RDMA service assigned by IANA. The chosen port MAY be registered with | |||
registered with the RPC portmapper under the netid assigned by the | the RPC portmapper under the netid assigned by the requirement in | |||
requirement in [RPCRDMA]. | [RPCRDMA]. | |||
An NFS version 4 server supporting RPC/RDMA on such a network must be | ||||
assigned an alternative well-known port number for its RPC/RDMA | ||||
service by IANA. Clients will connect to this well-known port | ||||
without consulting the RPC portmapper (as for NFSv4/TCP). | ||||
Any subsequent NFS version 4 minor version's [NFSv4.1] server may | An NFS version 4 server supporting RPC/RDMA on such a network | |||
reuse port 2049, by requiring the client to perform the RDMA session | MUST use the alternative well-known port number for its RPC/RDMA | |||
negotiation supported by this protocol. If it does not require the | service assigned by IANA. Clients SHOULD connect to this well-known port | |||
client to negotiate an RDMA-enabled session, it must use the | without consulting the RPC portmapper (as for NFSv4/TCP). The | |||
alternative port for RPC/RDMA, as for version 4. | following port is assigned to an NFS service over an RPC/RDMA | |||
transport: | ||||
This is not an issue on non-IP transports such as native Infiniband, | nfs-rdma 2050 | |||
where a non-colliding port translation scheme is used [IBPORT]. On | ||||
such interfaces, the server can simply listen on the port mapped from | ||||
the IANA-assigned NFS 2049, or any other port as assigned by the | ||||
native transport. Such assignments are out of the scope of IANA, and | ||||
of this document. | ||||
8. Acknowledgements | 8. Acknowledgements | |||
The authors would like to thank Dave Noveck and Chet Juszczak for | The authors would like to thank Dave Noveck and Chet Juszczak for | |||
their contributions to this document. | their contributions to this document. | |||
9. Normative References | 9. Normative References | |||
[RFC2119] | ||||
S. Bradner, "Key words for use in RFCs to Indicate Requirement | ||||
Levels", | ||||
Best Current Practice, | ||||
BCP 14, RFC 2119, March 1997. | ||||
[RFC1831] | [RFC1831] | |||
R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification | R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification | |||
Version 2", | Version 2", | |||
Standards Track RFC, | Standards Track RFC, | |||
http://www.ietf.org/rfc/rfc1831.txt | http://www.ietf.org/rfc/rfc1831.txt | |||
[RFC1832] | [RFC1832] | |||
R. Srinivasan, "XDR: External Data Representation Standard", | R. Srinivasan, "XDR: External Data Representation Standard", | |||
Standards Track RFC, | Standards Track RFC, | |||
http://www.ietf.org/rfc/rfc1832.txt | http://www.ietf.org/rfc/rfc1832.txt | |||
skipping to change at page 8, line 35 | skipping to change at page 9, line 4 | |||
[RFC1832] | [RFC1832] | |||
R. Srinivasan, "XDR: External Data Representation Standard", | R. Srinivasan, "XDR: External Data Representation Standard", | |||
Standards Track RFC, | Standards Track RFC, | |||
http://www.ietf.org/rfc/rfc1832.txt | http://www.ietf.org/rfc/rfc1832.txt | |||
[RFC1094] | [RFC1094] | |||
"NFS: Network File System Protocol Specification", | "NFS: Network File System Protocol Specification", | |||
(NFS version 2) Informational RFC, | (NFS version 2) Informational RFC, | |||
http://www.ietf.org/rfc/rfc1094.txt | http://www.ietf.org/rfc/rfc1094.txt | |||
[RFC1813] | [RFC1813] | |||
B. Callaghan, B. Pawlowski, P. Staubach, "NFS Version 3 Protocol | B. Callaghan, B. Pawlowski, P. Staubach, "NFS Version 3 Protocol | |||
Specification", | Specification", | |||
Informational RFC, | Informational RFC, | |||
http://www.ietf.org/rfc/rfc1813.txt | http://www.ietf.org/rfc/rfc1813.txt | |||
[RFC1833] | ||||
R. Srinivasan, "Binding Protocols for ONC RPC Version 2", | ||||
Standards Track RFC, | ||||
http://www.ietf.org/rfc/rfc1833.txt | ||||
[RFC3530] | [RFC3530] | |||
S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. | S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. | |||
Eisler, D. Noveck, "NFS version 4 Protocol", | Eisler, D. Noveck, "NFS version 4 Protocol", | |||
Standards Track RFC, | Standards Track RFC, | |||
http://www.ietf.org/rfc/rfc3530.txt | http://www.ietf.org/rfc/rfc3530.txt | |||
10. Informative References | 10. Informative References | |||
[RPCRDMA] | [RPCRDMA] | |||
T. Talpey, B. Callaghan, "RDMA Transport for ONC RPC" | T. Talpey, B. Callaghan, "RDMA Transport for ONC RPC" | |||
Internet Draft Work in Progress, | Internet Draft Work in Progress, | |||
draft-ietf-nfsv4-rpcrdma | draft-ietf-nfsv4-rpcrdma | |||
[NFSv4.1] | [NFSv4.1] | |||
S. Shepler, ed., "NFSv4 Minor Version 1" | S. Shepler et al., ed., "NFSv4 Minor Version 1" | |||
Internet Draft Work in Progress, | Internet Draft Work in Progress, | |||
draft-ietf-nfsv4-minorversion1 | draft-ietf-nfsv4-minorversion1 | |||
[DDP] | [DDP] | |||
H. Shah et al, "Direct Data Placement over Reliable Transports", | H. Shah et al, "Direct Data Placement over Reliable Transports", | |||
Internet Draft Work in Progress, | Standards Track RFC, | |||
draft-ietf-rddp-ddp | draft-ietf-rddp-ddp | |||
[RDMAP] | [RDMAP] | |||
R. Recio et al, "An RDMA Protocol Specification", | R. Recio et al, "An RDMA Protocol Specification", | |||
Internet Draft Work in Progress, | Standards Track RFC, | |||
draft-ietf-rddp-rdmap | draft-ietf-rddp-rdmap | |||
[IBPORT] | ||||
Infiniband Trade Association, "IP Addressing Annex", | ||||
available from www.infinibandta.org | ||||
11. Authors' Addresses | 11. Authors' Addresses | |||
Tom Talpey | Tom Talpey | |||
Network Appliance, Inc. | Network Appliance, Inc. | |||
375 Totten Pond Road | 375 Totten Pond Road | |||
Waltham, MA 02451 USA | Waltham, MA 02451 USA | |||
Phone: +1 781 768 5329 | Phone: +1 781 768 5329 | |||
EMail: thomas.talpey@netapp.com | EMail: thomas.talpey@netapp.com | |||
Brent Callaghan | Brent Callaghan | |||
Apple Computer, Inc. | Apple Computer, Inc. | |||
End of changes. 46 change blocks. | ||||
83 lines changed or deleted | 98 lines changed or added | |||