draft-ietf-nfsv4-layout-types-05.txt   draft-ietf-nfsv4-layout-types-06.txt 
NFSv4 T. Haynes NFSv4 T. Haynes
Internet-Draft Primary Data Internet-Draft Primary Data
Updates: 5661 (if approved) July 20, 2017 Updates: 5661 (if approved) August 16, 2017
Intended status: Standards Track Intended status: Standards Track
Expires: January 21, 2018 Expires: February 17, 2018
Requirements for pNFS Layout Types Requirements for pNFS Layout Types
draft-ietf-nfsv4-layout-types-05.txt draft-ietf-nfsv4-layout-types-06.txt
Abstract Abstract
This document defines the requirements which individual pNFS layout This document defines the requirements which individual pNFS layout
types need to meet in order to work within the parallel NFS (pNFS) types need to meet in order to work within the parallel NFS (pNFS)
framework as defined in RFC5661. In so doing, it aims to more framework as defined in RFC5661. In so doing, it aims to clearly
clearly distinguish between requirements for pNFS as a whole and distinguish between requirements for pNFS as a whole and those
those those specifically directed to the pNFS File Layout. The lack specifically directed to the pNFS File Layout. The lack of a clear
of a clear separation between the two sets of requirements has been separation between the two sets of requirements has been troublesome
troublesome for those specifying and evaluating new Layout Types. In for those specifying and evaluating new Layout Types. In this
this regard, this document effectively updates RFC5661. regard, this document effectively updates RFC5661.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 21, 2018. This Internet-Draft will expire on February 17, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 16 skipping to change at page 2, line 16
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Use of the Terms "Data Server" and "Storage Device" . . . 5 2.1. Use of the Terms "Data Server" and "Storage Device" . . . 5
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 6 2.2. Requirements Language . . . . . . . . . . . . . . . . . . 6
3. The Control Protocol . . . . . . . . . . . . . . . . . . . . 6 3. The Control Protocol . . . . . . . . . . . . . . . . . . . . 6
3.1. Protocol REQUIREMENTS . . . . . . . . . . . . . . . . . . 7 3.1. Control Protocol REQUIREMENTS . . . . . . . . . . . . . . 8
3.2. Undocumented Protocol REQUIREMENTS . . . . . . . . . . . 9 3.2. Previously Undocumented Protocol REQUIREMENTS . . . . . . 9
3.3. Editorial Requirements . . . . . . . . . . . . . . . . . 10 3.3. Editorial Requirements . . . . . . . . . . . . . . . . . 10
4. Specifications of Existing Layout Types . . . . . . . . . . . 10 4. Specifications of Original Layout Types . . . . . . . . . . . 11
4.1. File Layout Type . . . . . . . . . . . . . . . . . . . . 10 4.1. File Layout Type . . . . . . . . . . . . . . . . . . . . 11
4.2. Block Layout Type . . . . . . . . . . . . . . . . . . . . 12 4.2. Block Layout Type . . . . . . . . . . . . . . . . . . . . 12
4.3. Object Layout Type . . . . . . . . . . . . . . . . . . . 13 4.3. Object Layout Type . . . . . . . . . . . . . . . . . . . 13
5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
6. Security Considerations . . . . . . . . . . . . . . . . . . . 14 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 15
8.1. Normative References . . . . . . . . . . . . . . . . . . 15 8.1. Normative References . . . . . . . . . . . . . . . . . . 15
8.2. Informative References . . . . . . . . . . . . . . . . . 15 8.2. Informative References . . . . . . . . . . . . . . . . . 15
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 15 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 16
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 15 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 16
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 15 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 16
1. Introduction 1. Introduction
The concept of layout type has a central role in the definition and The concept of layout type has a central role in the definition and
implementation of Parallel Network File System (pNFS). Clients and implementation of Parallel Network File System (pNFS). Clients and
servers implementing different layout types behave differently in servers implementing different layout types behave differently in
many ways while conforming to the overall pNFS framework defined in many ways while conforming to the overall pNFS framework defined in
[RFC5661] and this document. Layout types may differ in: [RFC5661] and this document. Layout types may differ as to:
o The method used to do I/O operations directed to data storage o The method used to do I/O operations directed to data storage
devices. devices.
o The requirements for communication between the metadata server o The requirements for communication between the metadata server
(MDS) and the storage devices. (MDS) and the storage devices.
o The means used to ensure that I/O requests are only processed when o The means used to ensure that I/O requests are only processed when
the client holds an appropriate layout. the client holds an appropriate layout.
o The format and interpretation of nominally opaque data fields in o The format and interpretation of nominally opaque data fields in
pNFS-related NFSv4.x data structures. pNFS-related NFSv4.x data structures.
Such matters are defined in a standards-track layout type Such matters are defined in a standards-track layout type
specification. Except for the files layout type, which was defined specification. Except for the files layout type, which was defined
in Section 13 of [RFC5661], existing layout types are defined in in Section 13 of [RFC5661], existing layout types are defined in
their own standards-track documents and it is anticipated that new their own standards-track documents and it is anticipated that new
layout type will be defined in similar documents. layout types will be defined in similar documents.
The file layout type was defined in the Network File System (NFS) The file layout type was defined in the Network File System (NFS)
version 4.1 protocol specification [RFC5661]. The block layout type version 4.1 protocol specification [RFC5661]. The block layout type
was defined in [RFC5663] and the object layout type was in turn was defined in [RFC5663] while the object layout type was defined in
defined in [RFC5664]. [RFC5664]. Subsequently, the SCSI layout type was defined in
[RFC8154].
Some implementers have interpreted the text in Sections 12 ("Parallel Some implementers have interpreted the text in Sections 12 ("Parallel
NFS (pNFS)") and 13 ("NFSv4.1 as a Storage Protocol in pNFS: the File NFS (pNFS)") and 13 ("NFSv4.1 as a Storage Protocol in pNFS: the File
Layout Type") of [RFC5661] as both applying only to the file Layout Type") of [RFC5661] as both applying only to the file
layout type. Because Section 13 was not covered in a separate layout type. Because Section 13 was not covered in a separate
standards-track document like those for both the block and object standards-track document such as those for both the block and object
layout types, there had been some confusion as to the layout types, there had been some confusion as to the
responsibilities of both the metadata server and the data servers responsibilities of both the metadata server and the data servers
(DS) which were laid out in Section 12. (DS) which were laid out in Section 12.
As a consequence, new internet drafts (see [FlexFiles] and [Lustre]) As a consequence, new internet drafts (see [FlexFiles] and [Lustre])
may struggle to meet the requirements to be a pNFS layout type. This may struggle to meet the requirements to be a pNFS layout type. This
document specifies the layout type independent requirements placed on document gathers the requirements from all of the original layout
all layout types, whether one of the original three or any new type standard documents and then specifies the requirements placed on
variant. all layout types independent of the particular type chosen.
2. Definitions 2. Definitions
control communication requirements: define for a layout type the control communication requirements: are for a layout type the
details regarding information on layouts, stateids, file metadata, details regarding information on layouts, stateids, file metadata,
and file data which must be communicated between the metadata and file data which must be communicated between the metadata
server and the storage devices. server and the storage devices.
control protocol: defines a particular mechanism that an control protocol: is the particular mechanism that an implementation
implementation of a layout type would use to meet the control of a layout type would use to meet the control communication
communication requirement for that layout type. This need not be requirement for that layout type. This need not be a protocol as
a protocol as normally understood. In some cases the same normally understood. In some cases the same protocol may be used
protocol my be used as a control protocol and data access as a control protocol and data access protocol.
protocol.
(file) data: is that part of the file system object which contains (file) data: is that part of the file system object which contains
the data to read or writen. It is the contents of the object and the data to read or written. It is the contents of the object
not the attributes of the object. rather than the attributes of the object.
data server (DS): is a pNFS server which provides the file's data data server (DS): is a pNFS server which provides the file's data
when the file system object is accessed over a file-based when the file system object is accessed over a file-based
protocol. Note that this usage differs from that in [RFC5661] protocol. Note that this usage differs from that in [RFC5661]
which applies the term in some cases even when other sorts of which applies the term in some cases even when other sorts of
protocols are being used. Depending on the layout, there might be protocols are being used. Depending on the layout, there might be
one or more data servers over which the data is striped. While one or more data servers over which the data is striped. While
the metadata server is strictly accessed over the NFSv4.1 the metadata server is strictly accessed over the NFSv4.1
protocol, depending on the layout type, the data server could be protocol, the data server could be accessed via any file access
accessed via any file access protocol that meets the pNFS protocol that meets the pNFS requirements.
requirements.
See Section 2.1 for a comparison of this term and "data storage See Section 2.1 for a comparison of this term and "data storage
device". device".
fencing: is the process by which the metadata server prevents the fencing: is the process by which the metadata server prevents the
storage devices from processing I/O from a specific client to a storage devices from processing I/O from a specific client to a
specific file. specific file.
layout: contains information a client uses to access file data on a layout: is the information a client uses to access file data on a
storage device. This information will include specification of storage device. This information will include specification of
the protocol (layout type) and the identity of the storage devices the protocol (layout type) and the identity of the storage devices
to be used. to be used.
The bulk of the contents of the layout are defined in [RFC5661] The bulk of the contents of the layout are defined in [RFC5661]
as nominally opaque, but individual layout types may specify their as nominally opaque, but individual layout types are responsible
own interpretation of layout data. for specifying the format of the layout data.
layout iomode: see Section 1. layout iomode: is a grant of either read or read/write I/O to the
client.
layout stateid: is a 128-bit quantity returned by a server that layout stateid: is a 128-bit quantity returned by a server that
uniquely defines the layout state provided by the server for a uniquely defines the layout state provided by the server for a
specific layout that describes a layout type and file (see specific layout that describes a layout type and file (see
Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 describes Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 describes
differences in handling between layout stateids and other stateid differences in handling between layout stateids and other stateid
types. types.
layout type: describes both the storage protocol used to access the layout type: is a specification of both the storage protocol used to
data and the aggregation scheme used to lay out the file data on access the data and the aggregation scheme used to lay out the
the underlying storage devices. file data on the underlying storage devices.
loose coupling: describes when the control protocol, between a loose coupling: is when the control protocol is a storage protocol.
metadata server and storage device, is a storage protocol.
(file) metadata: is that part of the file system object that (file) metadata: is that part of the file system object that
contains various descriptive data relevant to the file object, as contains various descriptive data relevant to the file object, as
opposed to the file data itself. This could include the time of opposed to the file data itself. This could include the time of
last modification, access time, eof position, etc. last modification, access time, end-of-file (EOF) position, etc.
metadata server (MDS): is the pNFS server which provides metadata metadata server (MDS): is the pNFS server which provides metadata
information for a file system object. It also is responsible for information for a file system object. It also is responsible for
generating, recalling, and revoking layouts for file system generating, recalling, and revoking layouts for file system
objects, for performing directory operations, and for performing objects, for performing directory operations, and for performing
I/O operations to regular files when the clients direct these to I/O operations to regular files when the clients direct these to
the metadata server itself. the metadata server itself.
recalling a layout: occurs when the metadata server issues a callbck recalling a layout: is a graceful recall, via a callback, of a
to inform the client that the layout is to be returned in a specific layout by the metadata server to the client. Graceful
graceful manner. Note that the client could be able to flush any here means that the client would have the opportunity to flush any
writes, etc., before replying to the metadata server. writes, etc., before returning the layout to the metadata server.
revoking a layout: occurs when the metadata server invalidates a revoking a layout: is an invalidation of a specific layout by the
specific layout Once revocation occurs, the metadata server will metadata server. Once revocation occurs, the metadata server will
not accept as valid any reference to the revoked layout and a not accept as valid any reference to the revoked layout and a
storage device will not accept any client access based on the storage device will not accept any client access based on the
layout. layout.
stateid: is a 128-bit quantity returned by a server that uniquely stateid: is a 128-bit quantity returned by a server that uniquely
defines the set of locking-related state provided by the server. defines the set of locking-related state provided by the server.
Stateids may designate state related to open files, to byte-range Stateids may designate state related to open files, to byte-range
locks, to delegations, or to layouts. locks, to delegations, or to layouts.
storage device: designates the target to which clients may direct I/ storage device: is the target to which clients may direct I/O
O requests when they hold an appropriate layout. Note that each requests when they hold an appropriate layout. Note that each
data server is a storage device but that some storage devices are data server is a storage device but that some storage devices are
not data servers. See Section 2.1 for further discussion. not data servers. See Section 2.1 for further discussion.
storage protocol: is the protocol used by clients to do I/O storage protocol: is the protocol used by clients to do I/O
operations to the storage device, Each layout type may specify its operations to the storage device. Each layout type specifies the
own storage protocol. It is possible for a layout type to specify set of storage protocols.
multiple access protocols.
tight coupling: describes when the control protocol, between a tight coupling: is when the control protocol is one designed
metadata server and storage device, is either a propritary specifically for that purpose. It may be either a proprietary
approach or based on a standards-track document. protocol, adapted specifically to a particular metadata server,
or one based on a standards-track document.
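As an informal illustration of the definitions above (not part of either draft revision), the following Python sketch models a stateid as a 128-bit quantity, an iomode as a read or read/write grant, and the difference between recalling and revoking a layout. The split into a 4-byte seqid plus 12 opaque bytes and the numeric iomode values follow stateid4 and layoutiomode4 in [RFC5661]. The Layout class, its state names, and its methods are hypothetical names chosen for this sketch only.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class LayoutIomode(IntEnum):
    # Numeric values as assigned to layoutiomode4 in RFC 5661.
    READ = 1
    RW = 2


@dataclass(frozen=True)
class Stateid:
    seqid: int    # 32-bit sequence number
    other: bytes  # 12 opaque bytes chosen by the server

    def pack(self) -> bytes:
        """Encode as 16 bytes (128 bits), big-endian seqid first."""
        if len(self.other) != 12:
            raise ValueError("other must be exactly 12 bytes")
        return struct.pack(">I", self.seqid) + self.other


class Layout:
    """Hypothetical server-side record of one granted layout."""

    def __init__(self, stateid: Stateid, iomode: LayoutIomode):
        self.stateid = stateid
        self.iomode = iomode
        self.state = "granted"

    def recall(self) -> None:
        # Graceful: the client may still flush writes before returning.
        if self.state == "granted":
            self.state = "recalled"

    def client_returns(self) -> None:
        if self.state in ("granted", "recalled"):
            self.state = "returned"

    def revoke(self) -> None:
        # Invalidation: later references to this layout are rejected.
        if self.state != "returned":
            self.state = "revoked"

    def io_allowed(self, write: bool) -> bool:
        """Would a storage device accept this I/O under the layout?"""
        if self.state not in ("granted", "recalled"):
            return False
        return self.iomode == LayoutIomode.RW or not write
```

In this model a recalled layout still permits I/O until the client returns it, whereas a revoked layout permits none, which mirrors the graceful versus non-graceful distinction drawn in the two definitions.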
2.1. Use of the Terms "Data Server" and "Storage Device" 2.1. Use of the Terms "Data Server" and "Storage Device"
In [RFC5661], these the two terms of "Data Server" and "Storage In [RFC5661], these two terms of "Data Server" and "Storage Device"
Device" are used somewhat inconsistently: are used somewhat inconsistently:
o In chapter 12, where pNFS in general is discussed, the term o In chapter 12, where pNFS in general is discussed, the term
"storage device" is used. "storage device" is used.
o In chapter 13, where the file layout type is discussed, the term o In chapter 13, where the file layout type is discussed, the term
"data server" is used. "data server" is used.
o In other chapters, the term "data server" is used, even in o In other chapters, the term "data server" is used, even in
contexts where the storage access type is not NFSv4.1 or any other contexts where the storage access type is not NFSv4.1 or any other
file access protocol. file access protocol.
As this document deals with pNFS in general, it uses the more generic As this document deals with pNFS in general, it uses the more generic
term "storage device" in preference to "data server". The term "data term "storage device" in preference to "data server". The term "data
server" is used only in contexts in which a file server is used as a server" is used only in contexts in which a file server is used as a
storage device. Note that every data server is a storage device but storage device. Note that every data server is a storage device but
that storage devices which use protocols which are not file access storage devices which use protocols which are not file access
protocol are not data servers. protocols (such as NFS) are not data servers.
Since a given storage device may support multiple layout types, a Since a given storage device may support multiple layout types, a
given device can potentially act as a data server for some set of given device can potentially act as a data server for some set of
storage protocols while simultaneously acting as a non-data-server storage protocols while simultaneously acting as a storage device for
storage device for others. others.
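The terminology rule in this section can be restated as a small Python sketch, offered only as an illustration. The protocol labels "NFSv4.1" and "SCSI" and the function names are assumptions made for the example; the only rule taken from the text is that a device counts as a data server for a given storage protocol exactly when that protocol is a file access protocol.

```python
# Hypothetical labels: only NFSv4.1 is treated as a file access protocol here.
FILE_ACCESS_PROTOCOLS = frozenset({"NFSv4.1"})


def role_for(storage_protocol: str) -> str:
    """Return the term this document would use for a device reached
    over the given storage protocol."""
    if storage_protocol in FILE_ACCESS_PROTOCOLS:
        return "data server"
    return "storage device"


def roles_of(device_protocols):
    """Map each protocol a single device supports to its role."""
    return {proto: role_for(proto) for proto in device_protocols}


# One device supporting two layout types acts in both roles at once.
print(roles_of(["NFSv4.1", "SCSI"]))
```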
2.2. Requirements Language 2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
This document differs from most standards-track documents in that it
specifies requirements for those defining future layout types rather
than defining the requirements for implementations directly. This
document makes clear whether:
(1) any particular requirement applies to implementations.
(2) any particular requirement applies to those defining layout
types.
(3) the requirement is a general requirement which implementations
need to conform to, with the specific means left to layout type
definitions to specify.
3. The Control Protocol 3. The Control Protocol
In Section 12.2.6 of [RFC5661], the control protocol was introduced. One of the key requirements of a layout type is the need for a
There have been no published specifications for control protocols as mechanism to be used to meet the requirements that apply to the
yet. The control protocol denotes any mechanism used to meet the interaction between the metadata server and the storage device such
requirements that apply to the interaction between the metadata that they present a consistent interface to the client
server and the storage device such that they present a consistent (Section 12.2.6 of [RFC5661]). Particular implementations may
interface to the client. Particular implementations may satisfy this satisfy this requirement in any manner they choose and the mechanism
requirement in any manner they choose and the mechanism chosen may chosen may not be described as a protocol. Specifications defining
not be described as a protocol. Specifications defining layout types layout types need to clearly show how implementations can meet the
need to clearly show how implementations can meet the requirements requirements discussed below, especially with respect to those that
discussed below, especially with respect to those that have security have security implications. In addition, such specifications may
implications. In addition, such specifications may find it necessary find it necessary to impose requirements on implementations of the
to impose requirements on implementations of the layout type to layout type to ensure appropriate interoperability.
ensure appropriate interoperability.
In some cases, there may be no control protocol other than the In some cases, there may be no control protocol other than the
storage protocol. This is often described as using a "loose storage protocol. This is often described as using a "loose
coupling" model. In such cases, the assumption is that the metadata coupling" model. In such cases, the assumption is that the metadata
server, storage devices, and client may be changed independently and server, storage devices, and client may be changed independently and
that the implementation requirements in the layout type specification that the implementation requirements in the layout type specification
need to ensure this degree of interoperability. This model is used need to ensure this degree of interoperability. This model is used
in the block and object layout type specifications. in the block and object layout type specifications.
In some cases, there may be no control protocol other than the In other cases, it is assumed that there will be a purpose-built
storage In other cases, it is assumed that there may be purpose-built
control protocol which may be different for different implementations control protocol which may be different for different implementations
of the metadata server and data server. In such cases, the of the metadata server and data server. The assumption here is that
assumption is that the metadata server and data servers are designed the metadata server and data servers are designed and implemented as
and implemented as a unit and interoperability needs to be assured a unit and interoperability needs to be assured between clients and
between clients and metadata-data server pairs, developed metadata-data server pairs, developed independently. This is the
independently. This is the model used for the files layout. model used for the files layout.
In some cases, there may be no control protocol other than the Another possibility is for the definition of a control protocol to be
storage Another possibility, not so far realized, is for the specified in a standards-track document. There are two subcases to
definition of a control protocol to be specified in a standards-track consider:
document. There are two subcases to consider:
o A new layout type includes a definition of a particular control o A new layout type includes a definition of a particular control
protocol whose use is obligatory for metadata serverss and storage protocol whose use is obligatory for metadata servers and storage
devices implementing the layout type. In this case the devices implementing the layout type. In this case the
interoperability model is similar to the first case above and the interoperability model is similar to the first case above and the
defining document should assure interoperability among metadata defining document should assure interoperability among metadata
servers, storage devices, and clients developed independently. servers, storage devices, and clients developed independently.
o A control protocol is defined in a standards-track document which o A control protocol is defined in a standards-track document which
meets the control protocol requirements for one of the existing meets the control protocol requirements for one of the existing
layout types. In this case, the new document's job is to assure layout types. In this case, the new document's job is to assure
interoperability between metadata servers and storage devices interoperability between metadata servers and storage devices
developed separately. The existing definition document for the developed separately. The existing definition document for the
selected layout type retains the function of assuring selected layout type retains the function of assuring
interoperability between clients and a given collection of interoperability between clients and a given collection of
metadata servers and storage devices. In this context, metadata servers and storage devices. In this context,
implementations that implement the new protocol are treated in the implementations that implement the new protocol are treated in the
same way as those that use an internal control protocol or a same way as those that use an internal control protocol or a
functional equivalent. functional equivalent.
3.1. Protocol REQUIREMENTS An example of this last case is the SCSI layout type [RFC8154], which
extends the block layout type. The block layout type had a
requirement for fencing of clients, but did not present a way for the
control protocol (in this case the SCSI storage protocol) to fence
the client. The SCSI layout type remedies that in [RFC8154] and in
effect has a tightly coupled model.
The REQUIREMENTS of such interactions between the metadata server and 3.1. Control Protocol REQUIREMENTS
the storage devices are:
The REQUIREMENTS of interactions between the metadata server and the
storage devices are:
(1) The metadata server MUST be able to service the client's I/O (1) The metadata server MUST be able to service the client's I/O
requests if the client decides to make such requests to the requests if the client decides to make such requests to the
metadata server instead of to the storage device. The metadata metadata server instead of to the storage device. The metadata
server must be able to retrieve the data from the constituent server must be able to retrieve the data from the constituent
storage devices and present it back to the client. A corollary storage devices and present it back to the client. A corollary
to this is that even though the metadata server has successfully to this is that even though the metadata server has successfully
given the client a layout, the client MAY still send I/O given the client a layout, the client MAY still send I/O
requests to the metadata server. requests to the metadata server.
Whether the metadata server allows access over other protocols
(e.g., NFSv3, Server Message Block (SMB), etc.) is strictly an
implementation choice, just as it is in the case of any other
(i.e., non-pNFS-supporting) NFSv4.1 server.
(2) The metadata server MUST be able to restrict access to a file on (2) The metadata server MUST be able to restrict access to a file on
the storage devices when it revokes a layout. The metadata the storage devices when it revokes a layout. The metadata
server typically would revoke a layout whenever a client fails server typically would revoke a layout whenever a client fails
to respond to a recall or a client's lease is expired due to to respond to a recall or a client's lease is expired due to
non-renewal. It might also revoke the layout as a means of non-renewal. It might also revoke the layout as a means of
enforcing a change in locking state or access permissions that enforcing a change in locking state or access permissions that
the storage device cannot directly enforce. the storage device cannot directly enforce.
Effective revocation may require client co-operation in using a Effective revocation may require client co-operation in using a
particular stateid (files layout) or principal (e.g., flexible particular stateid (files layout) or principal (e.g., flexible
files layout) when performing I/O. files layout) when performing I/O.
(3) A pNFS impelementation MUST NOT remove NFSv4.1's access (3) A pNFS implementation MUST NOT allow the violation of NFSv4.1's
controls: ACLs and file open modes. While Section 12.9 of access controls: ACLs and file open modes. Section 12.9 of
[RFC5661] specifically lays this burden on the combination of [RFC5661] specifically lays this burden on the combination of
clients, storage devices, and the metadata server, depending on clients, storage devices, and the metadata server. However the
the implementation, there might be a requirement that the specification of the individual layout type might create
metadata server update the storage device such that it can requirements as to how this is to be done. This may include a
enforce security. possible requirement for the metadata server to update the
storage device so that it can enforce security.
The file layout requires the storage device to enforce access The file layout requires the storage device to enforce access
whereas the flex file layout requires both the storage device whereas the flex file layout requires both the storage device
and the client to enforce security. and the client to enforce security.
(4) Locking MUST be respected. (4) Interactions between locking and I/O operations MUST obey
existing semantic restrictions. In particular, if an I/O
operation would be invalid when directed at the metadata server,
it is not to be allowed when performed on the storage device.
(5) The metadata server and the storage devices MUST agree on (5) Any disagreement between the metadata server and the data server
attributes like modify time, the change attribute, and the end- as to the value of attributes such as modify time, the change
of-file (EOF) position. attribute, and the EOF position MUST be of limited duration with
clear means of resolution of any discrepancies being provided.
Note that
(a) "Agree" in the sense that some while state changes need not (a) Discrepancies need not be resolved unless any client has
be propagated immediately, they must be propagated when accessed the file in question via the metadata server,
accessed by the client. This access is typically in typically by performing a GETATTR.
response to a GETATTR of those attributes.
(b) A particular storage device might be striped such that it knows (b) A particular storage device might be striped such that it has no
nothing about the EOF position. It still meets the information regarding the EOF position.
requirement of agreeing on that fact with the metadata
server.
(c) Both clock skew and network delay can lead to the metadata (c) Both clock skew and network delay can lead to the metadata
server and the storage device having different concepts of server and the storage device having different values of
the time attributes. As long as those differences can be the time attributes. As long as those differences can be
accounted for what is presented to the client in a GETATTR, accounted for in what is presented to the client in a
then the two "agree". GETATTR, then no violation results.
(d) A LAYOUTCOMMIT requires that storage device generated (d) A LAYOUTCOMMIT requires that changes in attributes
changes in attributes need be reflected in the metadata resulting from operations on the storage device need to be
server by the completion of the operation. reflected in the metadata server by the completion of the
operation.
These requirements may be satisfied in different ways by different These requirements may be satisfied in different ways by different
layout types. As an example, while the file layout type does use the layout types. As an example, while the file layout type uses the
stateid to fence off the client, there is no requirement that other stateid to fence off the client, there is no requirement that other
layout types use this stateid approach. layout types use this stateid approach.
Each new standards-track document for a layout type MUST address how Each new standards-track document for a layout type MUST address how
the client, metadata server, and storage devices interact to meet the client, metadata server, and storage devices are to interact to
these requirements. meet these requirements.
3.2. Undocumented Protocol REQUIREMENTS 3.2. Previously Undocumented Protocol REQUIREMENTS
In gathering the requirements from Section 12 of [RFC5661], there are While not explicitly stated as requirements in Section 12 of
some which are notable in their absence: [RFC5661], the existing layout types do have more requirements that
they need to enforce.
The client has these obligations when making I/O requests to the
storage devices:
(1) Clients MUST NOT perform I/O to the storage device if they do (1) Clients MUST NOT perform I/O to the storage device if they do
not have layouts for the files in question. not have layouts for the files in question.
(2) Clients MUST be allowed to perform I/O to the metadata server (2) Clients MUST NOT perform I/O operations outside of the specified
even if they already have a LAYOUT. A layout type might
discourage such I/O, but it can not forbid it.
(3) Clients MUST NOT perform I/O operations outside of the specified
ranges in the layout segment. ranges in the layout segment.
(4) Clients MUST NOT perform I/O operations which would be (3) Clients MUST NOT perform I/O operations which would be
inconsistent with the iomode specified in the layout segments it inconsistent with the iomode specified in the layout segments it
holds. holds.
(5) The metadata server MUST be able to do allocation and Under the file layout type, the storage devices are able to reject
deallocation of storage. I.e., creating and deleting files. any request made not conforming to these requirements. This may not
be possible for other known layout types, which puts the burden of
enforcing these requirements solely on the client. For these layout
types:
Under the file layout type, the storage devices are able to meet all (1) The metadata server MIGHT use fencing operations to the storage
of these requirements. However, this is not the case with the other devices to enforce layout revocation against the client.
known layout types, Instead, the burden is shifted to both:
(1) The client itself. (2) The metadata server MUST allow the clients to perform data I/O
against it, even if it has already granted the client a layout.
A layout type might discourage such I/O, but it can not forbid
it.
(2) The interaction of the metadata server and the client. (3) The metadata server MUST be able to do storage allocation,
whether that is to create, delete, extend, or truncate files.
The metadata server is responsible for giving the client enough The means to address these requirements will vary with the layout
information to make informed decisions and for trusting the client type. A control protocol will be used to effect these, whether a
implementation to do so. This communication would be through the purpose-built one, one identical to the storage protocol, or a new
callback operatios available to the metadata server, e.g., recalling standards-track control protocol.
a layout, a delegation, etc.
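The client obligations listed in this section can be illustrated with a minimal sketch. It assumes a simplified model in which a layout is a list of byte-range segments, each with an iomode; the names LayoutSegment and may_send_to_storage_device are hypothetical and do not come from [RFC5661]. As a further simplification, an I/O that spans more than one segment is not considered.

```python
from dataclasses import dataclass

# The iomode names mirror RFC 5661; everything else here is hypothetical.
LAYOUTIOMODE_READ = "READ"
LAYOUTIOMODE_RW = "RW"


@dataclass(frozen=True)
class LayoutSegment:
    offset: int   # first byte covered by the segment
    length: int   # number of bytes covered
    iomode: str   # LAYOUTIOMODE_READ or LAYOUTIOMODE_RW


def may_send_to_storage_device(segments, offset, count, is_write):
    """Return True only if one held segment covers the whole I/O range
    and its iomode is consistent with the kind of I/O requested."""
    for seg in segments:
        covers = seg.offset <= offset and offset + count <= seg.offset + seg.length
        mode_ok = seg.iomode == LAYOUTIOMODE_RW or not is_write
        if covers and mode_ok:
            return True
    # No suitable layout segment: the I/O must not go to the storage device.
    return False
```

When the function returns False, the client in this sketch would either obtain a suitable layout or send the I/O to the metadata server, which is always permitted to receive it.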
3.3. Editorial Requirements 3.3. Editorial Requirements
This section discusses how the protocol requirements discussed above This section discusses how the protocol requirements discussed above
need to be addressed in documents specifying a new layout type. need to be addressed in documents specifying a new layout type.
Depending on the interoperability model for the layout type in Depending on the interoperability model for the layout type in
question, this may involve the imposition of layout-type-specific question, this may involve the imposition of layout-type-specific
requirements that ensure appropriate interoperability of pNFS requirements that ensure appropriate interoperability of pNFS
components which are developed separately. components which are developed separately.
skipping to change at page 10, line 25 skipping to change at page 10, line 47
client, metadata server, and storage device act together to meet the client, metadata server, and storage device act together to meet the
protocol requirements discussed previously. If the document does not protocol requirements discussed previously. If the document does not
impose implementation requirements sufficient to ensure that these impose implementation requirements sufficient to ensure that these
semantic requirements are met, it is not appropriate for the working semantic requirements are met, it is not appropriate for the working
group to allow the document to move forward. group to allow the document to move forward.
Some examples include: Some examples include:
o If the metadata server does not have a means to invalidate a o If the metadata server does not have a means to invalidate a
stateid issued to the storage device to keep a particular client stateid issued to the storage device to keep a particular client
from accessing a specific file, then the layout type spefication from accessing a specific file, then the layout type specification
has to document how the metadata server is going to fence the has to document how the metadata server is going to fence the
client from access to the file on that storage device. client from access to the file on that storage device.
o If the metadata server implements mandatory byte-range locking o If the metadata server implements mandatory byte-range locking
when accessed directly by the client, it must do so when data is when accessed directly by the client, it must do so when data is
read or written using the designated storage protocol. read or written using the designated storage protocol.
4. Specifications of Existing Layout Types 4. Specifications of Original Layout Types
This section is not normative with regards to each of the presented This section is not normative with regards to each of the presented
types. This document does not update the specification of either the types. This document does not update the specification of either the
block layout type (see [RFC5663]) or the object layout type (see block layout type (see [RFC5663]) or the object layout type (see
[RFC5664]). Nor does it update Section 13 of [RFC5661], but rather [RFC5664]). Nor does it update Section 13 of [RFC5661], but rather
Section 12 of that document. In other words, it is the pNFS Section 12 of that document. In other words, it is the pNFS
requirements being updated, not the specification of the file layout requirements being updated rather than the specification of the file
type. layout type.
4.1. File Layout Type 4.1. File Layout Type
Because the storage protocol is a subset of NFSv4.1, the semantics of Because the storage protocol is a subset of NFSv4.1, the semantics of
the file layout type comes closest to the semantics of NFSv4.1 in the the file layout type comes closest to the semantics of NFSv4.1 in the
absence of pNFS. In particular, the stateid and principal used for absence of pNFS. In particular, the stateid and principal used for
I/O MUST have the same effect and be subject to the same validation on I/O MUST have the same effect and be subject to the same validation on
a data server as it would if the I/O were being performed on the a data server as it would have if the I/O were being performed on the
metadata server itself. The same set of validations apply whether metadata server itself. The same set of validations are applied
pNFS is in effect or not. whether pNFS is in effect or not.
And while for most implementations the storage devices can do the And while for most implementations the storage devices can do the
following validations: following validations:
(1) client holds a valid layout, (1) client holds a valid layout,
(2) client I/O matches the layout iomode, and, (2) client I/O matches the layout iomode, and,
(3) client does not go out of the byte ranges, (3) client does not go out of the byte ranges,
these are each presented as a "SHOULD" and not a "MUST". Actually, these are each presented as a "SHOULD" and not a "MUST". Actually,
the first point is presented as both: the first point is presented in [RFC5661] as both:
"MUST": in Section 13.6 of [RFC5661] "MUST": in Section 13.6
"As described in Section 12.5.1, a client MUST NOT send an I/O to "As described in Section 12.5.1, a client MUST NOT send an I/O to
a data server for which it does not hold a valid layout; the data a data server for which it does not hold a valid layout; the data
server MUST reject such an I/O." server MUST reject such an I/O."
"SHOULD": in Section 13.8 of [RFC5661] "SHOULD": in Section 13.8
"The iomode need not be checked by the data servers when clients "The iomode need not be checked by the data servers when clients
perform I/O. However, the data servers SHOULD still validate that perform I/O. However, the data servers SHOULD still validate that
the client holds a valid layout and return an error if the client the client holds a valid layout and return an error if the client
does not." does not."
However, it is just these layout specific checks that are optional, It should be noted that it is just these layout specific checks that
not the normal file access semantics. The storage devices MUST make are optional, not the normal file access semantics. The storage
all of the required access checks on each READ or WRITE I/O as devices MUST make all of the required access checks on each READ or
determined by the NFSv4.1 protocol. If the metadata server would WRITE I/O as determined by the NFSv4.1 protocol. If the metadata
deny a READ or WRITE operation on a file due to its ACL, mode server would deny a READ or WRITE operation on a file due to its ACL,
attribute, open access mode, open deny mode, mandatory byte-range mode attribute, open access mode, open deny mode, mandatory byte-
lock state, or any other attributes and state, the storage device range lock state, or any other attributes and state, the storage
MUST also deny the READ or WRITE operation. And note that while the device MUST also deny the READ or WRITE operation. Also while the
NFSv4.1 protocol does not mandate export access checks based on the NFSv4.1 protocol does not mandate export access checks based on the
client's IP address, if the metadata server implements such a policy, client's IP address, if the metadata server implements such a policy,
then that counts as such state as outlined above. then that counts as such state as outlined above.
The data filehandle provided by the PUTFH operation to the data The data filehandle provided by the PUTFH operation to the data
server is sufficient to ensure that for the subsequent READ or WRITE server provides sufficient context to enable the data server to
operation in the compound, the client has a valid layout for the ensure that for the subsequent READ or WRITE operation in the
I/O being performed. compound, the client has a valid layout for the I/O being
performed.
Finally, the data server can check the stateid presented in the READ Finally, the data server can check the stateid presented in the READ
or WRITE operation to see if that stateid has been rejected by the or WRITE operation to see if that stateid has been rejected by the
metadata server such to cause the I/O to be fenced. Whilst it might metadata server in order to cause the I/O to be fenced. Whilst it
just be the open owner or lock owner on that client being fenced, the might just be the open owner or lock owner on that client being
client should take the NFS4ERR_BAD_STATEID error code to mean it has fenced, the client should take the NFS4ERR_BAD_STATEID error code to
been fenced from the file and contact the metadata server. mean it has been fenced from the file and contact the metadata
server.
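The checks described in this section for a file layout data server can be sketched as follows. This is an illustration under stated assumptions rather than a description of any implementation: the function name, its parameters, and the order of the checks are hypothetical. The error names are NFSv4.1 error codes, but the use of NFS4ERR_ACCESS as a single stand-in for every denial the metadata server would issue is a simplification, as is the choice of NFS4ERR_PNFS_NO_LAYOUT for a missing layout.

```python
def data_server_check(stateid, client_id, fenced_stateids,
                      layout_holders, mds_would_allow):
    """Decide the result of a READ or WRITE sent to a data server.

    fenced_stateids: stateids the metadata server has rejected (hypothetical set)
    layout_holders:  clients currently holding a valid layout (hypothetical set)
    mds_would_allow: whether the metadata server itself would permit the I/O
                     (ACL, mode, open and deny modes, mandatory locks, and so on)
    """
    if stateid in fenced_stateids:
        # The client should treat this as being fenced from the file.
        return "NFS4ERR_BAD_STATEID"
    if client_id not in layout_holders:
        # Layout-specific check; see the MUST/SHOULD discussion in this section.
        return "NFS4ERR_PNFS_NO_LAYOUT"
    if not mds_would_allow:
        # Normal file access semantics are never optional.
        return "NFS4ERR_ACCESS"
    return "NFS4_OK"
```

A client receiving NFS4ERR_BAD_STATEID in this sketch would stop sending I/O to the data server and contact the metadata server.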
4.2. Block Layout Type 4.2. Block Layout Type
With the block layout type, the storage devices are not guaranteed to With the block layout type, the storage devices are generally not
be able to enforce file-based security. Typically, storage area able to enforce file-based security. Typically, storage area network
network (SAN) disk arrays and SAN protocols provide access control (SAN) disk arrays and SAN protocols provide coarse-grained access
mechanisms (e.g., Logical Unit Number (LUN) mapping and/or masking), control mechanisms (e.g., Logical Unit Number (LUN) mapping and/or
which operate at the granularity of individual hosts, not individual masking), with a target granularity of disks rather than individual
blocks. Access to block storage is logically at a lower layer of the blocks and a source granularity of individual hosts rather than of
I/O stack than NFSv4, and hence NFSv4 security is not directly users or owners. Access to block storage is logically at a lower
applicable to protocols that access such storage directly. As such, layer of the I/O stack than NFSv4. Since NFSv4 security is not
directly applicable to protocols that access such storage directly,
Section 2.1 [RFC5663] specifies that: Section 2.1 [RFC5663] specifies that:
"in environments where pNFS clients cannot be trusted to enforce "in environments where pNFS clients cannot be trusted to enforce
such policies, pNFS block layout types SHOULD NOT be used." such policies, pNFS block layout types SHOULD NOT be used."
As a result of these granularity issues, the security burden has been Due to these granularity issues, the security burden has been shifted
shifted from the storage devices to the client. Those deploying from the storage devices to the client. Those deploying
implementations of this layout type need to be sure that the client implementations of this layout type need to be sure that the client
implementation can be trusted. This is not a new sort of requirement implementation can be trusted. This is not a new sort of requirement
in the context of SAN protocols. In such environments, the client is in the context of SAN protocols. In such environments, the client is
expected to provide block-based protection. expected to provide block-based protection.
This shift of the burden also extends to locks and layouts. The This shift of the burden also extends to locks and layouts. The
storage devices are not able to enforce any of these and the burden storage devices are not able to enforce any of these and the burden
is pushed to the client to make the appropriate checks before sending is pushed to the client to make the appropriate checks before sending
I/O to the storage devices. For example, the server may use a layout I/O to the storage devices. For example, the server may use a layout
iomode only allowing reading to enforce a mandatory read-only lock. iomode only allowing reading to enforce a mandatory read-only lock.
In such cases, the client has to support that use by not sending In such cases, the client has to support that use by not sending
WRITEs to the storage devices. The fundamental issue here is that WRITEs to the storage devices. The fundamental issue here is that
the storage device is treated by this layout type as a local dumb the storage device is treated by this layout type in the same fashion
disk. Once the client has access to the storage device, it is able as a local disk device. Once the client has access to the storage
to perform both READ and WRITE I/O to the entire storage device. The device, it is able to perform both READ and WRITE I/O to the entire
byte ranges in the layout, any locks, the layout iomode, etc, can storage device. The byte ranges in the layout, any locks, the layout
only be enforced by the client. Therefore, the client is required to iomode, etc, can only be enforced by the client. Therefore, the
provide that enforcement. client is required to provide that enforcement.
In the context of fencing off of the client upon revocation of a In the context of fencing off of the client upon revocation of a
layout, these limitations come into play again, i.e., the granularity layout, these limitations come into play again, i.e., the granularity
of the fencing can only be at the host/logical-unit level. Thus, if of the fencing can only be at the host/logical-unit level. Thus, if
one of a client's layouts is revoked by the server, it will one of a client's layouts is revoked by the server, it will
effectively revoke all of the client's layouts for files located on effectively revoke all of the client's layouts for files located on
the storage units comprising the logical volume. This may extend to the storage units comprising the logical volume. This may extend to
the client's layouts for files in other file systems. Clients need the client's layouts for files in other file systems. Clients need
to be prepared for such revocations and reacquire layouts as needed. to be prepared for such revocations and reacquire layouts as needed.
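The coarse granularity of block layout fencing can be illustrated with a small sketch. It assumes a hypothetical table that maps each (client, file) layout to the logical unit holding the file; the function name and data shapes are invented for illustration and are not part of [RFC5663].

```python
def layouts_lost_on_revoke(layouts, client, file_id):
    """Return the files whose layouts the client effectively loses when the
    layout for file_id is revoked by fencing at the host/logical-unit level.

    layouts: dict mapping (client, file_id) -> logical unit identifier
    """
    lun = layouts[(client, file_id)]
    # Fencing cuts the client off from the whole logical unit, so every
    # layout this client holds on that unit is revoked with it.
    return {f for (c, f), unit in layouts.items() if c == client and unit == lun}
```

In this model, revoking one layout returns every file the client can reach on the same logical unit, which is why clients need to be prepared to reacquire layouts they did not expect to lose.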
skipping to change at page 13, line 30 skipping to change at page 14, line 4
objects corresponding to the requested iomode. These capabilities objects corresponding to the requested iomode. These capabilities
are used to enforce access control and locking semantics at the are used to enforce access control and locking semantics at the
storage devices. Whenever one of the following occur on the metadata storage devices. Whenever one of the following occur on the metadata
server: server:
o the permissions on the object change, o the permissions on the object change,
o a conflicting mandatory byte-range lock is granted, or o a conflicting mandatory byte-range lock is granted, or
o a layout is revoked and reassigned to another client, o a layout is revoked and reassigned to another client,
then the metadata server MUST change the capability version attribute
then the metadate server MUST change the capability version attribute
on all objects comprising the file in order to invalidate any on all objects comprising the file in order to invalidate any
outstanding capabilities before committing to one of these changes. outstanding capabilities before committing to one of these changes.
When the metadata server wishes to fence off a client to a particular When the metadata server wishes to fence off a client to a particular
object, then it can use the above approach to invalidate the object, then it can use the above approach to invalidate the
capability attribute on the given object. The client can be informed capability attribute on the given object. The client can be informed
via the storage device that the capability has been rejected and is via the storage device that the capability has been rejected and is
allowed to fetch a refreshed set of capabilities, i.e., re-acquire allowed to fetch a refreshed set of capabilities, i.e., re-acquire
the layout. the layout.
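The capability-version mechanism described above can be sketched as follows. The sketch assumes that a capability simply records the version that was current when it was issued and that the storage device accepts it only while the versions match; the class and function names are hypothetical, and real object storage capabilities carry considerably more state than is shown here.

```python
from dataclasses import dataclass


@dataclass
class StorageObject:
    object_id: int
    cap_version: int = 1   # capability version attribute kept on the object


@dataclass(frozen=True)
class Capability:
    object_id: int
    cap_version: int       # version current when the capability was issued


def issue_capability(obj):
    """Metadata server side: hand out a capability for the current version."""
    return Capability(obj.object_id, obj.cap_version)


def invalidate_capabilities(objects):
    """Metadata server side: bump the version on every object of the file,
    which invalidates all outstanding capabilities."""
    for obj in objects:
        obj.cap_version += 1


def storage_device_accepts(obj, cap):
    """Storage device side: reject capabilities issued for an older version."""
    return cap.object_id == obj.object_id and cap.cap_version == obj.cap_version
```

After invalidate_capabilities runs, a client holding an old capability is rejected by the storage device and would re-acquire the layout to obtain a fresh one.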
5. Summary 5. Summary
In the three published layout types, the burden of enforcing the In the three original layout types, the burden of enforcing the
security of NFSv4.1 can fall to either the storage devices (files), security of NFSv4.1 can fall to either the storage devices (files),
the client (blocks), or the metadata server (objects). Such choices the client (blocks), or the metadata server (objects). Such choices
are conditioned by the native capabilities of the storage devices - are conditioned by the native capabilities of the storage devices -
if a control protocol can be implemented, then the burden can be if a control protocol can be implemented, then the burden can be
shifted primarily to the storage devices. shifted primarily to the storage devices.
In the context of this document, we treat the control protocol as a In the context of this document, we treat the control protocol as a
set of requirements. And as new layout types are published, the set of requirements. And as new layout types are published, the
defining documents MUST address: defining documents MUST address:
skipping to change at page 14, line 31 skipping to change at page 14, line 50
This section does not deal directly with security considerations for This section does not deal directly with security considerations for
existing or new layout types. Instead, it provides a general existing or new layout types. Instead, it provides a general
framework for understanding security-related issues within the pNFS framework for understanding security-related issues within the pNFS
framework. Specific security considerations will be addressed in the framework. Specific security considerations will be addressed in the
Security Considerations sections of documents specifying layout Security Considerations sections of documents specifying layout
types. types.
The layout type specification must ensure that only data accesses The layout type specification must ensure that only data accesses
consistent with the NFSv4.1 security model are allowed. It may do consistent with the NFSv4.1 security model are allowed. It may do
this directly, by providing that appropriate checks be performed at this directly, by providing that appropriate checks be performed at
the time the access is performed. It may do it indirectly by the time each access is performed. It may do it indirectly by
allowing the client or the storage device to be responsible for allowing the client or the storage device to be responsible for
making the appropriate checks. In the latter case, I/O access rights making the appropriate checks. In the latter case, I/O access rights
are reflected in layouts and the layout type must provide a way to are reflected in layouts and the layout type must provide a way to
prevent inappropriate access due to permissions changes between the prevent inappropriate access due to permissions changes between the
time a layout is granted and the time the access is performed. time a layout is granted and the time the access is performed.
The metadata server MUST be able to fence off a client's access to The metadata server MUST be able to fence off a client's access to
the data file on a storage device. When it revokes the layout, the the data file on a storage device. When it revokes the layout, the
client's access MUST be terminated at the storage devices. The client's access MUST be terminated at the storage devices. The
client the has the opportunity to re-acquire the layout and perform client has a subsequent opportunity to re-acquire the layout and
the security check in the context of the newly current access perform the security check in the context of the newly current access
permissions. permissions.
7. IANA Considerations 7. IANA Considerations
This document has no actions for IANA. This document has no actions for IANA.
8. References 8. References
8.1. Normative References 8.1. Normative References
skipping to change at page 15, line 22 skipping to change at page 15, line 37
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File [RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol", RFC System (NFS) Version 4 Minor Version 1 Protocol", RFC
5661, January 2010. 5661, January 2010.
[RFC5663] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/ [RFC5663] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/
Volume Layout", RFC 5663, January 2010. Volume Layout", RFC 5663, January 2010.
[RFC5664] Halevy, B., Welch, B., and J. Zelenka, "Object-Based [RFC5664] Halevy, B., Welch, B., and J. Zelenka, "Object-Based
Parallel NFS (pNFS) Operations", RFC 5664, January 2010. Parallel NFS (pNFS) Operations", RFC 5664, January 2010.
[RFC8154] Hellwig, C., "Parallel NFS (pNFS) Small Computer System
Interface (SCSI) Layout", RFC 8154, DOI 10.17487/RFC8154,
May 2017, <http://www.rfc-editor.org/info/rfc8154>.
8.2. Informative References 8.2. Informative References
[FlexFiles] [FlexFiles]
Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
File Layout", draft-ietf-nfsv4-flex-files-11 (Work In File Layout", draft-ietf-nfsv4-flex-files-11 (Work In
Progress), July 2017. Progress), July 2017.
[Lustre] Faibish, S. and P. Tao, "Parallel NFS (pNFS) Lustre Layout [Lustre] Faibish, S. and P. Tao, "Parallel NFS (pNFS) Lustre Layout
Operations", draft-faibish-nfsv4-pnfs-lustre-layout-07 Operations", draft-faibish-nfsv4-pnfs-lustre-layout-07
(Work In Progress), April 2014. (Work In Progress), April 2014.
Appendix A. Acknowledgments Appendix A. Acknowledgments
Dave Noveck provided an early review that sharpened the clarity of Dave Noveck provided an early review that sharpened the clarity of
the definitions. He also provided a more comprehensive review of the the definitions. He also provided a more comprehensive review of the
document. document.
Both Chuck Lever and Christoph Hellwig provided insightful comments
during the WGLC.
Appendix B. RFC Editor Notes Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this [RFC Editor: please remove this section prior to publishing this
document as an RFC] document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document] RFC number of this document]
Author's Address Author's Address
 End of changes. 81 change blocks. 
197 lines changed or deleted 227 lines changed or added
