draft-ietf-nfsv4-layout-types-04.txt   draft-ietf-nfsv4-layout-types-05.txt 
NFSv4 T. Haynes NFSv4 T. Haynes
Internet-Draft Primary Data Internet-Draft Primary Data
Updates: 5661 (if approved) January 28, 2016 Updates: 5661 (if approved) July 20, 2017
Intended status: Standards Track Intended status: Standards Track
Expires: July 31, 2016 Expires: January 21, 2018
Requirements for pNFS Layout Types Requirements for pNFS Layout Types
draft-ietf-nfsv4-layout-types-04.txt draft-ietf-nfsv4-layout-types-05.txt
Abstract Abstract
This document provides help in distinguishing between the This document defines the requirements which individual pNFS layout
requirements for Network File System (NFS) version 4.1's Parallel NFS types need to meet in order to work within the parallel NFS (pNFS)
(pNFS) and those specifically directed to the pNFS File Layout. framework as defined in RFC5661. In so doing, it aims to more
The lack of a clear separation between the two sets of requirements clearly distinguish between requirements for pNFS as a whole and
has been troublesome for those specifying and evaluating new Layout those specifically directed to the pNFS File Layout. The lack
Types. As this document clarifies RFC5661, it effectively updates of a clear separation between the two sets of requirements has been
RFC5661. troublesome for those specifying and evaluating new Layout Types. In
this regard, this document effectively updates RFC5661.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 31, 2016. This Internet-Draft will expire on January 21, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Difference Between a Data Server and a Storage Device . . 4 2.1. Use of the Terms "Data Server" and "Storage Device" . . . 5
2.2. Requirements Language . . . . . . . . . . . . . . . . . . 4 2.2. Requirements Language . . . . . . . . . . . . . . . . . . 6
3. The Control Protocol . . . . . . . . . . . . . . . . . . . . 4 3. The Control Protocol . . . . . . . . . . . . . . . . . . . . 6
3.1. Protocol Requirements . . . . . . . . . . . . . . . . . . 5 3.1. Protocol REQUIREMENTS . . . . . . . . . . . . . . . . . . 7
3.2. Non-protocol Requirements . . . . . . . . . . . . . . . . 6 3.2. Undocumented Protocol REQUIREMENTS . . . . . . . . . . . 9
3.3. Editorial Requirements . . . . . . . . . . . . . . . . . 6 3.3. Editorial Requirements . . . . . . . . . . . . . . . . . 10
4. Implementations in Existing Layout Types . . . . . . . . . . 7 4. Specifications of Existing Layout Types . . . . . . . . . . . 10
4.1. File Layout Type . . . . . . . . . . . . . . . . . . . . 7 4.1. File Layout Type . . . . . . . . . . . . . . . . . . . . 10
4.2. Block Layout Type . . . . . . . . . . . . . . . . . . . . 7 4.2. Block Layout Type . . . . . . . . . . . . . . . . . . . . 12
4.3. Object Layout Type . . . . . . . . . . . . . . . . . . . 8 4.3. Object Layout Type . . . . . . . . . . . . . . . . . . . 13
5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
6. Security Considerations . . . . . . . . . . . . . . . . . . . 9 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 10 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 15
8.1. Normative References . . . . . . . . . . . . . . . . . . 10 8.1. Normative References . . . . . . . . . . . . . . . . . . 15
8.2. Informative References . . . . . . . . . . . . . . . . . 10 8.2. Informative References . . . . . . . . . . . . . . . . . 15
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 10 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 15
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 10 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 15
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 11 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 15
1. Introduction 1. Introduction
Both Parallel Network File System (pNFS) and the File Layout Type The concept of layout type has a central role in the definition and
were defined in the Network File System (NFS) version 4.1 protocol implementation of Parallel Network File System (pNFS). Clients and
specification, [RFC5661]. The Block Layout Type was defined in servers implementing different layout types behave differently in
[RFC5663] and the Object Layout Type was in turn defined in many ways while conforming to the overall pNFS framework defined in
[RFC5664]. [RFC5661] and this document. Layout types may differ in:
o The method used to do I/O operations directed to data storage
devices.
o The requirements for communication between the metadata server
(MDS) and the storage devices.
o The means used to ensure that I/O requests are only processed when
the client holds an appropriate layout.
o The format and interpretation of nominally opaque data fields in
pNFS-related NFSv4.x data structures.
Such matters are defined in a standards-track layout type
specification. Except for the files layout type, which was defined
in Section 13 of [RFC5661], existing layout types are defined in
their own standards-track documents and it is anticipated that new
layout types will be defined in similar documents.
The file layout type was defined in the Network File System (NFS)
version 4.1 protocol specification [RFC5661]. The block layout type
was defined in [RFC5663] and the object layout type was in turn
defined in [RFC5664].
Some implementers have interpreted the text in Sections 12 ("Parallel Some implementers have interpreted the text in Sections 12 ("Parallel
NFS (pNFS)") and 13 ("NFSv4.1 as a Storage Protocol in pNFS: the File NFS (pNFS)") and 13 ("NFSv4.1 as a Storage Protocol in pNFS: the File
Layout Type") of [RFC5661] as both being strictly for the File Layout Layout Type") of [RFC5661] as both applying only to the file
Type. I.e., since Section 13 was not covered in a separate RFC like layout type. Because Section 13 was not covered in a separate
those for both the Block and Object Layout Types, there is some standards-track document like those for both the block and object
confusion as to the responsibilities of both the Metadata Server layout types, there had been some confusion as to the
(MDS) and the Data Servers (DS) which were laid out in Section 12. responsibilities of both the metadata server and the data servers
(DS) which were laid out in Section 12.
As a consequence, new internet drafts (see [FlexFiles] and [Lustre]) As a consequence, new internet drafts (see [FlexFiles] and [Lustre])
may struggle to meet the requirements to be a pNFS Layout Type. This may struggle to meet the requirements to be a pNFS layout type. This
document clarifies what are the Layout Type independent requirements document specifies the layout type independent requirements placed on
placed on all Layout Types, whether one of the original three or any all layout types, whether one of the original three or any new
new variant. variant.
2. Definitions 2. Definitions
control protocol: is a set of requirements for the communication of control communication requirements: define for a layout type the
information on layouts, stateids, file metadata, and file data details regarding information on layouts, stateids, file metadata,
between the metadata server and the storage devices. and file data which must be communicated between the metadata
server and the storage devices.
(file) data: is that part of the file system object which describes control protocol: defines a particular mechanism that an
the payload and not the object. E.g., it is the file contents. implementation of a layout type would use to meet the control
communication requirement for that layout type. This need not be
a protocol as normally understood. In some cases the same
protocol may be used as a control protocol and data access
protocol.
Data Server (DS): is one of the pNFS servers which provide the (file) data: is that part of the file system object which contains
contents of a file system object which is a regular file. the data to be read or written. It is the contents of the object and
Depending on the layout, there might be one or more data servers not the attributes of the object.
over which the data is striped. Note that while the metadata
server is strictly accessed over the NFSv4.1 protocol, depending
on the Layout Type, the data server could be accessed via any
protocol that meets the pNFS requirements.
fencing: is when the metadata server prevents the storage devices data server (DS): is a pNFS server which provides the file's data
from processing I/O from a specific client to a specific file. when the file system object is accessed over a file-based
protocol. Note that this usage differs from that in [RFC5661]
which applies the term in some cases even when other sorts of
protocols are being used. Depending on the layout, there might be
one or more data servers over which the data is striped. While
the metadata server is strictly accessed over the NFSv4.1
protocol, depending on the layout type, the data server could be
accessed via any file access protocol that meets the pNFS
requirements.
layout: informs a client of which storage devices it needs to See Section 2.1 for a comparison of this term and "data storage
communicate with (and over which protocol) to perform I/O on a device".
file. The layout might also provide some hints about how the
storage is physically organized.
layout iomode: describes whether the layout granted to the client is fencing: is the process by which the metadata server prevents the
for read or read/write I/O. storage devices from processing I/O from a specific client to a
specific file.
layout: contains information a client uses to access file data on a
storage device. This information will include specification of
the protocol (layout type) and the identity of the storage devices
to be used.
The bulk of the contents of the layout are defined in [RFC5661]
as nominally opaque, but individual layout types may specify their
own interpretation of layout data.
layout iomode: see Section 1.
layout stateid: is a 128-bit quantity returned by a server that layout stateid: is a 128-bit quantity returned by a server that
uniquely defines the layout state provided by the server for a uniquely defines the layout state provided by the server for a
specific layout that describes a Layout Type and file (see specific layout that describes a layout type and file (see
Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 describes Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 describes
the difference between a layout stateid and a normal stateid. differences in handling between layout stateids and other stateid
types.
Layout Type: describes both the storage protocol used to access the layout type: describes both the storage protocol used to access the
data and the aggregation scheme used to lays out the file data on data and the aggregation scheme used to lay out the file data on
the underlying storage devices. the underlying storage devices.
(file) metadata: is that part of the file system object which loose coupling: describes when the control protocol, between a
describes the object and not the payload. E.g., it could be the metadata server and storage device, is a storage protocol.
time since last modification, access, etc.
Metadata Server (MDS): is the pNFS server which provides metadata (file) metadata: is that part of the file system object that
contains various descriptive data relevant to the file object, as
opposed to the file data itself. This could include the time of
last modification, access time, eof position, etc.
metadata server (MDS): is the pNFS server which provides metadata
information for a file system object. It also is responsible for information for a file system object. It also is responsible for
generating layouts for file system objects. Note that the MDS is generating, recalling, and revoking layouts for file system
responsible for directory-based operations. objects, for performing directory operations, and for performing I
/O operations to regular files when the clients direct these to
the metadata server itself.
recalling a layout: is when the metadata server uses a back channel recalling a layout: occurs when the metadata server issues a callback
to inform the client that the layout is to be returned in a to inform the client that the layout is to be returned in a
graceful manner. Note that the client could be able to flush any graceful manner. Note that the client could be able to flush any
writes, etc., before replying to the metadata server. writes, etc., before replying to the metadata server.
revoking a layout: is when the metadata server invalidates the revoking a layout: occurs when the metadata server invalidates a
layout such that neither the metadata server nor any storage specific layout. Once revocation occurs, the metadata server will
device will accept any access from the client with that layout. not accept as valid any reference to the revoked layout and a
storage device will not accept any client access based on the
layout.
stateid: is a 128-bit quantity returned by a server that uniquely stateid: is a 128-bit quantity returned by a server that uniquely
defines the open and locking states provided by the server for a defines the set of locking-related state provided by the server.
specific open-owner or lock-owner/open-owner pair for a specific Stateids may designate state related to open files, to byte-range
file and type of lock. locks, to delegations, or to layouts.
storage device: is another term used almost interchangeably with storage device: designates the target to which clients may direct I/
data server. See Section 2.1 for the nuances between the two. O requests when they hold an appropriate layout. Note that each
data server is a storage device but that some storage devices are
not data servers. See Section 2.1 for further discussion.
2.1. Difference Between a Data Server and a Storage Device storage protocol: is the protocol used by clients to do I/O
operations to the storage device. Each layout type may specify its
own storage protocol. It is possible for a layout type to specify
multiple access protocols.
We defined a data server as a pNFS server, which implies that it can tight coupling: describes when the control protocol, between a
utilize the NFSv4.1 protocol to communicate with the client. As metadata server and storage device, is either a proprietary
such, only the File Layout Type would currently meet this approach or based on a standards-track document.
requirement. The more generic concept is a storage device, which can
use any protocol to communicate with the client. The requirements
for a storage device to act together with the metadata server to
provide data to a client are that there is a Layout Type
specification for the given protocol and that the metadata server has
granted a layout to the client. Note that nothing precludes there
being multiple supported Layout Types (i.e., protocols) between a
metadata server, storage devices, and client.
As storage device is the more encompassing terminology, this document 2.1. Use of the Terms "Data Server" and "Storage Device"
utilizes it over data server.
In [RFC5661], the two terms "Data Server" and "Storage
Device" are used somewhat inconsistently:
o In chapter 12, where pNFS in general is discussed, the term
"storage device" is used.
o In chapter 13, where the file layout type is discussed, the term
"data server" is used.
o In other chapters, the term "data server" is used, even in
contexts where the storage access type is not NFSv4.1 or any other
file access protocol.
As this document deals with pNFS in general, it uses the more generic
term "storage device" in preference to "data server". The term "data
server" is used only in contexts in which a file server is used as a
storage device. Note that every data server is a storage device but
that storage devices which use protocols which are not file access
protocols are not data servers.
Since a given storage device may support multiple layout types, a
given device can potentially act as a data server for some set of
storage protocols while simultaneously acting as a non-data-server
storage device for others.
2.2. Requirements Language 2.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
3. The Control Protocol 3. The Control Protocol
In Section 12.2.6 of [RFC5661], the control protocol is introduced. In Section 12.2.6 of [RFC5661], the control protocol was introduced.
There have been no specifications for control protocols, and indeed There have been no published specifications for control protocols as
there need not be such a protocol in use for any given yet. The control protocol denotes any mechanism used to meet the
implementation. The control protocol is actually a set of requirements that apply to the interaction between the metadata
requirements provided to describe the interaction between the server and the storage device such that they present a consistent
metadata server and the storage device. When specifying a new Layout interface to the client. Particular implementations may satisfy this
Type, the defining document MUST show how it meets these requirement in any manner they choose and the mechanism chosen may
requirements, especially with respect to the security implications. not be described as a protocol. Specifications defining layout types
need to clearly show how implementations can meet the requirements
discussed below, especially with respect to those that have security
implications. In addition, such specifications may find it necessary
to impose requirements on implementations of the layout type to
ensure appropriate interoperability.
3.1. Protocol Requirements In some cases, there may be no control protocol other than the
storage protocol. This is often described as using a "loose
coupling" model. In such cases, the assumption is that the metadata
server, storage devices, and client may be changed independently and
that the implementation requirements in the layout type specification
need to ensure this degree of interoperability. This model is used
in the block and object layout type specifications.
The broad requirements of such interactions between the metadata
server and the storage devices are: In other cases, it is assumed that there may be a purpose-built
control protocol which may be different for different implementations
of the metadata server and data server. In such cases, the
assumption is that the metadata server and data servers are designed
and implemented as a unit and interoperability needs to be assured
between clients and metadata-data server pairs, developed
independently. This is the model used for the files layout.
(1) NFSv4.1 clients MUST be able to access a file directly through
the metadata server and not the storage device. I.e., the Another possibility, not so far realized, is for the
metadata server must be able to retrieve the data from the definition of a control protocol to be specified in a standards-track
constituent storage devices and present it back to the client document. There are two subcases to consider:
via normal NFSv4.1 operations. Whether the metadata server
allows access over other protocols (e.g., NFSv3, Server Message o A new layout type includes a definition of a particular control
Block (SMB), etc) is strictly an implementation choice. protocol whose use is obligatory for metadata servers and storage
devices implementing the layout type. In this case the
interoperability model is similar to the first case above and the
defining document should assure interoperability among metadata
servers, storage devices, and clients developed independently.
o A control protocol is defined in a standards-track document which
meets the control protocol requirements for one of the existing
layout types. In this case, the new document's job is to assure
interoperability between metadata servers and storage devices
developed separately. The existing definition document for the
selected layout type retains the function of assuring
interoperability between clients and a given collection of
metadata servers and storage devices. In this context,
implementations that implement the new protocol are treated in the
same way as those that use an internal control protocol or a
functional equivalent.
3.1. Protocol REQUIREMENTS
The REQUIREMENTS of such interactions between the metadata server and
the storage devices are:
(1) The metadata server MUST be able to service the client's I/O
requests if the client decides to make such requests to the
metadata server instead of to the storage device. The metadata
server must be able to retrieve the data from the constituent
storage devices and present it back to the client. A corollary
to this is that even though the metadata server has successfully
given the client a layout, the client MAY still send I/O
requests to the metadata server.
Whether the metadata server allows access over other protocols
(e.g., NFSv3, Server Message Block (SMB), etc) is strictly an
implementation choice, just as it is in the case of any other
(i.e., non-pNFS-supporting) NFSv4.1 server.
(2) The metadata server MUST be able to restrict access to a file on (2) The metadata server MUST be able to restrict access to a file on
the storage devices when it revokes a layout. The metadata the storage devices when it revokes a layout. The metadata
server typically would revoke a layout whenever a client fails server typically would revoke a layout whenever a client fails
to respond to a recall or fails to renew its lease in time. It to respond to a recall or a client's lease is expired due to
might also revoke the layout as a means of enforcing a change in non-renewal. It might also revoke the layout as a means of
state that the storage device cannot directly enforce with the enforcing a change in locking state or access permissions that
client. the storage device cannot directly enforce.
(3) Storage devices MUST NOT remove NFSv4.1's access controls: ACLs Effective revocation may require client co-operation in using a
and file open modes. particular stateid (files layout) or principal (e.g., flexible
files layout) when performing I/O.
(3) A pNFS implementation MUST NOT remove NFSv4.1's access
controls: ACLs and file open modes. While Section 12.9 of
[RFC5661] specifically lays this burden on the combination of
clients, storage devices, and the metadata server, depending on
the implementation, there might be a requirement that the
metadata server update the storage device such that it can
enforce security.
The file layout requires the storage device to enforce access
whereas the flex file layout requires both the storage device
and the client to enforce security.
(4) Locking MUST be respected. (4) Locking MUST be respected.
(5) The metadata server and the storage devices MUST agree on (5) The metadata server and the storage devices MUST agree on
attributes like modify time, the change attribute, and the end- attributes like modify time, the change attribute, and the end-
of-file (EOF) position. of-file (EOF) position.
Note that "agree" here means that some state changes need not be (a) "Agree" in the sense that while some state changes need not
propagated immediately, although all changes SHOULD be be propagated immediately, they must be propagated when
propagated promptly. accessed by the client. This access is typically in
response to a GETATTR of those attributes.
Note that there is no requirement on how these are implemented. (b) A particular storage device might be striped such that it knows
While the File Layout Type does use the stateid to fence off the nothing about the EOF position. It still meets the
client, there is no requirement that other Layout Types use this requirement of agreeing on that fact with the metadata
stateid approach. But the other Layout Types MUST document how the server.
client, metadata server, and storage devices interact to meet these
requirements.
3.2. Non-protocol Requirements (c) Both clock skew and network delay can lead to the metadata
server and the storage device having different concepts of
the time attributes. As long as those differences can be
accounted for in what is presented to the client in a GETATTR,
then the two "agree".
(d) A LAYOUTCOMMIT requires that storage device generated
changes in attributes need be reflected in the metadata
server by the completion of the operation.
These requirements may be satisfied in different ways by different
layout types. As an example, while the file layout type does use the
stateid to fence off the client, there is no requirement that other
layout types use this stateid approach.
Each new standards-track document for a layout type MUST address how
the client, metadata server, and storage devices interact to meet
these requirements.
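As an illustration of requirement (2), the following is a minimal sketch of stateid-based fencing in the style the text attributes to the file layout type. It is not taken from any specification: the class names, the method names, and the direct allow() and fence() calls between the two objects are hypothetical stand-ins for whatever control protocol an implementation actually uses.

```python
class StorageDevice:
    """Toy storage device that honors only stateids the MDS has registered."""

    def __init__(self):
        # Set of (client_id, file_id, stateid) triples currently allowed.
        self._valid = set()

    def allow(self, client_id, file_id, stateid):
        # Hypothetical control-protocol message: permit I/O with this stateid.
        self._valid.add((client_id, file_id, stateid))

    def fence(self, client_id, file_id):
        # Hypothetical control-protocol message: drop every stateid that
        # this client holds for this file.
        self._valid = {t for t in self._valid if t[:2] != (client_id, file_id)}

    def do_io(self, client_id, file_id, stateid):
        # I/O is processed only while the triple is still registered.
        return (client_id, file_id, stateid) in self._valid


class MetadataServer:
    """Toy metadata server that grants and revokes layouts."""

    def __init__(self, device):
        self._device = device
        self._next_stateid = 0

    def grant_layout(self, client_id, file_id):
        # Simplification: a small integer stands in for the 128-bit stateid.
        self._next_stateid += 1
        self._device.allow(client_id, file_id, self._next_stateid)
        return self._next_stateid

    def revoke_layout(self, client_id, file_id):
        # Revocation fences one client from one file; others are unaffected.
        self._device.fence(client_id, file_id)
```

In this sketch, revoking the layout held by one client leaves the layouts of other clients on the same file usable, which matches the definition of fencing as applying to a specific client and a specific file. A layout type that cannot invalidate stateids on the storage device would need a different mechanism.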
3.2. Undocumented Protocol REQUIREMENTS
In gathering the requirements from Section 12 of [RFC5661], there are In gathering the requirements from Section 12 of [RFC5661], there are
some which are notable in their absence: some which are notable in their absence:
(1) Storage device MUST honor the byte range restrictions present in (1) Clients MUST NOT perform I/O to the storage device if they do
the layout. I.e., if the layout only provides access to the not have layouts for the files in question.
first 2 MB of the file, then any access after that MUST NOT be
granted.
(2) The enforcement of authentication and authorization so that (2) Clients MUST be allowed to perform I/O to the metadata server
restrictions that would be enforced by the metadata server are even if they already have a LAYOUT. A layout type might
also enforced by the storage device. Examples include both discourage such I/O, but it can not forbid it.
export access checks and if the layout has an iomode of
LAYOUTIOMODE4_READ, then if the client attempts to write, the I/
O may be rejected.
While storage devices should make such checks on the layout (3) Clients MUST NOT perform I/O operations outside of the specified
iomode, [RFC5661] does not mandate that all Layout Types have to ranges in the layout segment.
make such checks.
(3) The allocation and deallocation of storage. I.e., creating and (4) Clients MUST NOT perform I/O operations which would be
deleting files. inconsistent with the iomode specified in the layout segments they
hold.
Of these, the first two are of concern to this draft and Layout Types (5) The metadata server MUST be able to do allocation and
SHOULD honor them if at all possible, deallocation of storage. I.e., creating and deleting files.
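The client-side checks on byte range and iomode can be sketched as follows. This is an illustrative sketch only, under stated assumptions: the LayoutSegment class and the client_may_do_io function are hypothetical names, the two iomode values mirror LAYOUTIOMODE4_READ and LAYOUTIOMODE4_RW, and a request that spans more than one segment is conservatively rejected rather than split.

```python
from dataclasses import dataclass
from enum import Enum


class IoMode(Enum):
    # Mirrors the LAYOUTIOMODE4_READ / LAYOUTIOMODE4_RW distinction.
    READ = 1
    RW = 2


@dataclass(frozen=True)
class LayoutSegment:
    # Hypothetical, simplified view of one layout segment held by a client.
    offset: int
    length: int
    iomode: IoMode


def client_may_do_io(segments, offset, length, is_write):
    """Return True if one held segment covers the whole request and its
    iomode permits the operation; otherwise return False."""
    if length <= 0:
        return False
    end = offset + length
    for seg in segments:
        covers = seg.offset <= offset and end <= seg.offset + seg.length
        mode_ok = (not is_write) or seg.iomode is IoMode.RW
        if covers and mode_ok:
            return True
    # No layout, or no segment that covers the range with a suitable iomode.
    return False
```

For example, a client holding a read-only segment for the first 2 MB of a file would be allowed to read within that range, but not to write there or to read past the end of the segment. When the check fails, the client can still direct the I/O to the metadata server.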
Under the file layout type, the storage devices are able to meet all
of these requirements. However, this is not the case with the other
known layout types. Instead, the burden is shifted to both:
(1) The client itself.
(2) The interaction of the metadata server and the client.
The metadata server is responsible for giving the client enough
information to make informed decisions and for trusting the client
implementation to do so. This communication would be through the
callback operations available to the metadata server, e.g., recalling
a layout, a delegation, etc.
3.3. Editorial Requirements 3.3. Editorial Requirements
In addition to these protocol requirements, there are two editorial This section discusses how the protocol requirements discussed above
requirements for drafts that present a new Layout Type. At a need to be addressed in documents specifying a new layout type.
minimum, the specification needs to address: Depending on the interoperability model for the layout type in
question, this may involve the imposition of layout-type-specific
requirements that ensure appropriate interoperability of pNFS
components which are developed separately.
(1) The approach the new Layout Type takes towards fencing clients The specification of the layout type needs to make clear how the
once the metadata server determines that the layout is revoked. client, metadata server, and storage device act together to meet the
protocol requirements discussed previously. If the document does not
impose implementation requirements sufficient to ensure that these
semantic requirements are met, it is not appropriate for the working
group to allow the document to move forward.
(2) The security considerations of the new Layout Type. Some examples include:
While these could be envisioned as one section in that the fencing o If the metadata server does not have a means to invalidate a
issue might be the only security issue, it is recommended to deal stateid issued to the storage device to keep a particular client
with them separately. from accessing a specific file, then the layout type specification
has to document how the metadata server is going to fence the
client from access to the file on that storage device.
The specification of the Layout Type should discuss how the client, o If the metadata server implements mandatory byte-range locking
metadata server, and storage device act together to meet the protocol when accessed directly by the client, it must do so when data is
requirements. I.e., if the storage device cannot enforce mandatory read or written using the designated storage protocol.
byte-range locks, then how can the metadata server and the client
interact with the layout to enforce those locks?
4. Implementations in Existing Layout Types 4. Specifications of Existing Layout Types
This section is not normative with regard to each of the presented
types. This document does not update the specification of either the
block layout type (see [RFC5663]) or the object layout type (see
[RFC5664]). Nor does it update Section 13 of [RFC5661], but rather
Section 12 of that document. In other words, it is the pNFS
requirements being updated, not the specification of the file layout
type.
4.1. File Layout Type 4.1. File Layout Type
Not surprisingly, the File Layout Type comes closest to the normal Because the storage protocol is a subset of NFSv4.1, the semantics of
semantics of NFSv4.1. In particular, the stateid used for I/O MUST the file layout type come closest to the semantics of NFSv4.1 in the
have the same effect and be subject to the same validation on a data absence of pNFS. In particular, the stateid and principal used for I/O
server as it would if the I/O was being performed on the metadata MUST have the same effect and be subject to the same validation on
server itself in the absence of pNFS. a data server as it would if the I/O were being performed on the
metadata server itself. The same set of validations applies whether
pNFS is in effect or not.
And while for most implementations the storage devices can do the And while for most implementations the storage devices can do the
following validations: following validations:
o client holds a valid layout, (1) client holds a valid layout,
o client I/O matches the layout iomode, and, (2) client I/O matches the layout iomode, and,
o client does not go out of the byte ranges, (3) client does not go out of the byte ranges,
these are each presented as a "SHOULD" and not a "MUST". However, it these are each presented as a "SHOULD" and not a "MUST". Actually,
is just these layout specific checks that are optional, not the the first point is presented as both:
normal file access semantics. The storage devices MUST make all of
the required access checks on each READ or WRITE I/O as determined by
the NFSv4.1 protocol. If the metadata server would deny a READ or
WRITE operation on a file due to its ACL, mode attribute, open access
mode, open deny mode, mandatory byte-range lock state, or any other
attributes and state, the storage device MUST also deny the READ or
WRITE operation. And note that while the NFSv4.1 protocol does not
mandate export access checks based on the client's IP address, if the
metadata server implements such a policy, then that counts as such
state as outlined above.
As the data filehandle provided by the PUTFH operation and the "MUST": in Section 13.6 of [RFC5661]
stateid in the READ or WRITE operation are used to ensure that the
client has a valid layout for the I/O being performed, the client can "As described in Section 12.5.1, a client MUST NOT send an I/O to
be fenced off for access to a specific file via the invalidation of a data server for which it does not hold a valid layout; the data
either key. server MUST reject such an I/O."
"SHOULD": in Section 13.8 of [RFC5661]
"The iomode need not be checked by the data servers when clients
perform I/O. However, the data servers SHOULD still validate that
the client holds a valid layout and return an error if the client
does not."
However, it is just these layout specific checks that are optional,
not the normal file access semantics. The storage devices MUST make
all of the required access checks on each READ or WRITE I/O as
determined by the NFSv4.1 protocol. If the metadata server would
deny a READ or WRITE operation on a file due to its ACL, mode
attribute, open access mode, open deny mode, mandatory byte-range
lock state, or any other attributes and state, the storage device
MUST also deny the READ or WRITE operation. And note that while the
NFSv4.1 protocol does not mandate export access checks based on the
client's IP address, if the metadata server implements such a policy,
then that counts as such state as outlined above.
The data filehandle provided by the PUTFH operation to the data
server is sufficient to ensure that, for the subsequent READ or WRITE
operation in the compound, the client has a valid layout for the
I/O being performed.
Finally, the data server can check the stateid presented in the READ
or WRITE operation to see if that stateid has been rejected by the
metadata server so as to cause the I/O to be fenced. Whilst it might
just be the open owner or lock owner on that client being fenced, the
client should take the NFS4ERR_BAD_STATEID error code to mean it has
been fenced from the file and contact the metadata server.
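The file layout checks in Section 4.1 can be illustrated with a minimal sketch. It assumes a hypothetical in-memory view of metadata-server state (FileState) and a hypothetical helper (check_io); neither name comes from RFC 5661. Only the error names NFS4ERR_BAD_STATEID and NFS4ERR_ACCESS are taken from NFSv4.1. The point is only that the data server applies the same stateid and principal validation the metadata server would, and that a rejected stateid is how the client learns it has been fenced.

```python
from dataclasses import dataclass, field

NFS4_OK = "NFS4_OK"
NFS4ERR_ACCESS = "NFS4ERR_ACCESS"
NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"


@dataclass
class FileState:
    """Hypothetical per-file state the data server shares with the metadata server."""
    valid_stateids: set = field(default_factory=set)  # stateids not rejected by the MDS
    readers: set = field(default_factory=set)         # principals permitted to READ
    writers: set = field(default_factory=set)         # principals permitted to WRITE


def check_io(state, op, stateid, principal):
    """Return the status a data server would give for a READ or WRITE (illustrative)."""
    if op not in ("READ", "WRITE"):
        raise ValueError("op must be READ or WRITE")
    # A stateid invalidated by the metadata server fences the client.
    if stateid not in state.valid_stateids:
        return NFS4ERR_BAD_STATEID
    # Normal file access semantics still apply on the data server.
    allowed = state.readers if op == "READ" else state.writers
    if principal not in allowed:
        return NFS4ERR_ACCESS
    return NFS4_OK
```

In this sketch, fencing is modeled by removing the stateid from valid_stateids; a client that then receives NFS4ERR_BAD_STATEID would be expected to contact the metadata server.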
4.2. Block Layout Type 4.2. Block Layout Type
With the Block Layout Type, the storage devices are not guaranteed to With the block layout type, the storage devices are not guaranteed to
be able to enforce file-based security. Typically, storage area be able to enforce file-based security. Typically, storage area
network (SAN) disk arrays and SAN protocols provide access control network (SAN) disk arrays and SAN protocols provide access control
mechanisms (e.g., Logical Unit Number (LUN) mapping and/or masking), mechanisms (e.g., Logical Unit Number (LUN) mapping and/or masking),
which operate at the granularity of individual hosts, not individual which operate at the granularity of individual hosts, not individual
blocks. Access to block storage is logically at a lower layer of the blocks. Access to block storage is logically at a lower layer of the
I/O stack than NFSv4, and hence NFSv4 security is not directly I/O stack than NFSv4, and hence NFSv4 security is not directly
applicable to protocols that access such storage directly. As such, applicable to protocols that access such storage directly. As such,
Section 2.1 [RFC5663] specifies that:
[RFC5663] is very careful to define that in environments where pNFS "in environments where pNFS clients cannot be trusted to enforce
clients cannot be trusted to enforce such policies, pNFS Block Layout such policies, pNFS block layout types SHOULD NOT be used."
Types SHOULD NOT be used.
The implication here is that the security burden has shifted from the As a result of these granularity issues, the security burden has been
storage devices to the client. It is the responsibility of the shifted from the storage devices to the client. Those deploying
administrator doing the deployment to trust the client implementations of this layout type need to be sure that the client
implementation. However, this is not a new requirement when it comes implementation can be trusted. This is not a new sort of requirement
to SAN protocols, the client is expected to provide block-based in the context of SAN protocols. In such environments, the client is
protection. expected to provide block-based protection.
This implication also extends to ACLs, locks, and layouts. The This shift of the burden also extends to locks and layouts. The
storage devices might not be able to enforce any of these and the storage devices are not able to enforce any of these and the burden
burden is pushed to the client to make the appropriate checks before is pushed to the client to make the appropriate checks before sending
sending I/O to the storage devices. As an example, if the metadata I/O to the storage devices. For example, the server may use a layout
server uses a layout iomode for reading to enforce a mandatory read- iomode only allowing reading to enforce a mandatory read-only lock.
only lock, then the client has to honor that intent by not sending In such cases, the client has to support that use by not sending
WRITEs to the storage devices. The basic issue here is that the WRITEs to the storage devices. The fundamental issue here is that
storage device can be treated as a local dumb disk such that once the the storage device is treated by this layout type as a local dumb
client has access to the storage device, it is able to perform either disk. Once the client has access to the storage device, it is able
READ or WRITE I/O to the entire storage device. The byte ranges in to perform both READ and WRITE I/O to the entire storage device. The
the layout, any locks, the layout iomode, etc, can only be enforced byte ranges in the layout, any locks, the layout iomode, etc, can
by the client. only be enforced by the client. Therefore, the client is required to
provide that enforcement.
While the Block Layout Type does support client fencing upon revoking In the context of fencing off of the client upon revocation of a
a layout, the above restrictions come into play again: the layout, these limitations come into play again, i.e., the granularity
granularity of the fencing can only be at the host/logical-unit of the fencing can only be at the host/logical-unit level. Thus, if
level. Thus, if one of a client's layouts is unilaterally revoked by one of a client's layouts is revoked by the server, it will
the server, it will effectively render useless *all* of the client's effectively revoke all of the client's layouts for files located on
layouts for files located on the storage units comprising the logical the storage units comprising the logical volume. This may extend to
volume. This may render useless the client's layouts for files in the client's layouts for files in other file systems. Clients need
other file systems. to be prepared for such revocations and reacquire layouts as needed.
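Because the block storage device cannot enforce byte ranges or iomode, the client has to check each I/O against its layout before sending it. The following is a minimal sketch of such a client-side check; the Layout type and the client_may_issue helper are hypothetical names chosen for illustration, and the iomode strings "READ" and "RW" merely stand in for the read-only and read/write layout iomodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical client-side record of a granted layout segment."""
    offset: int   # first byte covered by the layout
    length: int   # number of bytes covered
    iomode: str   # "READ" or "RW" (illustrative stand-ins for the layout iomode)


def client_may_issue(layout, op, offset, length):
    """Return True only if the client itself should allow this I/O (illustrative)."""
    if offset < 0 or length <= 0:
        return False
    # The I/O must stay inside the byte range of the layout.
    end = offset + length
    if offset < layout.offset or end > layout.offset + layout.length:
        return False
    # A read-only layout must never be used to send WRITEs.
    if op == "WRITE":
        return layout.iomode == "RW"
    return op == "READ"
```

A real client would also need to consult its lock state and react to layout revocation; this sketch covers only the byte-range and iomode checks named in the text.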
4.3. Object Layout Type 4.3. Object Layout Type
The Object Layout Type focuses security checks to occur during the With the object layout type, security checks occur during the
allocation of the layout. The client will typically ask for a layout allocation of the layout. The client will typically ask for layouts
for each byte-range of either READ or READ/WRITE. At that time, the covering all of the file and may do so for either READ or READ/WRITE.
metadata server should verify permissions against the layout iomode, This enables it to do subsequent I/O operations without the need to
the outstanding locks, the file mode bits or ACLs, etc. As the obtain layouts for specific byte ranges. At that time, the metadata
client may be acting for multiple local users, it MUST authenticate server should verify permissions against the layout iomode, the file
and authorize the user by issuing respective OPEN and ACCESS calls to mode bits or ACLs, etc. As the client may be acting for multiple
the metadata server, similar to having NFSv4 data delegations. local users, it MUST authenticate and authorize the user by issuing
respective OPEN and ACCESS calls to the metadata server, similar to
having NFSv4 data delegations.
Upon successful authorization, inside the layout, the client receives Upon successful authorization, the client receives within the layout
a set of object capabilities allowing it I/O access to the specified a set of object capabilities allowing it I/O access to the specified
objects corresponding to the requested iomode. These capabilities objects corresponding to the requested iomode. These capabilities
are used to enforce access control at the storage devices. Whenever are used to enforce access control and locking semantics at the
the metadata server detects one of: storage devices. Whenever one of the following occurs on the metadata
server:
o the permissions on the object change, o the permissions on the object change,
o a conflicting mandatory byte-range lock is granted, or o a conflicting mandatory byte-range lock is granted, or
o a layout is revoked and reassigned to another client, o a layout is revoked and reassigned to another client,
then it MUST change the capability version attribute on all objects then the metadata server MUST change the capability version attribute
comprising the file to implicitly invalidate any outstanding on all objects comprising the file in order to invalidate any
capabilities before committing to one of these changes. outstanding capabilities before committing to one of these changes.
When the metadata server wishes to fence off a client to a particular When the metadata server wishes to fence off a client to a particular
object, then it can use the above approach to invalidate the object, then it can use the above approach to invalidate the
capability attribute on the given object. The client can be informed capability attribute on the given object. The client can be informed
via the storage device that the capability has been rejected and is via the storage device that the capability has been rejected and is
allowed to fetch a refreshed set of capabilities, i.e., re-acquire allowed to fetch a refreshed set of capabilities, i.e., re-acquire
the layout. the layout.
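The capability-version mechanism described for the object layout type can be sketched as follows. This is a simplified model, not the encoding used by RFC 5664: StoredObject, Capability, and the three helper functions are hypothetical names, and the version is modeled as a plain integer that the metadata server bumps to invalidate outstanding capabilities.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    """Hypothetical object on a storage device, with a capability version attribute."""
    cap_version: int = 1


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability handed to the client inside a layout."""
    version: int
    iomode: str  # "READ" or "RW"


def issue_capability(obj, iomode):
    """Metadata server side: bind a capability to the object's current version."""
    return Capability(version=obj.cap_version, iomode=iomode)


def invalidate_capabilities(objects):
    """Metadata server side: change the version on all objects comprising the file."""
    for obj in objects:
        obj.cap_version += 1


def device_accepts(obj, cap, op):
    """Storage device side: reject stale capabilities and disallowed operations."""
    if cap.version != obj.cap_version:
        return False
    if op == "READ":
        return True
    return op == "WRITE" and cap.iomode == "RW"
```

Under this model, a client whose capability is rejected would return to the metadata server and re-acquire the layout, at which point permissions are checked again against the current state.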
5. Summary 5. Summary
In the three published Layout Types, the burden of enforcing the In the three published layout types, the burden of enforcing the
security of NFSv4.1 can fall to either the storage devices (Files), security of NFSv4.1 can fall to either the storage devices (files),
the client (Blocks), or the metadata server (Objects). Such the client (blocks), or the metadata server (objects). Such choices
decisions seem to be forced by the native capabilities of the storage are conditioned by the native capabilities of the storage devices -
devices - if a real control protocol can be implemented, then the if a control protocol can be implemented, then the burden can be
burden can be shifted primarily to the storage devices. shifted primarily to the storage devices.
But as we have seen, the control protocol is actually a set of In the context of this document, we treat the control protocol as a
requirements. And as new Layout Types are published, the enclosing set of requirements. And as new layout types are published, the
documents minimally MUST address: defining documents MUST address:
(1) The fencing of clients after a layout is revoked. (1) The fencing of clients after a layout is revoked.
(2) The security implications of the native capabilities of the (2) The security implications of the native capabilities of the
storage devices with respect to the requirements of the NFSv4.1 storage devices with respect to the requirements of the NFSv4.1
security model. security model.
In addition, these defining documents need to make clear how other
semantic requirements of NFSv4.1 (e.g., locking) are met in the
context of the proposed layout type.
6. Security Considerations 6. Security Considerations
The metadata server MUST be able to fence off a client's access to a This section does not deal directly with security considerations for
file stored on a storage device. When it revokes the layout, the existing or new layout types. Instead, it provides a general
client's access MUST be terminated at the storage devices. framework for understanding security-related issues within the pNFS
framework. Specific security considerations will be addressed in the
Security Considerations sections of documents specifying layout
types.
The layout type specification must ensure that only data accesses
consistent with the NFSv4.1 security model are allowed. It may do
this directly, by providing that appropriate checks be performed at
the time the access is performed. It may do it indirectly by
allowing the client or the storage device to be responsible for
making the appropriate checks. In the latter case, I/O access rights
are reflected in layouts and the layout type must provide a way to
prevent inappropriate access due to permissions changes between the
time a layout is granted and the time the access is performed.
The metadata server MUST be able to fence off a client's access to
the data file on a storage device. When it revokes the layout, the
client's access MUST be terminated at the storage devices. The
client then has the opportunity to re-acquire the layout and perform
the security check in the context of the newly current access
permissions.
7. IANA Considerations 7. IANA Considerations
This document has no actions for IANA. This document has no actions for IANA.
8. References 8. References
8.1. Normative References 8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
skipping to change at page 10, line 30 skipping to change at page 15, line 26
[RFC5663] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/ [RFC5663] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/
Volume Layout", RFC 5663, January 2010. Volume Layout", RFC 5663, January 2010.
[RFC5664] Halevy, B., Welch, B., and J. Zelenka, "Object-Based [RFC5664] Halevy, B., Welch, B., and J. Zelenka, "Object-Based
Parallel NFS (pNFS) Operations", RFC 5664, January 2010. Parallel NFS (pNFS) Operations", RFC 5664, January 2010.
8.2. Informative References 8.2. Informative References
[FlexFiles] [FlexFiles]
Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
File Layout", draft-ietf-nfsv4-flex-files-02 (Work In File Layout", draft-ietf-nfsv4-flex-files-11 (Work In
Progress), October 2014. Progress), July 2017.
[Lustre] Faibish, S. and P. Tao, "Parallel NFS (pNFS) Lustre Layout [Lustre] Faibish, S. and P. Tao, "Parallel NFS (pNFS) Lustre Layout
Operations", draft-faibish-nfsv4-pnfs-lustre-layout-07 Operations", draft-faibish-nfsv4-pnfs-lustre-layout-07
(Work In Progress), April 2014. (Work In Progress), April 2014.
Appendix A. Acknowledgments Appendix A. Acknowledgments
Dave Noveck provided an early review that sharpened the clarity of Dave Noveck provided an early review that sharpened the clarity of
the definitions. the definitions. He also provided a more comprehensive review of the
document.
Appendix B. RFC Editor Notes Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this [RFC Editor: please remove this section prior to publishing this
document as an RFC] document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please [RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document] RFC number of this document]
 End of changes. 71 change blocks. 
248 lines changed or deleted 481 lines changed or added
