draft-ietf-nfsv4-flex-files-14.txt   draft-ietf-nfsv4-flex-files-15.txt 
NFSv4 B. Halevy NFSv4 B. Halevy
Internet-Draft Internet-Draft
Intended status: Standards Track T. Haynes Intended status: Standards Track T. Haynes
Expires: March 9, 2018 Primary Data Expires: May 24, 2018 Primary Data
September 05, 2017 November 20, 2017
Parallel NFS (pNFS) Flexible File Layout Parallel NFS (pNFS) Flexible File Layout
draft-ietf-nfsv4-flex-files-14.txt draft-ietf-nfsv4-flex-files-15.txt
Abstract Abstract
The Parallel Network File System (pNFS) allows a separation between The Parallel Network File System (pNFS) allows a separation between
the metadata (onto a metadata server) and data (onto a storage the metadata (onto a metadata server) and data (onto a storage
device) for a file. The flexible file layout type is defined in this device) for a file. The flexible file layout type is defined in this
document as an extension to pNFS which allows the use of storage document as an extension to pNFS which allows the use of storage
devices in a fashion such that they require only a quite limited devices in a fashion such that they require only a quite limited
degree of interaction with the metadata server, using already degree of interaction with the metadata server, using already
existing protocols. Client-side mirroring is also added to provide existing protocols. Client-side mirroring is also added to provide
skipping to change at page 1, line 38 skipping to change at page 1, line 38
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 9, 2018. This Internet-Draft will expire on May 24, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 31 skipping to change at page 2, line 31
2.3.2. Tightly Coupled Locking Model . . . . . . . . . . . . 11 2.3.2. Tightly Coupled Locking Model . . . . . . . . . . . . 11
3. XDR Description of the Flexible File Layout Type . . . . . . 13 3. XDR Description of the Flexible File Layout Type . . . . . . 13
3.1. Code Components Licensing Notice . . . . . . . . . . . . 13 3.1. Code Components Licensing Notice . . . . . . . . . . . . 13
4. Device Addressing and Discovery . . . . . . . . . . . . . . . 15 4. Device Addressing and Discovery . . . . . . . . . . . . . . . 15
4.1. ff_device_addr4 . . . . . . . . . . . . . . . . . . . . . 15 4.1. ff_device_addr4 . . . . . . . . . . . . . . . . . . . . . 15
4.2. Storage Device Multipathing . . . . . . . . . . . . . . . 16 4.2. Storage Device Multipathing . . . . . . . . . . . . . . . 16
5. Flexible File Layout Type . . . . . . . . . . . . . . . . . . 17 5. Flexible File Layout Type . . . . . . . . . . . . . . . . . . 17
5.1. ff_layout4 . . . . . . . . . . . . . . . . . . . . . . . 18 5.1. ff_layout4 . . . . . . . . . . . . . . . . . . . . . . . 18
5.1.1. Error Codes from LAYOUTGET . . . . . . . . . . . . . 22 5.1.1. Error Codes from LAYOUTGET . . . . . . . . . . . . . 22
5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS . . 22 5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS . . 22
5.2. Interactions Between Devices and Layouts . . . . . . . . 22 5.2. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . 22
5.3. Handling Version Errors . . . . . . . . . . . . . . . . . 23 5.3. Interactions Between Devices and Layouts . . . . . . . . 23
5.4. Handling Version Errors . . . . . . . . . . . . . . . . . 23
6. Striping via Sparse Mapping . . . . . . . . . . . . . . . . . 23 6. Striping via Sparse Mapping . . . . . . . . . . . . . . . . . 23
7. Recovering from Client I/O Errors . . . . . . . . . . . . . . 24 7. Recovering from Client I/O Errors . . . . . . . . . . . . . . 24
8. Mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . 24 8. Mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . 25
8.1. Selecting a Mirror . . . . . . . . . . . . . . . . . . . 25 8.1. Selecting a Mirror . . . . . . . . . . . . . . . . . . . 25
8.2. Writing to Mirrors . . . . . . . . . . . . . . . . . . . 26 8.2. Writing to Mirrors . . . . . . . . . . . . . . . . . . . 26
8.2.1. Single Storage Device Updates Mirrors . . . . . . . . 26 8.2.1. Single Storage Device Updates Mirrors . . . . . . . . 26
8.2.2. Single Storage Device Updates Mirrors . . . . . . . . 26 8.2.2. Single Storage Device Updates Mirrors . . . . . . . . 26
8.2.3. Handling Write Errors . . . . . . . . . . . . . . . . 26 8.2.3. Handling Write Errors . . . . . . . . . . . . . . . . 26
8.2.4. Handling Write COMMITs . . . . . . . . . . . . . . . 27 8.2.4. Handling Write COMMITs . . . . . . . . . . . . . . . 27
8.3. Metadata Server Resilvering of the File . . . . . . . . . 27 8.3. Metadata Server Resilvering of the File . . . . . . . . . 28
9. Flexible Files Layout Type Return . . . . . . . . . . . . . . 28 9. Flexible Files Layout Type Return . . . . . . . . . . . . . . 28
9.1. I/O Error Reporting . . . . . . . . . . . . . . . . . . . 29 9.1. I/O Error Reporting . . . . . . . . . . . . . . . . . . . 29
9.1.1. ff_ioerr4 . . . . . . . . . . . . . . . . . . . . . . 29 9.1.1. ff_ioerr4 . . . . . . . . . . . . . . . . . . . . . . 29
9.2. Layout Usage Statistics . . . . . . . . . . . . . . . . . 30 9.2. Layout Usage Statistics . . . . . . . . . . . . . . . . . 30
9.2.1. ff_io_latency4 . . . . . . . . . . . . . . . . . . . 30 9.2.1. ff_io_latency4 . . . . . . . . . . . . . . . . . . . 30
9.2.2. ff_layoutupdate4 . . . . . . . . . . . . . . . . . . 31 9.2.2. ff_layoutupdate4 . . . . . . . . . . . . . . . . . . 31
9.2.3. ff_iostats4 . . . . . . . . . . . . . . . . . . . . . 31 9.2.3. ff_iostats4 . . . . . . . . . . . . . . . . . . . . . 31
9.3. ff_layoutreturn4 . . . . . . . . . . . . . . . . . . . . 32 9.3. ff_layoutreturn4 . . . . . . . . . . . . . . . . . . . . 33
10. Flexible Files Layout Type LAYOUTERROR . . . . . . . . . . . 33 10. Flexible Files Layout Type LAYOUTERROR . . . . . . . . . . . 33
11. Flexible Files Layout Type LAYOUTSTATS . . . . . . . . . . . 33 11. Flexible Files Layout Type LAYOUTSTATS . . . . . . . . . . . 33
12. Flexible File Layout Type Creation Hint . . . . . . . . . . . 33 12. Flexible File Layout Type Creation Hint . . . . . . . . . . . 34
12.1. ff_layouthint4 . . . . . . . . . . . . . . . . . . . . . 34 12.1. ff_layouthint4 . . . . . . . . . . . . . . . . . . . . . 34
13. Recalling a Layout . . . . . . . . . . . . . . . . . . . . . 34 13. Recalling a Layout . . . . . . . . . . . . . . . . . . . . . 35
13.1. CB_RECALL_ANY . . . . . . . . . . . . . . . . . . . . . 35 13.1. CB_RECALL_ANY . . . . . . . . . . . . . . . . . . . . . 35
14. Client Fencing . . . . . . . . . . . . . . . . . . . . . . . 36 14. Client Fencing . . . . . . . . . . . . . . . . . . . . . . . 36
15. Security Considerations . . . . . . . . . . . . . . . . . . . 36 15. Security Considerations . . . . . . . . . . . . . . . . . . . 36
15.1. RPCSEC_GSS and Security Services . . . . . . . . . . . . 37 15.1. RPCSEC_GSS and Security Services . . . . . . . . . . . . 37
15.1.1. Loosely Coupled . . . . . . . . . . . . . . . . . . 37 15.1.1. Loosely Coupled . . . . . . . . . . . . . . . . . . 37
15.1.2. Tightly Coupled . . . . . . . . . . . . . . . . . . 37 15.1.2. Tightly Coupled . . . . . . . . . . . . . . . . . . 38
16. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 16. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38
17. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 17. References . . . . . . . . . . . . . . . . . . . . . . . . . 39
17.1. Normative References . . . . . . . . . . . . . . . . . . 38 17.1. Normative References . . . . . . . . . . . . . . . . . . 39
17.2. Informative References . . . . . . . . . . . . . . . . . 39 17.2. Informative References . . . . . . . . . . . . . . . . . 40
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 39 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 40
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 40 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 41
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41
1. Introduction 1. Introduction
In the parallel Network File System (pNFS), the metadata server In the parallel Network File System (pNFS), the metadata server
returns layout type structures that describe where file data is returns layout type structures that describe where file data is
located. There are different layout types for different storage located. There are different layout types for different storage
systems and methods of arranging data on storage devices. This systems and methods of arranging data on storage devices. This
document defines the flexible file layout type used with file-based document defines the flexible file layout type used with file-based
data servers that are accessed using the Network File System (NFS) data servers that are accessed using the Network File System (NFS)
protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1 [RFC5661], and protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1 [RFC5661], and
NFSv4.2 [RFC7862]. NFSv4.2 [RFC7862].
To provide a global state model equivalent to that of the files To provide a global state model equivalent to that of the files
layout type, a back-end control protocol might be implemented between layout type, a back-end control protocol might be implemented between
the metadata server and NFSv4.1+ storage devices. This document does the metadata server and NFSv4.1+ storage devices. This document does
not provide a standard's track control protocol. An implementation not provide a standard track control protocol. An implementation can
can either define its own mechanism or it could define a control either define its own mechanism or it could define a control protocol
protocol in a standard's track document. The requirements for the a in a standard's track document. The requirements for a control
control protocol are specified in [RFC5661] and clarified in protocol are specified in [RFC5661] and clarified in [pNFSLayouts].
[pNFSLayouts].
1.1. Definitions 1.1. Definitions
control communication requirements: are for a layout type the control communication requirements: are for a layout type the
details regarding information on layouts, stateids, file metadata, details regarding information on layouts, stateids, file metadata,
and file data which must be communicated between the metadata and file data which must be communicated between the metadata
server and the storage devices. server and the storage devices.
control protocol: is the particular mechanism that an implementation control protocol: is the particular mechanism that an implementation
of a layout type would use to meet the control communication of a layout type would use to meet the control communication
skipping to change at page 16, line 15 skipping to change at page 16, line 15
NFSv4 minor version numbers and the client MUST access the storage NFSv4 minor version numbers and the client MUST access the storage
device using NFSv4 with the specified minor version. device using NFSv4 with the specified minor version.
Note that while the client might determine that it cannot use any of Note that while the client might determine that it cannot use any of
the configured combinations of ffdv_version, ffdv_minorversion, and the configured combinations of ffdv_version, ffdv_minorversion, and
ffdv_tightly_coupled, when it gets the device list from the metadata ffdv_tightly_coupled, when it gets the device list from the metadata
server, there is no way to indicate to the metadata server as to server, there is no way to indicate to the metadata server as to
which device it is version incompatible. If however, the client which device it is version incompatible. If however, the client
waits until it retrieves the layout from the metadata server, it can waits until it retrieves the layout from the metadata server, it can
at that time clearly identify the storage device in question (see at that time clearly identify the storage device in question (see
Section 5.3). Section 5.4).
The ffdv_rsize and ffdv_wsize are used to communicate the maximum The ffdv_rsize and ffdv_wsize are used to communicate the maximum
rsize and wsize supported by the storage device. As the storage rsize and wsize supported by the storage device. As the storage
device can have a different rsize or wsize than the metadata server, device can have a different rsize or wsize than the metadata server,
the ffdv_rsize and ffdv_wsize allow the metadata server to the ffdv_rsize and ffdv_wsize allow the metadata server to
communicate that information on behalf of the storage device. communicate that information on behalf of the storage device.
ffdv_tightly_coupled informs the client as to whether the metadata ffdv_tightly_coupled informs the client as to whether the metadata
server is tightly coupled with the storage devices or not. Note that server is tightly coupled with the storage devices or not. Note that
even if the data protocol is at least NFSv4.1, it may still be the even if the data protocol is at least NFSv4.1, it may still be the
skipping to change at page 17, line 5 skipping to change at page 17, line 5
To support storage device multipathing, ffda_netaddrs contains an To support storage device multipathing, ffda_netaddrs contains an
array of one or more storage device network addresses. This array array of one or more storage device network addresses. This array
(data type multipath_list4) represents a list of storage devices (data type multipath_list4) represents a list of storage devices
(each identified by a network address), with the possibility that (each identified by a network address), with the possibility that
some storage device will appear in the list multiple times. some storage device will appear in the list multiple times.
The client is free to use any of the network addresses as a The client is free to use any of the network addresses as a
destination to send storage device requests. If some network destination to send storage device requests. If some network
addresses are less desirable paths to the data than others, then the addresses are less desirable paths to the data than others, then the
MDS SHOULD NOT include those network addresses in ffda_netaddrs. If metadata server SHOULD NOT include those network addresses in
less desirable network addresses exist to provide failover, the ffda_netaddrs. If less desirable network addresses exist to provide
RECOMMENDED method to offer the addresses is to provide them in a failover, the RECOMMENDED method to offer the addresses is to provide
replacement device-ID-to-device-address mapping, or a replacement them in a replacement device-ID-to-device-address mapping, or a
device ID. When a client finds no response from the storage device replacement device ID. When a client finds no response from the
using all addresses available in ffda_netaddrs, it SHOULD send a storage device using all addresses available in ffda_netaddrs, it
GETDEVICEINFO to attempt to replace the existing device-ID-to-device- SHOULD send a GETDEVICEINFO to attempt to replace the existing
address mappings. If the MDS detects that all network paths device-ID-to-device-address mappings. If the metadata server detects
represented by ffda_netaddrs are unavailable, the MDS SHOULD send a that all network paths represented by ffda_netaddrs are unavailable,
CB_NOTIFY_DEVICEID (if the client has indicated it wants device ID the metadata server SHOULD send a CB_NOTIFY_DEVICEID (if the client
notifications for changed device IDs) to change the device-ID-to- has indicated it wants device ID notifications for changed device
device-address mappings to the available addresses. If the device ID IDs) to change the device-ID-to-device-address mappings to the
itself will be replaced, the MDS SHOULD recall all layouts with the available addresses. If the device ID itself will be replaced, the
device ID, and thus force the client to get new layouts and device ID metadata server SHOULD recall all layouts with the device ID, and
mappings via LAYOUTGET and GETDEVICEINFO. thus force the client to get new layouts and device ID mappings via
LAYOUTGET and GETDEVICEINFO.
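The client-side behaviour described in the paragraph above can be sketched roughly as follows. This is an illustration only and is not text from either draft: `send_to_address` and `getdeviceinfo` are hypothetical callables standing in for the client's RPC transport and for the GETDEVICEINFO operation, and the choice to retry exactly once after refreshing the mapping is an assumption.

```python
from typing import Callable, Optional, Sequence


def send_with_failover(
    netaddrs: Sequence[str],
    send_to_address: Callable[[str], Optional[bytes]],
    getdeviceinfo: Callable[[], Sequence[str]],
) -> Optional[bytes]:
    """Try each address in ffda_netaddrs; if none responds, refresh once.

    send_to_address (hypothetical) returns the reply, or None when the
    storage device does not respond on that address.
    getdeviceinfo (hypothetical) returns the replacement address list.
    """
    for attempt in range(2):
        for addr in netaddrs:
            reply = send_to_address(addr)
            if reply is not None:
                return reply
        if attempt == 0:
            # No address responded: ask the metadata server for a new
            # device-ID-to-device-address mapping and try that list once.
            netaddrs = getdeviceinfo()
    return None
```

A real client would also handle CB_NOTIFY_DEVICEID and layout recalls from the metadata server; those paths are left out of this sketch.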
Generally, if two network addresses appear in ffda_netaddrs, they Generally, if two network addresses appear in ffda_netaddrs, they
will designate the same storage device. When the storage device is will designate the same storage device. When the storage device is
accessed over NFSv4.1 or a higher minor version, the two storage accessed over NFSv4.1 or a higher minor version, the two storage
device addresses will support the implementation of client ID or device addresses will support the implementation of client ID or
session trunking (the latter is RECOMMENDED) as defined in [RFC5661]. session trunking (the latter is RECOMMENDED) as defined in [RFC5661].
The two storage device addresses will share the same server owner or The two storage device addresses will share the same server owner or
major ID of the server owner. It is not always necessary for the two major ID of the server owner. It is not always necessary for the two
storage device addresses to designate the same storage device with storage device addresses to designate the same storage device with
trunking being used. For example, the data could be read-only, and trunking being used. For example, the data could be read-only, and
skipping to change at page 20, line 43 skipping to change at page 20, line 43
data file. data file.
ffds_fh_vers is an array of filehandles of the data file matching to ffds_fh_vers is an array of filehandles of the data file matching to
the available NFS versions on the given storage device. There MUST the available NFS versions on the given storage device. There MUST
be exactly as many elements in ffds_fh_vers as there are in be exactly as many elements in ffds_fh_vers as there are in
ffda_versions. Each element of the array corresponds to a particular ffda_versions. Each element of the array corresponds to a particular
combination of ffdv_version, ffdv_minorversion, and combination of ffdv_version, ffdv_minorversion, and
ffdv_tightly_coupled provided for the device. The array allows for ffdv_tightly_coupled provided for the device. The array allows for
server implementations which have different filehandles for different server implementations which have different filehandles for different
combinations of version, minor version, and coupling strength. See combinations of version, minor version, and coupling strength. See
Section 5.3 for how to handle versioning issues between the client Section 5.4 for how to handle versioning issues between the client
and storage devices. and storage devices.
For tight coupling, ffds_stateid provides the stateid to be used by For tight coupling, ffds_stateid provides the stateid to be used by
the client to access the file. For loose coupling and a NFSv4 the client to access the file. For loose coupling and a NFSv4
storage device, the client will have to use an anonymous stateid to storage device, the client will have to use an anonymous stateid to
perform I/O on the storage device. With no control protocol, the perform I/O on the storage device. With no control protocol, the
metadata server stateid can not be used to provide a global stateid metadata server stateid can not be used to provide a global stateid
model. Thus the server MUST set the ffds_stateid to be the anonymous model. Thus the server MUST set the ffds_stateid to be the anonymous
stateid. stateid.
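As a rough illustration of the two paragraphs above, a client might pair each ffda_versions entry with its filehandle and pick the stateid as sketched below. This is not text from either draft. The `DeviceVersion` class and `pick_fh_and_stateid` function are hypothetical names, the positional matching of ffds_fh_vers to ffda_versions is an assumption drawn from the equal-length requirement, and the anonymous stateid is modelled as 16 zero bytes (a zero seqid followed by an all-zero "other" field, per [RFC5661]).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Anonymous stateid: 4-byte seqid of zero plus 12-byte "other" of zeros.
ANONYMOUS_STATEID = bytes(16)


@dataclass(frozen=True)
class DeviceVersion:
    """Hypothetical stand-in for one ffda_versions element."""
    version: int           # ffdv_version
    minorversion: int      # ffdv_minorversion
    tightly_coupled: bool  # ffdv_tightly_coupled


def pick_fh_and_stateid(
    ffda_versions: Sequence[DeviceVersion],
    ffds_fh_vers: Sequence[bytes],
    ffds_stateid: bytes,
    index: int,
) -> Tuple[bytes, bytes]:
    """Return the (filehandle, stateid) pair for the chosen version entry."""
    if len(ffds_fh_vers) != len(ffda_versions):
        raise ValueError("ffds_fh_vers and ffda_versions differ in length")
    chosen = ffda_versions[index]
    filehandle = ffds_fh_vers[index]  # assumed positional correspondence
    if chosen.tightly_coupled:
        return filehandle, ffds_stateid
    # Loose coupling: the client uses the anonymous stateid, which is also
    # what the server is required to place in ffds_stateid in that case.
    return filehandle, ANONYMOUS_STATEID
```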
skipping to change at page 22, line 46 skipping to change at page 22, line 46
should not continue with I/O to the storage devices. should not continue with I/O to the storage devices.
5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS 5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS
Even if the metadata server provides the FF_FLAGS_NO_IO_THRU_MDS Even if the metadata server provides the FF_FLAGS_NO_IO_THRU_MDS
flag, the client can still perform I/O to the metadata server. The flag, the client can still perform I/O to the metadata server. The
flag functions as a hint. The flag indicates to the client that the flag functions as a hint. The flag indicates to the client that the
metadata server prefers to separate the metadata I/O from the data I/ metadata server prefers to separate the metadata I/O from the data I/
O, most likely for performance reasons. O, most likely for performance reasons.
5.2. Interactions Between Devices and Layouts 5.2. LAYOUTCOMMIT
The flex file layout does not use lou_body. If lou_type is
LAYOUT4_FLEX_FILES, the lou_body field MUST have a zero length.
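The zero-length requirement added in the new Section 5.2 amounts to a simple check, sketched below. This is an illustration only: `validate_layoutupdate` is a hypothetical helper, the numeric value used for LAYOUT4_FLEX_FILES is an assumption, and neither draft text shown here says which error a server would return, so the sketch merely raises an exception.

```python
# Assumed value for the flexible file layout type; treat as a placeholder.
LAYOUT4_FLEX_FILES = 0x4


def validate_layoutupdate(lou_type: int, lou_body: bytes) -> None:
    """Reject a flex files layoutupdate4 whose lou_body is not empty."""
    if lou_type == LAYOUT4_FLEX_FILES and len(lou_body) != 0:
        raise ValueError("lou_body must have zero length for flex files")
```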
5.3. Interactions Between Devices and Layouts
In [RFC5661], the file layout type is defined such that the In [RFC5661], the file layout type is defined such that the
relationship between multipathing and filehandles can result in relationship between multipathing and filehandles can result in
either 0, 1, or N filehandles (see Section 13.3). Some rationals for either 0, 1, or N filehandles (see Section 13.3). Some rationales
this are clustered servers which share the same filehandle or for this are clustered servers which share the same filehandle or
allowing for multiple read-only copies of the file on the same allowing for multiple read-only copies of the file on the same
storage device. In the flexible file layout type, while there is an storage device. In the flexible file layout type, while there is an
array of filehandles, they are independent of the multipathing being array of filehandles, they are independent of the multipathing being
used. If the metadata server wants to provide multiple read-only used. If the metadata server wants to provide multiple read-only
copies of the same file on the same storage device, then it should copies of the same file on the same storage device, then it should
provide multiple ff_device_addr4, each as a mirror. The client can provide multiple ff_device_addr4, each as a mirror. The client can
then determine that since the ffds_fh_vers are different, then there then determine that since the ffds_fh_vers are different, then there
are multiple copies of the file for the current layout segment are multiple copies of the file for the current layout segment
available. available.
5.3. Handling Version Errors 5.4. Handling Version Errors
When the metadata server provides the ffda_versions array in the When the metadata server provides the ffda_versions array in the
ff_device_addr4 (see Section 4.1), the client is able to determine if ff_device_addr4 (see Section 4.1), the client is able to determine if
it can not access a storage device with any of the supplied it can not access a storage device with any of the supplied
combinations of ffdv_version, ffdv_minorversion, and combinations of ffdv_version, ffdv_minorversion, and
ffdv_tightly_coupled. However, due to the limitations of reporting ffdv_tightly_coupled. However, due to the limitations of reporting
errors in GETDEVICEINFO (see Section 18.40 in [RFC5661]), the client errors in GETDEVICEINFO (see Section 18.40 in [RFC5661]), the client
is not able to specify which specific device it can not communicate is not able to specify which specific device it can not communicate
with over one of the provided ffdv_version and ffdv_minorversion with over one of the provided ffdv_version and ffdv_minorversion
combinations. Using ff_ioerr4 (see Section 9.1.1) inside either the combinations. Using ff_ioerr4 (see Section 9.1.1) inside either the
skipping to change at page 24, line 29 skipping to change at page 24, line 34
storage devices. However, it is the responsibility of the metadata storage devices. However, it is the responsibility of the metadata
server to recover from the I/O errors. When the LAYOUT4_FLEX_FILES server to recover from the I/O errors. When the LAYOUT4_FLEX_FILES
layout type is used, the client MUST report the I/O errors to the layout type is used, the client MUST report the I/O errors to the
server at LAYOUTRETURN time using the ff_ioerr4 structure (see server at LAYOUTRETURN time using the ff_ioerr4 structure (see
Section 9.1.1). Section 9.1.1).
The metadata server analyzes the error and determines the required The metadata server analyzes the error and determines the required
recovery operations such as recovering media failures or recovery operations such as recovering media failures or
reconstructing missing data files. reconstructing missing data files.
The metadata server SHOULD recall any outstanding layouts to allow it The metadata server MUST recall any outstanding layouts to allow it
exclusive write access to the stripes being recovered and to prevent exclusive write access to the stripes being recovered and to prevent
other clients from hitting the same error condition. In these cases, other clients from hitting the same error condition. In these cases,
the server MUST complete recovery before handing out any new layouts the server MUST complete recovery before handing out any new layouts
to the affected byte ranges. to the affected byte ranges.
Although the client implementation has the option to propagate a Although the client implementation has the option to propagate a
corresponding error to the application that initiated the I/O corresponding error to the application that initiated the I/O
operation and drop any unwritten data, the client should attempt to operation and drop any unwritten data, the client should attempt to
retry the original I/O operation by either requesting a new layout or retry the original I/O operation by either requesting a new layout or
sending the I/O via regular NFSv4.1+ READ or WRITE operations to the sending the I/O via regular NFSv4.1+ READ or WRITE operations to the
skipping to change at page 36, line 9 skipping to change at page 36, line 27
The PNFS_FF_RCA4_TYPE_MASK_READ flag notifies the client to return The PNFS_FF_RCA4_TYPE_MASK_READ flag notifies the client to return
layouts of iomode LAYOUTIOMODE4_READ. Similarly, the layouts of iomode LAYOUTIOMODE4_READ. Similarly, the
PNFS_FF_RCA4_TYPE_MASK_RW flag notifies the client to return layouts PNFS_FF_RCA4_TYPE_MASK_RW flag notifies the client to return layouts
of iomode LAYOUTIOMODE4_RW. When both mask flags are set, the client of iomode LAYOUTIOMODE4_RW. When both mask flags are set, the client
is notified to return layouts of either iomode. is notified to return layouts of either iomode.
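The mask handling described above can be sketched as below. This is an illustration only and not text from either draft: the type mask is simplified to a single integer rather than a bitmap4, `iomodes_to_return` is a hypothetical helper, and the bit positions of the two flags are passed in as parameters because their values are not shown in this excerpt. The iomode constants follow [RFC5661].

```python
from typing import Set

LAYOUTIOMODE4_READ = 1
LAYOUTIOMODE4_RW = 2


def iomodes_to_return(type_mask: int, read_bit: int, rw_bit: int) -> Set[int]:
    """Map the CB_RECALL_ANY type mask to the layout iomodes to return.

    read_bit and rw_bit are the bit positions assumed for
    PNFS_FF_RCA4_TYPE_MASK_READ and PNFS_FF_RCA4_TYPE_MASK_RW.
    """
    modes: Set[int] = set()
    if type_mask & (1 << read_bit):
        modes.add(LAYOUTIOMODE4_READ)
    if type_mask & (1 << rw_bit):
        modes.add(LAYOUTIOMODE4_RW)
    return modes
```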
14. Client Fencing 14. Client Fencing
In cases where clients are uncommunicative and their lease has In cases where clients are uncommunicative and their lease has
expired or when clients fail to return recalled layouts within a expired or when clients fail to return recalled layouts within a
lease period, at the least the server MAY revoke client layouts and lease period, the server MAY revoke client layouts and reassign these
reassign these resources to other clients (see Section 12.5.5 in resources to other clients (see Section 12.5.5 in [RFC5661]). To
[RFC5661]). To avoid data corruption, the metadata server MUST fence avoid data corruption, the metadata server MUST fence off the revoked
off the revoked clients from the respective data files as described clients from the respective data files as described in Section 2.2.
in Section 2.2.
15. Security Considerations 15. Security Considerations
The pNFS extension partitions the NFSv4.1+ file system protocol into The pNFS feature partitions the NFSv4.1+ file system protocol into
two parts, the control path and the data path (storage protocol). two parts, the control path and the data path (storage protocol).
The control path contains all the new operations described by this The control path contains all the new operations described by this
extension; all existing NFSv4 security mechanisms and features apply feature; all existing NFSv4 security mechanisms and features apply to
to the control path (see Sections 1.7.1 and 2.2.1 of [RFC5661]). The the control path (see Sections 1.7.1 and 2.2.1 of [RFC5661]). The
combination of components in a pNFS system is required to preserve combination of components in a pNFS system is required to preserve
the security properties of NFSv4.1+ with respect to an entity the security properties of NFSv4.1+ with respect to an entity
accessing data via a client, including security countermeasures to accessing data via a client, including security countermeasures to
defend against threats that NFSv4.1+ provides defenses for in defend against threats that NFSv4.1+ provides defenses for in
environments where these threats are considered significant. environments where these threats are considered significant.
The metadata server is primarily responsible for securing the data
path. It has to authenticate the client access and provide
appropriate credentials to the client to access data files on the
storage device. Finally, it is responsible for revoking access for a
client to the storage device.
The metadata server enforces the file access-control policy at The metadata server enforces the file access-control policy at
LAYOUTGET time. The client should use RPC authorization credentials LAYOUTGET time. The client should use RPC authorization credentials
for getting the layout for the requested iomode (READ or RW) and the for getting the layout for the requested iomode (READ or RW) and the
server verifies the permissions and ACL for these credentials, server verifies the permissions and ACL for these credentials,
possibly returning NFS4ERR_ACCESS if the client is not allowed the possibly returning NFS4ERR_ACCESS if the client is not allowed the
requested iomode. If the LAYOUTGET operation succeeds the client requested iomode. If the LAYOUTGET operation succeeds the client
receives, as part of the layout, a set of credentials allowing it I/O receives, as part of the layout, a set of credentials allowing it I/O
access to the specified data files corresponding to the requested access to the specified data files corresponding to the requested
iomode. When the client acts on I/O operations on behalf of its iomode. When the client acts on I/O operations on behalf of its
local users, it MUST authenticate and authorize the user by issuing local users, it MUST authenticate and authorize the user by issuing
skipping to change at page 37, line 7 skipping to change at page 37, line 33
for the respective files by implicitly invalidating the previously for the respective files by implicitly invalidating the previously
distributed credential on all data files comprising the file in distributed credential on all data files comprising the file in
question. It is REQUIRED that this be done before committing to the question. It is REQUIRED that this be done before committing to the
new permissions and/or ACL. By requesting new layouts, the clients new permissions and/or ACL. By requesting new layouts, the clients
will reauthorize access against the modified access control metadata. will reauthorize access against the modified access control metadata.
Recalling the layouts in this case is intended to prevent clients Recalling the layouts in this case is intended to prevent clients
from getting an error on I/Os done after the client was fenced off. from getting an error on I/Os done after the client was fenced off.
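The access check at LAYOUTGET time and the fencing step described above can be illustrated with a minimal in-memory sketch. It is not taken from the draft. The FileState class, the integer "credential generation", and the function names are hypothetical stand-ins chosen for illustration. The numeric constants are assumed to match the NFSv4.1 definitions of layoutiomode4 and nfsstat4.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class IoMode(IntEnum):
    READ = 1  # assumed to correspond to LAYOUTIOMODE4_READ
    RW = 2    # assumed to correspond to LAYOUTIOMODE4_RW


NFS4_OK = 0
NFS4ERR_ACCESS = 13


@dataclass
class FileState:
    """Hypothetical metadata-server view of one file."""
    readers: set               # principals permitted READ (stand-in for mode/ACL)
    writers: set               # principals permitted RW
    cred_generation: int = 0   # stand-in for the credential handed out in layouts
    layouts: dict = field(default_factory=dict)  # principal -> credential held


def layoutget(f, principal, iomode):
    """Check the requested iomode, then hand out the current credential."""
    allowed = f.writers if iomode == IoMode.RW else f.readers
    if principal not in allowed:
        return NFS4ERR_ACCESS, None
    f.layouts[principal] = f.cred_generation
    return NFS4_OK, f.cred_generation


def storage_io(f, credential):
    """Storage device side: accept I/O only with the current credential."""
    return credential == f.cred_generation


def change_permissions(f, readers, writers):
    """Fence first, then commit the new access-control metadata."""
    f.cred_generation += 1   # invalidates every credential already distributed
    f.layouts.clear()        # stands in for recalling outstanding layouts
    f.readers, f.writers = set(readers), set(writers)


f = FileState(readers={"alice", "bob"}, writers={"alice"})
denied = layoutget(f, "bob", IoMode.RW)
status, cred = layoutget(f, "bob", IoMode.READ)
io_before = storage_io(f, cred)
change_permissions(f, readers={"alice"}, writers={"alice"})
io_after = storage_io(f, cred)
retry = layoutget(f, "bob", IoMode.READ)
```

In this sketch the stale credential stops working as soon as change_permissions runs, and the client only regains access by passing the LAYOUTGET check again. The sketch clears the layouts right after fencing for brevity. Clearing them stands in for a layout recall, which spares clients an I/O error from the fence.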
15.1. RPCSEC_GSS and Security Services 15.1. RPCSEC_GSS and Security Services
Because of the special use of principals within the loose coupling
model, the issues are different depending on the coupling model.
15.1.1. Loosely Coupled 15.1.1. Loosely Coupled
draft-ietf-nfsv4-flex-files-14.txt:
RPCSEC_GSS version 3 (RPCSEC_GSSv3) [RFC7861] could be used to
authorize the client to the storage device on behalf of the metadata
server. This would require that each of the metadata server, storage
device, and client would have to implement RPCSEC_GSSv3 via an
RPC-application-defined structured privilege assertion in a manner
described in Section 4.9.1 of [RFC7862]. These requirements do not
match the intent of the loosely coupled model that the storage device
need not be modified. (Note that this does not preclude the use of
RPCSEC_GSSv3 in a loosely coupled model.)
draft-ietf-nfsv4-flex-files-15.txt:
RPCSEC_GSS version 3 (RPCSEC_GSSv3) [RFC7861] contains facilities
that would allow it to be used to authorize the client to the storage
device on behalf of the metadata server. Doing so would require that
each of the metadata server, storage device, and client would need to
implement RPCSEC_GSSv3 using an RPC-application-defined structured
privilege assertion in a manner described in Section 4.9.1 of
[RFC7862]. The specifics necessary to do so are not described in
this document. This is principally because any such specification
would require extensive implementation work on a wide range of
storage devices, which would be unlikely to result in a widely usable
specification for a considerable time.
As a result, the layout type described in this document will not
provide support for use of RPCSEC_GSS together with the loosely
coupled model. However, future layout types could be specified which
would allow such support, either through the use of RPCSEC_GSSv3, or
in other ways.
15.1.2. Tightly Coupled 15.1.2. Tightly Coupled
With tight coupling, the principal used to access the metadata file With tight coupling, the principal used to access the metadata file
is exactly the same as used to access the data file. The storage is exactly the same as used to access the data file. The storage
device can use the control protocol to validate any RPC credentials. device can use the control protocol to validate any RPC credentials.
As a result there are no security issues related to using RPCSEC_GSS As a result there are no security issues related to using RPCSEC_GSS
with a tightly coupled system. For example, if Kerberos V5 GSS-API with a tightly coupled system. For example, if Kerberos V5 GSS-API
[RFC4121] is used as the security mechanism, then the storage device [RFC4121] is used as the security mechanism, then the storage device
could use a control protocol to validate the RPC credentials to the could use a control protocol to validate the RPC credentials to the
 End of changes. 24 change blocks. 
63 lines changed or deleted 86 lines changed or added
