draft-ietf-nfsv4-flex-files-15.txt   draft-ietf-nfsv4-flex-files-16.txt 
NFSv4 B. Halevy NFSv4 B. Halevy
Internet-Draft Internet-Draft
Intended status: Standards Track T. Haynes Intended status: Standards Track T. Haynes
Expires: May 24, 2018 Primary Data Expires: July 29, 2018 Primary Data
November 20, 2017 January 25, 2018
Parallel NFS (pNFS) Flexible File Layout Parallel NFS (pNFS) Flexible File Layout
draft-ietf-nfsv4-flex-files-15.txt draft-ietf-nfsv4-flex-files-16.txt
Abstract Abstract
The Parallel Network File System (pNFS) allows a separation between The Parallel Network File System (pNFS) allows a separation between
the metadata (onto a metadata server) and data (onto a storage the metadata (onto a metadata server) and data (onto a storage
device) for a file. The flexible file layout type is defined in this device) for a file. The flexible file layout type is defined in this
document as an extension to pNFS which allows the use of storage document as an extension to pNFS which allows the use of storage
devices in a fashion such that they require only a quite limited devices in a fashion such that they require only a quite limited
degree of interaction with the metadata server, using already degree of interaction with the metadata server, using already
existing protocols. Client-side mirroring is also added to provide existing protocols. Client-side mirroring is also added to provide
skipping to change at page 1, line 38 skipping to change at page 1, line 38
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 24, 2018. This Internet-Draft will expire on July 29, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 4
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 6 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 6
2. Coupling of Storage Devices . . . . . . . . . . . . . . . . . 6 2. Coupling of Storage Devices . . . . . . . . . . . . . . . . . 6
2.1. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . 6 2.1. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . 7
2.2. Fencing Clients from the Storage Device . . . . . . . . . 6 2.2. Fencing Clients from the Storage Device . . . . . . . . . 7
2.2.1. Implementation Notes for Synthetic uids/gids . . . . 8 2.2.1. Implementation Notes for Synthetic uids/gids . . . . 8
2.2.2. Example of using Synthetic uids/gids . . . . . . . . 8 2.2.2. Example of using Synthetic uids/gids . . . . . . . . 9
2.3. State and Locking Models . . . . . . . . . . . . . . . . 9 2.3. State and Locking Models . . . . . . . . . . . . . . . . 10
2.3.1. Loosely Coupled Locking Model . . . . . . . . . . . . 10 2.3.1. Loosely Coupled Locking Model . . . . . . . . . . . . 10
2.3.2. Tightly Coupled Locking Model . . . . . . . . . . . . 11 2.3.2. Tightly Coupled Locking Model . . . . . . . . . . . . 12
3. XDR Description of the Flexible File Layout Type . . . . . . 13 3. XDR Description of the Flexible File Layout Type . . . . . . 13
3.1. Code Components Licensing Notice . . . . . . . . . . . . 13 3.1. Code Components Licensing Notice . . . . . . . . . . . . 14
4. Device Addressing and Discovery . . . . . . . . . . . . . . . 15 4. Device Addressing and Discovery . . . . . . . . . . . . . . . 15
4.1. ff_device_addr4 . . . . . . . . . . . . . . . . . . . . . 15 4.1. ff_device_addr4 . . . . . . . . . . . . . . . . . . . . . 15
4.2. Storage Device Multipathing . . . . . . . . . . . . . . . 16 4.2. Storage Device Multipathing . . . . . . . . . . . . . . . 17
5. Flexible File Layout Type . . . . . . . . . . . . . . . . . . 17 5. Flexible File Layout Type . . . . . . . . . . . . . . . . . . 18
5.1. ff_layout4 . . . . . . . . . . . . . . . . . . . . . . . 18 5.1. ff_layout4 . . . . . . . . . . . . . . . . . . . . . . . 19
5.1.1. Error Codes from LAYOUTGET . . . . . . . . . . . . . 22 5.1.1. Error Codes from LAYOUTGET . . . . . . . . . . . . . 22
5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS . . 22 5.1.2. Client Interactions with FF_FLAGS_NO_IO_THRU_MDS . . 23
5.2. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . 22 5.2. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . 23
5.3. Interactions Between Devices and Layouts . . . . . . . . 23 5.3. Interactions Between Devices and Layouts . . . . . . . . 23
5.4. Handling Version Errors . . . . . . . . . . . . . . . . . 23 5.4. Handling Version Errors . . . . . . . . . . . . . . . . . 23
6. Striping via Sparse Mapping . . . . . . . . . . . . . . . . . 23 6. Striping via Sparse Mapping . . . . . . . . . . . . . . . . . 24
7. Recovering from Client I/O Errors . . . . . . . . . . . . . . 24 7. Recovering from Client I/O Errors . . . . . . . . . . . . . . 24
8. Mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . 25 8. Mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . 25
8.1. Selecting a Mirror . . . . . . . . . . . . . . . . . . . 25 8.1. Selecting a Mirror . . . . . . . . . . . . . . . . . . . 26
8.2. Writing to Mirrors . . . . . . . . . . . . . . . . . . . 26 8.2. Writing to Mirrors . . . . . . . . . . . . . . . . . . . 26
8.2.1. Single Storage Device Updates Mirrors . . . . . . . . 26 8.2.1. Single Storage Device Updates Mirrors . . . . . . . . 26
8.2.2. Single Storage Device Updates Mirrors . . . . . . . . 26 8.2.2. Client Updates All Mirrors . . . . . . . . . . . . . 26
8.2.3. Handling Write Errors . . . . . . . . . . . . . . . . 26 8.2.3. Handling Write Errors . . . . . . . . . . . . . . . . 27
8.2.4. Handling Write COMMITs . . . . . . . . . . . . . . . 27 8.2.4. Handling Write COMMITs . . . . . . . . . . . . . . . 27
8.3. Metadata Server Resilvering of the File . . . . . . . . . 28 8.3. Metadata Server Resilvering of the File . . . . . . . . . 28
9. Flexible Files Layout Type Return . . . . . . . . . . . . . . 28 9. Flexible Files Layout Type Return . . . . . . . . . . . . . . 28
9.1. I/O Error Reporting . . . . . . . . . . . . . . . . . . . 29 9.1. I/O Error Reporting . . . . . . . . . . . . . . . . . . . 29
9.1.1. ff_ioerr4 . . . . . . . . . . . . . . . . . . . . . . 29 9.1.1. ff_ioerr4 . . . . . . . . . . . . . . . . . . . . . . 29
9.2. Layout Usage Statistics . . . . . . . . . . . . . . . . . 30 9.2. Layout Usage Statistics . . . . . . . . . . . . . . . . . 30
9.2.1. ff_io_latency4 . . . . . . . . . . . . . . . . . . . 30 9.2.1. ff_io_latency4 . . . . . . . . . . . . . . . . . . . 30
9.2.2. ff_layoutupdate4 . . . . . . . . . . . . . . . . . . 31 9.2.2. ff_layoutupdate4 . . . . . . . . . . . . . . . . . . 31
9.2.3. ff_iostats4 . . . . . . . . . . . . . . . . . . . . . 31 9.2.3. ff_iostats4 . . . . . . . . . . . . . . . . . . . . . 32
9.3. ff_layoutreturn4 . . . . . . . . . . . . . . . . . . . . 33 9.3. ff_layoutreturn4 . . . . . . . . . . . . . . . . . . . . 33
10. Flexible Files Layout Type LAYOUTERROR . . . . . . . . . . . 33 10. Flexible Files Layout Type LAYOUTERROR . . . . . . . . . . . 34
11. Flexible Files Layout Type LAYOUTSTATS . . . . . . . . . . . 33 11. Flexible Files Layout Type LAYOUTSTATS . . . . . . . . . . . 34
12. Flexible File Layout Type Creation Hint . . . . . . . . . . . 34 12. Flexible File Layout Type Creation Hint . . . . . . . . . . . 34
12.1. ff_layouthint4 . . . . . . . . . . . . . . . . . . . . . 34 12.1. ff_layouthint4 . . . . . . . . . . . . . . . . . . . . . 35
13. Recalling a Layout . . . . . . . . . . . . . . . . . . . . . 35 13. Recalling a Layout . . . . . . . . . . . . . . . . . . . . . 35
13.1. CB_RECALL_ANY . . . . . . . . . . . . . . . . . . . . . 35 13.1. CB_RECALL_ANY . . . . . . . . . . . . . . . . . . . . . 35
14. Client Fencing . . . . . . . . . . . . . . . . . . . . . . . 36 14. Client Fencing . . . . . . . . . . . . . . . . . . . . . . . 36
15. Security Considerations . . . . . . . . . . . . . . . . . . . 36 15. Security Considerations . . . . . . . . . . . . . . . . . . . 37
15.1. RPCSEC_GSS and Security Services . . . . . . . . . . . . 37 15.1. RPCSEC_GSS and Security Services . . . . . . . . . . . . 38
15.1.1. Loosely Coupled . . . . . . . . . . . . . . . . . . 37 15.1.1. Loosely Coupled . . . . . . . . . . . . . . . . . . 38
15.1.2. Tightly Coupled . . . . . . . . . . . . . . . . . . 38 15.1.2. Tightly Coupled . . . . . . . . . . . . . . . . . . 39
16. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38 16. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39
17. References . . . . . . . . . . . . . . . . . . . . . . . . . 39 17. References . . . . . . . . . . . . . . . . . . . . . . . . . 40
17.1. Normative References . . . . . . . . . . . . . . . . . . 39 17.1. Normative References . . . . . . . . . . . . . . . . . . 40
17.2. Informative References . . . . . . . . . . . . . . . . . 40 17.2. Informative References . . . . . . . . . . . . . . . . . 41
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 40 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 41
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 41 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 42
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 42
1. Introduction 1. Introduction
In the parallel Network File System (pNFS), the metadata server In the parallel Network File System (pNFS), the metadata server
returns layout type structures that describe where file data is returns layout type structures that describe where file data is
located. There are different layout types for different storage located. There are different layout types for different storage
systems and methods of arranging data on storage devices. This systems and methods of arranging data on storage devices. This
document defines the flexible file layout type used with file-based document defines the flexible file layout type used with file-based
data servers that are accessed using the Network File System (NFS) data servers that are accessed using the Network File System (NFS)
protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1 [RFC5661], and protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1 [RFC5661], and
NFSv4.2 [RFC7862]. NFSv4.2 [RFC7862].
To provide a global state model equivalent to that of the files To provide a global state model equivalent to that of the files
layout type, a back-end control protocol might be implemented between layout type, a back-end control protocol might be implemented between
the metadata server and NFSv4.1+ storage devices. This document does the metadata server and NFSv4.1+ storage devices. This document does
not provide a standards track control protocol. An implementation can not provide a standards track control protocol. An implementation can
either define its own mechanism or it could define a control protocol either define its own mechanism or it could define a control protocol
in a standards track document. The requirements for a control in a standards track document. The requirements for a control
protocol are specified in [RFC5661] and clarified in [pNFSLayouts]. protocol are specified in [RFC5661] and clarified in [pNFSLayouts].
The control protocol described in this document is based on NFS. The
storage devices are configured such that the metadata server has full
access rights to the data file system and then the metadata server
uses synthetic ids to control client access to individual files.
In traditional mirroring of data, the server is responsible for
replicating, validating, and repairing copies of the data file. With
client-side mirroring, the metadata server provides a layout which
presents the available mirrors to the client. It is then the client
which picks a mirror to read from and ensures that all writes go to
all mirrors. Only if all mirrors are successfully updated, does the
client consider the write transaction to have succeeded. In case of
error, the client can use the LAYOUTERROR operation to inform the
metadata server, which is then responsible for the repairing of the
mirrored copies of the file.
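
The client-side mirroring rule above can be sketched as follows. This is a minimal illustration, not part of the draft: the names write_all_mirrors and MirrorWriteResult are hypothetical, and each mirror is modeled as a plain callable standing in for a WRITE to one storage device. The write counts as successful only if every mirror was updated; the indices of failed mirrors are what a client would then report to the metadata server, for example via LAYOUTERROR.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class MirrorWriteResult:
    """Outcome of one client-side mirrored write (hypothetical type)."""
    succeeded: bool
    failed_mirrors: List[int] = field(default_factory=list)


def write_all_mirrors(
    mirrors: Sequence[Callable[[bytes], bool]], data: bytes
) -> MirrorWriteResult:
    """Send the same data to every mirror.

    Each callable stands in for a WRITE to one storage device and
    returns True on success. The overall write succeeds only if all
    mirrors were updated.
    """
    failed: List[int] = []
    for index, write in enumerate(mirrors):
        try:
            ok = write(data)
        except OSError:
            # Treat a transport or I/O failure like a failed WRITE.
            ok = False
        if not ok:
            failed.append(index)
    return MirrorWriteResult(succeeded=not failed, failed_mirrors=failed)
```

This sketch attempts every mirror before reporting; a client that updates mirrors serially could instead stop at the first failure.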
1.1. Definitions 1.1. Definitions
control communication requirements: are for a layout type the control communication requirements: are for a layout type the
details regarding information on layouts, stateids, file metadata, details regarding information on layouts, stateids, file metadata,
and file data which must be communicated between the metadata and file data which must be communicated between the metadata
server and the storage devices. server and the storage devices.
control protocol: is the particular mechanism that an implementation control protocol: is the particular mechanism that an implementation
of a layout type would use to meet the control communication of a layout type would use to meet the control communication
requirement for that layout type. This need not be a protocol as requirement for that layout type. This need not be a protocol as
skipping to change at page 4, line 23 skipping to change at page 4, line 39
data server (DS): is another term for storage device. data server (DS): is another term for storage device.
fencing: is the process by which the metadata server prevents the fencing: is the process by which the metadata server prevents the
storage devices from processing I/O from a specific client to a storage devices from processing I/O from a specific client to a
specific file. specific file.
file layout type: is a layout type in which the storage devices are file layout type: is a layout type in which the storage devices are
accessed via the NFS protocol (see Section 13 of [RFC5661]). accessed via the NFS protocol (see Section 13 of [RFC5661]).
gid: is the group id, a numeric value which identifies to which
group a file belongs.
layout: is the information a client uses to access file data on a layout: is the information a client uses to access file data on a
storage device. This information will include specification of storage device. This information will include specification of
the protocol (layout type) and the identity of the storage devices the protocol (layout type) and the identity of the storage devices
to be used. to be used.
layout iomode: is a grant of either read or read/write I/O to the layout iomode: is a grant of either read or read/write I/O to the
client. client.
layout segment: is a sub-division of a layout. That sub-division layout segment: is a sub-division of a layout. That sub-division
might be by the layout iomode (see Sections 3.3.20 and 12.2.9 of might be by the layout iomode (see Sections 3.3.20 and 12.2.9 of
skipping to change at page 5, line 50 skipping to change at page 6, line 21
storage protocol: is the protocol used by clients to do I/O storage protocol: is the protocol used by clients to do I/O
operations to the storage device. Each layout type specifies the operations to the storage device. Each layout type specifies the
set of storage protocols. set of storage protocols.
tight coupling: is an arrangement in which the control protocol is tight coupling: is an arrangement in which the control protocol is
one designed specifically for that purpose. It may be either a one designed specifically for that purpose. It may be either a
proprietary protocol, adapted specifically to a particular proprietary protocol, adapted specifically to a particular
metadata server, or one based on a standards-track document. metadata server, or one based on a standards-track document.
uid: is the user id, a numeric value which identifies which user
owns a file.
wsize: is the data transfer buffer size used for writes. wsize: is the data transfer buffer size used for writes.
1.2. Requirements Language 1.2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
2. Coupling of Storage Devices 2. Coupling of Storage Devices
A server implementation may choose either a loose or tight coupling A server implementation may choose either a loose or tight coupling
model between the metadata server and the storage devices. To model between the metadata server and the storage devices.
implement the tight coupling model, a control protocol has to be [pNFSLayouts] describes the general problems facing pNFS
defined. As the flex file layout imposes no special requirements on implementations. This document details how the new Flexible File
the client, the control protocol will need to provide: Layout Type addresses these issues. To implement the tight coupling
model, a control protocol has to be defined. As the flex file layout
imposes no special requirements on the client, the control protocol
will need to provide:
(1) for the management of both security and LAYOUTCOMMITs, and, (1) for the management of both security and LAYOUTCOMMITs, and,
(2) a global stateid model and management of these stateids. (2) a global stateid model and management of these stateids.
When implementing the loose coupling model, the only control protocol When implementing the loose coupling model, the only control protocol
will be a version of NFS, with no ability to provide a global stateid will be a version of NFS, with no ability to provide a global stateid
model or to prevent clients from using layouts inappropriately. To model or to prevent clients from using layouts inappropriately. To
enable client use in that environment, this document will specify how enable client use in that environment, this document will specify how
security, state, and locking are to be managed. security, state, and locking are to be managed.
skipping to change at page 6, line 50 skipping to change at page 7, line 26
about the changes to the file. If any WRITE to a storage device did about the changes to the file. If any WRITE to a storage device did
not result with stable_how equal to FILE_SYNC, a LAYOUTCOMMIT to the not result with stable_how equal to FILE_SYNC, a LAYOUTCOMMIT to the
metadata server MUST be preceded by a COMMIT to the storage devices metadata server MUST be preceded by a COMMIT to the storage devices
written to. Note that if the client has not done a COMMIT to the written to. Note that if the client has not done a COMMIT to the
storage device, then the LAYOUTCOMMIT might not be synchronized to storage device, then the LAYOUTCOMMIT might not be synchronized to
the last WRITE operation to the storage device. the last WRITE operation to the storage device.
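
The COMMIT-before-LAYOUTCOMMIT rule above can be sketched as a small check. This is an illustration under stated assumptions: the function name devices_to_commit and the (device id, stable_how) tuple shape are hypothetical, and the rule is read literally, so that a single WRITE reply below FILE_SYNC means every storage device written to gets a COMMIT. The stable_how values mirror those used by the NFS WRITE operation.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class StableHow(Enum):
    """stable_how values as returned by an NFS WRITE."""
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


def devices_to_commit(
    write_results: Iterable[Tuple[str, StableHow]]
) -> List[str]:
    """Return the storage devices that need a COMMIT before LAYOUTCOMMIT.

    write_results holds one (device_id, stable_how) pair per WRITE reply.
    If every reply was FILE_SYNC, no COMMIT is needed. Otherwise all
    devices written to are returned (a literal reading of the text).
    """
    results = list(write_results)
    if all(stable is StableHow.FILE_SYNC for _, stable in results):
        return []
    return sorted({device for device, _ in results})
```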
2.2. Fencing Clients from the Storage Device 2.2. Fencing Clients from the Storage Device
With loosely coupled storage devices, the metadata server uses With loosely coupled storage devices, the metadata server uses
synthetic uids and gids for the data file, where the uid owner of the synthetic uids (user ids) and gids (group ids) for the data file,
data file is allowed read/write access and the gid owner is allowed where the uid owner of the data file is allowed read/write access and
read only access. As part of the layout (see ffds_user and the gid owner is allowed read only access. As part of the layout
ffds_group in Section 5.1), the client is provided with the user and (see ffds_user and ffds_group in Section 5.1), the client is provided
group to be used in the Remote Procedure Call (RPC) [RFC5531] with the user and group to be used in the Remote Procedure Call (RPC)
credentials needed to access the data file. Fencing off of clients [RFC5531] credentials needed to access the data file. Fencing off of
is achieved by the metadata server changing the synthetic uid and/or clients is achieved by the metadata server changing the synthetic uid
gid owners of the data file on the storage device to implicitly and/or gid owners of the data file on the storage device to
revoke the outstanding RPC credentials. A client presenting the implicitly revoke the outstanding RPC credentials. A client
wrong credential for the desired access will get a NFS4ERR_ACCESS presenting the wrong credential for the desired access will get a
error. NFS4ERR_ACCESS error.
With this loosely coupled model, the metadata server is not able to With this loosely coupled model, the metadata server is not able to
fence off a single client, it is forced to fence off all clients. fence off a single client, it is forced to fence off all clients.
However, as the other clients react to the fencing, returning their However, as the other clients react to the fencing, returning their
layouts and trying to get new ones, the metadata server can hand out layouts and trying to get new ones, the metadata server can hand out
a new uid and gid to allow access. a new uid and gid to allow access.
Note: it is recommended to implement common access control methods at It is RECOMMENDED to implement common access control methods at the
the storage device filesystem to allow only the metadata server root storage device filesystem to allow only the metadata server root
(super user) access to the storage device, and to set the owner of (super user) access to the storage device, and to set the owner of
all directories holding data files to the root user. This approach all directories holding data files to the root user. This approach
provides a practical model to enforce access control and fence off provides a practical model to enforce access control and fence off
cooperative clients, but it can not protect against malicious cooperative clients, but it can not protect against malicious
clients; hence it provides a level of security equivalent to clients; hence it provides a level of security equivalent to
AUTH_SYS. AUTH_SYS. It is RECOMMENDED that the communication between the
metadata server and storage device be secure from eavesdroppers and
man-in-the-middle protocol tampering. The security measure could be
due to physical security (e.g., the servers are co-located in a
physically secure area), from encrypted communications, or some other
technique.
With tightly coupled storage devices, the metadata server sets the With tightly coupled storage devices, the metadata server sets the
user and group owners, mode bits, and ACL of the data file to be the user and group owners, mode bits, and ACL of the data file to be the
same as the metadata file. And the client must authenticate with the same as the metadata file. And the client must authenticate with the
storage device and go through the same authorization process it would storage device and go through the same authorization process it would
go through via the metadata server. In the case of tight coupling, go through via the metadata server. In the case of tight coupling,
fencing is the responsibility of the control protocol and is not fencing is the responsibility of the control protocol and is not
described in detail here. However, implementations of the tight described in detail here. However, implementations of the tight
coupling locking model (see Section 2.3), will need a way to prevent coupling locking model (see Section 2.3), will need a way to prevent
access by certain clients to specific files by invalidating the access by certain clients to specific files by invalidating the
skipping to change at page 8, line 45 skipping to change at page 9, line 21
2.2.2. Example of using Synthetic uids/gids 2.2.2. Example of using Synthetic uids/gids
The user loghyr creates a file "ompha.c" on the metadata server and The user loghyr creates a file "ompha.c" on the metadata server and
it creates a corresponding data file on the storage device. it creates a corresponding data file on the storage device.
The metadata server entry may look like: The metadata server entry may look like:
-rw-r--r-- 1 loghyr staff 1697 Dec 4 11:31 ompha.c -rw-r--r-- 1 loghyr staff 1697 Dec 4 11:31 ompha.c
On the storage device, it may be assigned some random synthetic uid/ On the storage device, it may be assigned some unpredictable
gid to deny access: synthetic uid/gid to deny access:
-rw-r----- 1 19452 28418 1697 Dec 4 11:31 data_ompha.c -rw-r----- 1 19452 28418 1697 Dec 4 11:31 data_ompha.c
When the file is opened on a client, since the layout knows nothing When the file is opened on a client and accessed, it will try to get
about the user (and does not care), whether loghyr or garbo opens the a layout for the data file. Since the layout knows nothing about the
file does not matter. The owner and group are modified and those user (and does not care), whether the user loghyr or garbo opens the
values are returned. file does not matter. The client has to present a uid of 19452 to
get write permission. If it presents any other value for the uid,
then it must give a gid of 28418 to get read access.
-rw-r----- 1 1066 1067 1697 Dec 4 11:31 data_ompha.c Further, if the metadata server decides to fence the file, it should
change the uid and/or gid such that these values neither match
earlier values for that file nor match a predictable change based on
an earlier fencing.
-rw-r----- 1 19453 28419 1697 Dec 4 11:31 data_ompha.c
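
The access check and fencing step in this example can be modeled as below. This is a simplified sketch, not the storage device's actual permission logic: the names access_granted and fence_ids are hypothetical, the data file is assumed to have mode 0640 as in the listing, supplementary groups are ignored, and the choice of the secrets module and of a 31-bit id space are assumptions made only to illustrate "unpredictable" replacement ids.

```python
import secrets
from typing import Optional, Set, Tuple


def access_granted(
    file_uid: int, file_gid: int, cred_uid: int, cred_gid: int
) -> Optional[str]:
    """Access a credential gets on a mode 0640 data file.

    Returns "rw" if the uid matches the owner, "r" if only the gid
    matches the group, and None otherwise.
    """
    if cred_uid == file_uid:
        return "rw"
    if cred_gid == file_gid:
        return "r"
    return None


def fence_ids(
    used_uids: Set[int], used_gids: Set[int], id_space: int = 2**31
) -> Tuple[int, int]:
    """Pick a new synthetic uid and gid for a fenced data file.

    The new values are drawn at random, are never 0 (root), and never
    repeat a value recorded in used_uids or used_gids.
    """
    while True:
        uid = secrets.randbelow(id_space - 1) + 1
        gid = secrets.randbelow(id_space - 1) + 1
        if uid not in used_uids and gid not in used_gids:
            used_uids.add(uid)
            used_gids.add(gid)
            return uid, gid
```

With the values from the listing, access_granted(19452, 28418, 19452, 0) yields "rw", and a credential carrying only gid 28418 yields "r". After fence_ids replaces the ids, the old credential no longer matches either owner.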
The set of synthetic gids on the storage device should be selected The set of synthetic gids on the storage device should be selected
such that there is no mapping in any of the name services used by the such that there is no mapping in any of the name services used by the
storage device. I.e., each group should have no members. storage device. I.e., each group should have no members.
If the layout segment has an iomode of LAYOUTIOMODE4_READ, then the If the layout segment has an iomode of LAYOUTIOMODE4_READ, then the
metadata server should return a synthetic uid that is not set on the metadata server should return a synthetic uid that is not set on the
storage device. Only the synthetic gid would be valid. storage device. Only the synthetic gid would be valid.
The client is thus solely responsible for enforcing file permissions The client is thus solely responsible for enforcing file permissions
skipping to change at page 26, line 28 skipping to change at page 26, line 43
this case, the storage device MUST ensure that all copies of the this case, the storage device MUST ensure that all copies of the
mirror are updated when any one of the mirrors is updated. If the mirror are updated when any one of the mirrors is updated. If the
storage device gets an error when updating one of the mirrors, then storage device gets an error when updating one of the mirrors, then
it MUST inform the client that the original WRITE had an error. The it MUST inform the client that the original WRITE had an error. The
client then MUST inform the metadata server (see Section 8.2.3). The client then MUST inform the metadata server (see Section 8.2.3). The
client's responsibility with respect to COMMIT is explained in client's responsibility with respect to COMMIT is explained in
Section 8.2.4. The client may choose any one of the mirrors and may Section 8.2.4. The client may choose any one of the mirrors and may
use ffds_efficiency in the same manner as for reading when making use ffds_efficiency in the same manner as for reading when making
this choice. this choice.
8.2.2. Single Storage Device Updates Mirrors 8.2.2. Client Updates All Mirrors
If the FF_FLAGS_WRITE_ONE_MIRROR flag in ffl_flags is not set, the If the FF_FLAGS_WRITE_ONE_MIRROR flag in ffl_flags is not set, the
client is responsible for updating all mirrored copies of the layout client is responsible for updating all mirrored copies of the layout
segments that it is given in the layout. A single failed update is segments that it is given in the layout. A single failed update is
sufficient to fail the entire operation. If all but one copy is sufficient to fail the entire operation. If all but one copy is
updated successfully and the last one provides an error, then the updated successfully and the last one provides an error, then the
client needs to inform the metadata server about the error via either client needs to inform the metadata server about the error via either
LAYOUTRETURN or LAYOUTERROR that the update failed to that storage LAYOUTRETURN or LAYOUTERROR that the update failed to that storage
device. If the client is updating the mirrors serially, then it device. If the client is updating the mirrors serially, then it
SHOULD stop at the first error encountered and report that to the SHOULD stop at the first error encountered and report that to the
skipping to change at page 37, line 18 skipping to change at page 37, line 39
server verifies the permissions and ACL for these credentials, server verifies the permissions and ACL for these credentials,
possibly returning NFS4ERR_ACCESS if the client is not allowed the possibly returning NFS4ERR_ACCESS if the client is not allowed the
requested iomode. If the LAYOUTGET operation succeeds the client requested iomode. If the LAYOUTGET operation succeeds the client
receives, as part of the layout, a set of credentials allowing it I/O receives, as part of the layout, a set of credentials allowing it I/O
access to the specified data files corresponding to the requested access to the specified data files corresponding to the requested
iomode. When the client acts on I/O operations on behalf of its iomode. When the client acts on I/O operations on behalf of its
local users, it MUST authenticate and authorize the user by issuing local users, it MUST authenticate and authorize the user by issuing
respective OPEN and ACCESS calls to the metadata server, similar to respective OPEN and ACCESS calls to the metadata server, similar to
having NFSv4 data delegations. having NFSv4 data delegations.
The combination of file handle, synthetic uid, and gid in the layout
are the way that the metadata server enforces access control to the
data server. The directory namespace on the storage device SHOULD
only be accessible to the metadata server and not the clients. In
that case, the client only has access to file handles of file objects
and not directory objects. Thus, given a file handle in a layout, it
is not possible to guess the parent directory file handle. Further,
as the data file permissions only allow the given synthetic uid read/
write permission and the given synthetic gid read permission, knowing
the synthetic ids of one file does not necessarily allow access to
any other data file on the storage device.
The metadata server can also deny access at any time by fencing the
data file, which means changing the synthetic ids. In turn, that
forces the client to return its current layout and get a new layout
if it wants to continue IO to the data file.
If the configuration of the storage device is such that clients can
access the directory namespace, then the access control degrades to
that of a typical NFS server with exports with a security flavor of
AUTH_SYS. Any client which is allowed access can forge credentials
to access any data file. The caveat is that the rogue client might
have no knowledge of the data file's type or position in the metadata
directory namespace.
If access is allowed, the client uses the corresponding (READ or RW) If access is allowed, the client uses the corresponding (READ or RW)
credentials to perform the I/O operations at the data file's storage credentials to perform the I/O operations at the data file's storage
devices. When the metadata server receives a request to change a devices. When the metadata server receives a request to change a
file's permissions or ACL, it SHOULD recall all layouts for that file file's permissions or ACL, it SHOULD recall all layouts for that file
and then MUST fence off any clients still holding outstanding layouts and then MUST fence off any clients still holding outstanding layouts
for the respective files by implicitly invalidating the previously for the respective files by implicitly invalidating the previously
distributed credential on all data files comprising the file in distributed credential on all data files comprising the file in
question. It is REQUIRED that this be done before committing to the question. It is REQUIRED that this be done before committing to the
new permissions and/or ACL. By requesting new layouts, the clients new permissions and/or ACL. By requesting new layouts, the clients
will reauthorize access against the modified access control metadata. will reauthorize access against the modified access control metadata.
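
The ordering constraint in the paragraph above can be sketched as follows. This is only an illustration of the sequence, assuming a hypothetical function change_permissions whose four callables stand in for metadata server internals that the draft does not define: layouts are recalled first, any client still holding a layout is fenced, and only then are the new permissions or ACL committed.

```python
from typing import Callable, List


def change_permissions(
    recall_layouts: Callable[[], None],
    has_outstanding_layouts: Callable[[], bool],
    fence_data_files: Callable[[], None],
    commit_new_permissions: Callable[[], None],
) -> List[str]:
    """Apply a permission or ACL change in the required order.

    Returns the list of steps taken, in order, so the sequence can be
    inspected.
    """
    steps: List[str] = []
    recall_layouts()  # SHOULD: recall all layouts for the file
    steps.append("recall")
    if has_outstanding_layouts():
        # MUST: invalidate the previously distributed credentials on
        # every data file comprising the file.
        fence_data_files()
        steps.append("fence")
    # REQUIRED: commit only after recall and fencing are done.
    commit_new_permissions()
    steps.append("commit")
    return steps
```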
 End of changes. 30 change blocks. 
62 lines changed or deleted 124 lines changed or added
