draft-ietf-nfsv4-pnfs-obj-03.txt   draft-ietf-nfsv4-pnfs-obj-04.txt 
NFSv4 B. Halevy NFSv4 B. Halevy
Internet-Draft B. Welch Internet-Draft B. Welch
Intended status: Standards Track J. Zelenka Intended status: Standards Track J. Zelenka
Expires: September 6, 2007 Panasas Expires: March 8, 2008 Panasas
March 5, 2007 September 5, 2007
Object-based pNFS Operations Object-based pNFS Operations
draft-ietf-nfsv4-pnfs-obj-03.txt draft-ietf-nfsv4-pnfs-obj-04
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 6, 2007. This Internet-Draft will expire on March 8, 2008.
Copyright Notice Copyright Notice
Copyright (C) The IETF Trust (2007). Copyright (C) The IETF Trust (2007).
Abstract Abstract
This Internet-Draft provides a description of the object-based pNFS This Internet-Draft provides a description of the object-based pNFS
extension for NFSv4. This is a companion to the main pnfs extension for NFSv4. This is a companion to the main pnfs
specification in the NFSv4 Minor Version 1 Internet Draft, which is specification in the NFSv4 Minor Version 1 Internet Draft, which is
currently draft-ietf-nfsv4-minorversion1-10.txt. currently draft-ietf-nfsv4-minorversion1-13.txt.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1]. document are to be interpreted as described in RFC 2119 [1].
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Object Storage Device Addressing and Discovery . . . . . . . . 4 2. Object Storage Device Addressing and Discovery . . . . . . . . 4
2.1. pnfs_osd_addr_type4 . . . . . . . . . . . . . . . . . . . 5 2.1. pnfs_osd_addr_type4 . . . . . . . . . . . . . . . . . . . 5
2.2. pnfs_osd_deviceaddr4 . . . . . . . . . . . . . . . . . . . 5 2.2. pnfs_osd_deviceaddr4 . . . . . . . . . . . . . . . . . . . 6
3. Object-Based Layout . . . . . . . . . . . . . . . . . . . . . 5 3. Object-Based Layout . . . . . . . . . . . . . . . . . . . . . 6
3.1. pnfs_osd_layout4 . . . . . . . . . . . . . . . . . . . . . 6 3.1. pnfs_osd_layout4 . . . . . . . . . . . . . . . . . . . . . 7
3.1.1. pnfs_osd_objid4 . . . . . . . . . . . . . . . . . . . 7 3.1.1. pnfs_osd_objid4 . . . . . . . . . . . . . . . . . . . 7
3.1.2. pnfs_osd_version4 . . . . . . . . . . . . . . . . . . 8
3.1.3. pnfs_osd_object_cred4 . . . . . . . . . . . . . . . . 8 3.1.3. pnfs_osd_object_cred4 . . . . . . . . . . . . . . . . 8
3.1.4. pnfs_osd_raid_algorithm4 . . . . . . . . . . . . . . . 8 3.1.4. pnfs_osd_raid_algorithm4 . . . . . . . . . . . . . . . 10
3.1.5. pnfs_osd_data_map4 . . . . . . . . . . . . . . . . . . 8 3.1.5. pnfs_osd_data_map4 . . . . . . . . . . . . . . . . . . 10
3.2. Data Mapping Schemes . . . . . . . . . . . . . . . . . . . 9 3.2. Data Mapping Schemes . . . . . . . . . . . . . . . . . . . 11
3.2.1. Simple Striping . . . . . . . . . . . . . . . . . . . 9 3.2.1. Simple Striping . . . . . . . . . . . . . . . . . . . 11
3.2.2. Nested Striping . . . . . . . . . . . . . . . . . . . 10 3.2.2. Nested Striping . . . . . . . . . . . . . . . . . . . 12
3.2.3. Mirroring . . . . . . . . . . . . . . . . . . . . . . 12 3.2.3. Mirroring . . . . . . . . . . . . . . . . . . . . . . 13
3.3. RAID Algorithms . . . . . . . . . . . . . . . . . . . . . 13 3.3. RAID Algorithms . . . . . . . . . . . . . . . . . . . . . 14
3.3.1. PNFS_OSD_RAID_0 . . . . . . . . . . . . . . . . . . . 13 3.3.1. PNFS_OSD_RAID_0 . . . . . . . . . . . . . . . . . . . 14
3.3.2. PNFS_OSD_RAID_4 . . . . . . . . . . . . . . . . . . . 13 3.3.2. PNFS_OSD_RAID_4 . . . . . . . . . . . . . . . . . . . 14
3.3.3. PNFS_OSD_RAID_5 . . . . . . . . . . . . . . . . . . . 13 3.3.3. PNFS_OSD_RAID_5 . . . . . . . . . . . . . . . . . . . 15
3.3.4. PNFS_OSD_RAID_PQ . . . . . . . . . . . . . . . . . . . 14 3.3.4. PNFS_OSD_RAID_PQ . . . . . . . . . . . . . . . . . . . 15
3.3.5. RAID Usage and implementation notes . . . . . . . . . 14 3.3.5. RAID Usage and implementation notes . . . . . . . . . 16
4. Object-Based Layout Update . . . . . . . . . . . . . . . . . . 15 4. Object-Based Layout Update . . . . . . . . . . . . . . . . . . 16
4.1. pnfs_osd_layoutupdate4 . . . . . . . . . . . . . . . . . . 15 4.1. pnfs_osd_layoutupdate4 . . . . . . . . . . . . . . . . . . 16
4.1.1. pnfs_osd_deltaspaceused4 . . . . . . . . . . . . . . . 15 4.1.1. pnfs_osd_deltaspaceused4 . . . . . . . . . . . . . . . 17
4.1.2. pnfs_osd_errno4 . . . . . . . . . . . . . . . . . . . 16 4.1.2. pnfs_osd_errno4 . . . . . . . . . . . . . . . . . . . 17
4.1.3. pnfs_osd_ioerr4 . . . . . . . . . . . . . . . . . . . 17 4.1.3. pnfs_osd_ioerr4 . . . . . . . . . . . . . . . . . . . 18
5. Object-Based Creation Layout Hint . . . . . . . . . . . . . . 17 5. Object-Based Creation Layout Hint . . . . . . . . . . . . . . 19
5.1. pnfs_osd_layouthint4 . . . . . . . . . . . . . . . . . . . 17 5.1. pnfs_osd_layouthint4 . . . . . . . . . . . . . . . . . . . 19
6. Layout Segments . . . . . . . . . . . . . . . . . . . . . . . 19 6. Layout Segments . . . . . . . . . . . . . . . . . . . . . . . 20
6.1. CB_LAYOUTRECALL and LAYOUTRETURN . . . . . . . . . . . . . 19 6.1. CB_LAYOUTRECALL and LAYOUTRETURN . . . . . . . . . . . . . 20
6.2. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . . 19 6.2. LAYOUTCOMMIT . . . . . . . . . . . . . . . . . . . . . . . 21
7. Recalling Layouts . . . . . . . . . . . . . . . . . . . . . . 20 7. Recalling Layouts . . . . . . . . . . . . . . . . . . . . . . 21
8. Security Considerations . . . . . . . . . . . . . . . . . . . 20 7.1. CB_RECALL_ANY . . . . . . . . . . . . . . . . . . . . . . 22
8.1. OSD Security Data Types . . . . . . . . . . . . . . . . . 20 8. Client Fencing . . . . . . . . . . . . . . . . . . . . . . . . 22
8.2. The OSD Security Protocol . . . . . . . . . . . . . . . . 21 9. Security Considerations . . . . . . . . . . . . . . . . . . . 23
8.3. Revoking capabilities . . . . . . . . . . . . . . . . . . 22 9.1. OSD Security Data Types . . . . . . . . . . . . . . . . . 24
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 9.2. The OSD Security Protocol . . . . . . . . . . . . . . . . 24
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23 9.3. Protocol Privacy Requirements . . . . . . . . . . . . . . 25
10.1. Normative References . . . . . . . . . . . . . . . . . . . 23 9.4. Revoking Capabilities . . . . . . . . . . . . . . . . . . 26
10.2. Informative References . . . . . . . . . . . . . . . . . . 24 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . . 24 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24 11.1. Normative References . . . . . . . . . . . . . . . . . . . 27
Intellectual Property and Copyright Statements . . . . . . . . . . 26 11.2. Informative References . . . . . . . . . . . . . . . . . . 27
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . . 27
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28
Intellectual Property and Copyright Statements . . . . . . . . . . 29
1. Introduction 1. Introduction
In pNFS, the file server returns typed layout structures that In pNFS, the file server returns typed layout structures that
describe where file data is located. There are different layouts for describe where file data is located. There are different layouts for
different storage systems and methods of arranging data on storage different storage systems and methods of arranging data on storage
devices. This document describes the layouts used with object-based devices. This document describes the layouts used with object-based
storage devices (OSD) that are accessed according to the iSCSI/OSD storage devices (OSD) that are accessed according to the iSCSI/OSD
storage protocol standard (SNIA T10/1355-D [2]). storage protocol standard (SNIA T10/1355-D [2]).
An "object" is a container for data and attributes, and files are An "object" is a container for data and attributes, and files are
stored in one or more objects. The OSD protocol specifies several stored in one or more objects. The OSD protocol specifies several
operations on objects, including READ, WRITE, FLUSH, GETATTR, operations on objects, including READ, WRITE, FLUSH, GET ATTRIBUTES,
SETATTR, CREATE and DELETE. However, in this proposal the client SET ATTRIBUTES, CREATE and DELETE. However, using the object-based
only uses the READ, WRITE, GETATTR and FLUSH commands. The other layout the client only uses the READ, WRITE, GET ATTRIBUTES and FLUSH
commands are only used by the pNFS server. commands. The other commands are only used by the pNFS server.
An object-based layout for pNFS includes object identifiers, An object-based layout for pNFS includes object identifiers,
capabilities that allow clients to READ or WRITE those objects, and capabilities that allow clients to READ or WRITE those objects, and
various parameters that control how file data is striped across their various parameters that control how file data is striped across their
component objects. The OSD protocol has a capability-based security component objects. The OSD protocol has a capability-based security
scheme that allows the pNFS server to control what operations and scheme that allows the pNFS server to control what operations and
what objects are used by clients. This scheme is described in more what objects can be used by clients. This scheme is described in
detail in the "Security Considerations" section (Section 8). more detail in the Security Considerations section (Section 9).
2. Object Storage Device Addressing and Discovery 2. Object Storage Device Addressing and Discovery
Data operations to an OSD require the client to know the "address" of Data operations to an OSD require the client to know the "address" of
each OSD's root object. The root object is synonymous with SCSI each OSD's root object. The root object is synonymous with SCSI
logical unit. The client specifies SCSI logical units to its SCSI logical unit. The client specifies SCSI logical units to its SCSI
stack using a representation local to the client. Because these stack using a representation local to the client. Because these
representations are local, GETDEVICEINFO must return information that representations are local, GETDEVICEINFO must return information that
can be used by the client to select the correct local representation. can be used by the client to select the correct local representation.
In the block world, a set offset (logical block number or track/ In the block world, a set offset (logical block number or track/
sector) contains a disk label. This label identifies the disk sector) contains a disk label. This label identifies the disk
uniquely. In contrast, an OSD has a standard set of attributes on uniquely. In contrast, an OSD has a standard set of attributes on
its root object. For device identification purposes, the OSD name its root object. For device identification purposes the OSD System
(root information attribute number 9) will be used as the label. ID (root information attribute number 3) and/or OSD Name (root
This appears in the pnfs_osd_deviceaddr4 type below under the information attribute number 9) are used as the label. These appear
"root_id" field. in the pnfs_osd_deviceaddr4 type below under the "systemid" and
"osdname" fields.
In some situations, SCSI target discovery may need to be driven based In some situations, SCSI target discovery may need to be driven based
on information contained in the GETDEVICEINFO response. One example on information contained in the GETDEVICEINFO response. One example
of this is iSCSI targets that are not known to the client until a of this is iSCSI targets that are not known to the client until a
layout has been requested. Eventually iSCSI will adopt ANSI T10 layout has been requested. Eventually iSCSI will adopt ANSI T10
SAM-3, at which time the World Wide Name (WWN aka, EUI-64/EUI-128) SAM-3, at which time the World Wide Name (WWN aka, EUI-64/EUI-128)
naming conventions can be specified. In addition, Fibre Channel (FC) naming conventions can be specified. In addition, Fibre Channel (FC)
SCSI targets have a unique WWN. Although these FC targets have SCSI targets have a unique WWN. Although these FC targets have
already been discovered, some implementations may want to specify the already been discovered, some implementations may want to specify the
WWN in addition to the label. This information appears as the WWN in addition to the label. This information appears as the
"target" and "lun" fields in the pnfs_osd_deviceaddr4 type described "target" and "lun" fields in the pnfs_osd_deviceaddr4 type described
below. below.
The systemid is used by the client, along with the object credential,
to sign each request with the request integrity check value. This
method protects the client from unintentionally accessing a device if
the device address mapping was changed (or revoked). The server
computes the capability_key using its own view of the systemid
associated with the respective deviceid present in the credential.
If the client's view of the deviceid mapping is stale, the client
will use the wrong systemid (which must be system-wide unique) and
the I/O request to the OSD will fail to pass the integrity check
verification.
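The effect of a stale device mapping can be illustrated with a small, simplified model. The sketch below is not the OSD security protocol: the key derivation (HMAC-SHA1 over the capability concatenated with the systemid), the shared secret, and all function names are assumptions chosen only to show why a request signed for one systemid fails verification at a device with a different one.

```python
import hashlib
import hmac


def capability_key(device_secret: bytes, capability: bytes, systemid: bytes) -> bytes:
    """Toy derivation (assumption): bind the capability to one device's systemid."""
    return hmac.new(device_secret, capability + systemid, hashlib.sha1).digest()


def sign_request(cap_key: bytes, request: bytes) -> bytes:
    """Toy request integrity check value computed by the client."""
    return hmac.new(cap_key, request, hashlib.sha1).digest()


def osd_accepts(device_secret: bytes, own_systemid: bytes, capability: bytes,
                request: bytes, check_value: bytes) -> bool:
    """The device recomputes the key from its own systemid and verifies the request."""
    key = capability_key(device_secret, capability, own_systemid)
    return hmac.compare_digest(sign_request(key, request), check_value)


# Hypothetical values for illustration only.
secret = b"shared-device-secret"
cap = b"capability-bytes"
request = b"READ object=0x10 offset=0 length=4096"

# The server derives the key using its view of the systemid ("osd-A").
icv = sign_request(capability_key(secret, cap, b"osd-A"), request)

fresh_ok = osd_accepts(secret, b"osd-A", cap, request, icv)  # mapping is current
stale_ok = osd_accepts(secret, b"osd-B", cap, request, icv)  # mapping is stale
```

In this model the request sent through a stale mapping reaches a device whose systemid differs from the one the server used, so the recomputed key differs and the check fails.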
To recover from this condition the client should report the error via
LAYOUTCOMMIT, return the layout using LAYOUTRETURN, and invalidate
all the device address mappings associated with this layout. The
client can then ask for a new layout if it wishes using LAYOUTGET and
resolve the referenced deviceids using GETDEVICEINFO or
GETDEVICELIST.
The server MUST provide either the systemid, the OSD name, or both.
When the OSD name is present the client SHOULD get the root
information attributes whenever it establishes communication with the
OSD and verify that the OSD name it got from the OSD matches the one
sent by the metadata server. If the systemid was not given by the
server it MUST be taken from the OSD-provided attribute; note that in
this case the OSD GET ATTRIBUTES operation must be performed with the
NOSEC security method.
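The rules in the preceding paragraph can be sketched as a client-side helper. This is a minimal illustration under stated assumptions: an empty opaque value is treated as "not provided", a name mismatch is treated as a hard error (the text only says the client SHOULD verify), and read_root_attr is a hypothetical callback standing in for an OSD GET ATTRIBUTES on the root object. The attribute numbers 3 and 9 are the ones given in Section 2.

```python
from typing import Callable

ROOT_ATTR_OSD_SYSTEM_ID = 3  # root information attribute number 3
ROOT_ATTR_OSD_NAME = 9       # root information attribute number 9


def resolve_systemid(systemid: bytes, osdname: bytes,
                     read_root_attr: Callable[[int], bytes]) -> bytes:
    """Return the systemid to use for a device described by the metadata server."""
    # The server MUST provide the systemid, the OSD name, or both.
    if not systemid and not osdname:
        raise ValueError("device address carries neither systemid nor osdname")
    # When the OSD name is present, compare it with the name reported by the OSD.
    if osdname and read_root_attr(ROOT_ATTR_OSD_NAME) != osdname:
        raise ValueError("OSD name does not match the metadata server's value")
    # If the server gave no systemid, take it from the OSD-provided attribute.
    if systemid:
        return systemid
    return read_root_attr(ROOT_ATTR_OSD_SYSTEM_ID)


# Hypothetical root attributes of one device, for illustration only.
attrs = {ROOT_ATTR_OSD_SYSTEM_ID: b"SYS-1", ROOT_ATTR_OSD_NAME: b"osd-east"}
from_osd = resolve_systemid(b"", b"osd-east", attrs.__getitem__)
from_server = resolve_systemid(b"SYS-1", b"", attrs.__getitem__)
```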
2.1. pnfs_osd_addr_type4 2.1. pnfs_osd_addr_type4
The following enum specifies the manner in which a scsi target can be The following enum specifies the manner in which a scsi target can be
specified. The target can be specified as a network address, as an specified. The target can be specified as a network address, as an
Internet Qualified Name (IQN), or by the World-Wide Name (WWN) of the Internet Qualified Name (IQN), or by the World-Wide Name (WWN) of the
target. target.
enum pnfs_obj_addr_type4 { enum pnfs_obj_addr_type4 {
OBJ_TARGET_NETADDR = 1, OBJ_TARGET_NETADDR = 1,
OBJ_TARGET_IQN = 2, OBJ_TARGET_IQN = 2,
skipping to change at page 5, line 42 skipping to change at page 6, line 24
case OBJ_TARGET_IQN: case OBJ_TARGET_IQN:
string iqn<>; string iqn<>;
case OBJ_TARGET_WWN: case OBJ_TARGET_WWN:
string wwn<>; string wwn<>;
default: default:
void; void;
}; };
uint64_t lun; uint64_t lun;
opaque root_id<>; opaque systemid<>;
opaque osdname<>;
}; };
3. Object-Based Layout 3. Object-Based Layout
The layout4 type is defined in the NFSv4.1 draft [5] as follows: The layout4 type is defined in the NFSv4.1 draft [6] as follows:
enum layouttype4 { enum layouttype4 {
LAYOUT4_NFSV4_1_FILES = 1, LAYOUT4_NFSV4_1_FILES = 1,
LAYOUT4_OSD2_OBJECTS = 2, LAYOUT4_OSD2_OBJECTS = 2,
LAYOUT4_BLOCK_VOLUME = 3 LAYOUT4_BLOCK_VOLUME = 3
}; };
struct layout_content4 { struct layout_content4 {
layouttype4 loc_type; layouttype4 loc_type;
opaque loc_body<>; opaque loc_body<>;
}; };
struct layout4 { struct layout4 {
offset4 lo_offset; offset4 lo_offset;
length4 lo_length; length4 lo_length;
layoutiomode4 lo_iomode; layoutiomode4 lo_iomode;
layout_content4 lo_content; layout_content4 lo_content;
}; };
This draft defines structure associated with the layouttype4 value, This document defines structure associated with the layouttype4
LAYOUT4_OSD2_OBJECTS. The NFSv4.1 draft specifies the loc_body value, LAYOUT4_OSD2_OBJECTS. The NFSv4.1 draft [6] specifies the
structure as an XDR type "opaque". The opaque layout is loc_body structure as an XDR type "opaque". The opaque layout is
uninterpreted by the generic pNFS client layers, but obviously must uninterpreted by the generic pNFS client layers, but obviously must
be interpreted by the object-storage layout driver. This document be interpreted by the object-storage layout driver. This document
defines the structure of this opaque value, pnfs_osd_layout4. defines the structure of this opaque value, pnfs_osd_layout4.
3.1. pnfs_osd_layout4 3.1. pnfs_osd_layout4
struct pnfs_osd_layout4 { struct pnfs_osd_layout4 {
pnfs_osd_data_map4 map; pnfs_osd_data_map4 map;
pnfs_osd_object_cred4 components<>; pnfs_osd_object_cred4 components<>;
}; };
skipping to change at page 7, line 47 skipping to change at page 8, line 27
PNFS_OSD_VERSION_2 = 2 PNFS_OSD_VERSION_2 = 2
}; };
The osd_version is used to indicate the OSD protocol version or The osd_version is used to indicate the OSD protocol version or
whether an object is missing (i.e., unavailable). Some layout whether an object is missing (i.e., unavailable). Some layout
schemes encode redundant information and can compensate for missing schemes encode redundant information and can compensate for missing
components, but the data placement algorithm needs to know what parts components, but the data placement algorithm needs to know what parts
are missing. are missing.
At this time the OSD standard is at version 1.0, and we anticipate a At this time the OSD standard is at version 1.0, and we anticipate a
version 2.0 of the standard ((SNIA T10/1729-D [6])). The second version 2.0 of the standard ((SNIA T10/1729-D [7])). The second
generation OSD protocol has additional proposed features to support generation OSD protocol has additional proposed features to support
more robust error recovery, snapshots, and byte-range capabilities. more robust error recovery, snapshots, and byte-range capabilities.
Therefore, the OSD version is explicitly called out in the Therefore, the OSD version is explicitly called out in the
information returned in the layout. (This information can also be information returned in the layout. (This information can also be
deduced by looking inside the capability type at the format field, deduced by looking inside the capability type at the format field,
which is the first byte. The format value is 0x1 for an OSD v1 which is the first byte. The format value is 0x1 for an OSD v1
capability. However, it seems most robust to call out the version capability. However, it seems most robust to call out the version
explicitly.) explicitly.)
3.1.3. pnfs_osd_object_cred4 3.1.3. pnfs_osd_object_cred4
enum pnfs_osd_cap_key_sec4 {
PNFS_OSD_CAP_KEY_SEC_NONE = 0,
PNFS_OSD_CAP_KEY_SEC_SSV = 1,
};
struct pnfs_osd_object_cred4 { struct pnfs_osd_object_cred4 {
pnfs_osd_objid4 object_id; pnfs_osd_objid4 object_id;
pnfs_osd_version4 osd_version; pnfs_osd_version4 osd_version;
opaque credential<>; pnfs_osd_cap_key_sec4 cap_key_sec;
opaque capability_key<>;
opaque capability<>;
}; };
The pnfs_osd_object_cred4 structure is used to identify each The pnfs_osd_object_cred4 structure is used to identify each
component comprising the file. The object_id identifies the component comprising the file. The object_id identifies the
component object, the osd_version represents the osd protocol component object, the osd_version represents the osd protocol
version, or whether that component is unavailable, and the credential version, or whether that component is unavailable, and the capability
provides the OSD security credentials needed to access that object and capability key, along with the systemid from the
(see Section 8.1 for more details). pnfs_osd_deviceaddr, provide the OSD security credentials needed to
access that object. The cap_key_sec value denotes the method used to
secure the capability_key (see Section 9.1 for more details).
To comply with the OSD security requirements the capability key
SHOULD be transferred securely to prevent eavesdropping (see
Section 9). Therefore, a client SHOULD either issue the LAYOUTGET
operation via RPCSEC_GSS with the privacy service or previously
establish an SSV for the sessions via the NFSv4.1 SET_SSV operation.
The pnfs_osd_cap_key_sec4 type is used to identify the method used by
the server to secure the capability key.
o PNFS_OSD_CAP_KEY_SEC_NONE denotes that the capability_key is not
encrypted, in which case the client SHOULD issue the LAYOUTGET
operation with RPCSEC_GSS with the privacy service or the NFSv4.1
transport should be secured by using methods that are external to
NFSv4.1 like the use of IPsec [8] for transporting the NFSv4.1
protocol.
o PNFS_OSD_CAP_KEY_SEC_SSV denotes that the capability_key contents
are encrypted using the SSV GSS context and the capability key as
inputs to the GSS_Wrap() function (see [3]) with the conf_req_flag
set to TRUE. The client MUST use the secret SSV key as part of
the client's GSS context to decrypt the capability key using the
value of the capability_key field as the input_message to the
GSS_unwrap() function. Note that to prevent eavesdropping of the
SSV key the client SHOULD issue SET_SSV via RPCSEC_GSS with the
privacy service.
The actual method chosen depends on whether the client established an
SSV key with the server and whether it issued the LAYOUTGET operation
with the RPCSEC_GSS privacy method. Naturally, if the client did not
establish an SSV key via SET_SSV the server MUST use the
PNFS_OSD_CAP_KEY_SEC_NONE method. Otherwise, if the LAYOUTGET
operation was not issued with the RPCSEC_GSS privacy method the
server SHOULD secure the capability_key with the
PNFS_OSD_CAP_KEY_SEC_SSV method. The server MAY use the
PNFS_OSD_CAP_KEY_SEC_SSV method also when the LAYOUTGET operation was
issued with the RPCSEC_GSS privacy method.
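The server-side choice described above can be written as a small decision function. This is a sketch, not a normative algorithm: the function name and the prefer_ssv_with_privacy policy flag are assumptions introduced to represent the MAY case, while the enum values are those of pnfs_osd_cap_key_sec4.

```python
from enum import IntEnum


class CapKeySec(IntEnum):
    NONE = 0  # PNFS_OSD_CAP_KEY_SEC_NONE
    SSV = 1   # PNFS_OSD_CAP_KEY_SEC_SSV


def choose_cap_key_sec(ssv_established: bool, layoutget_privacy: bool,
                       prefer_ssv_with_privacy: bool = False) -> CapKeySec:
    """Pick how the capability_key is secured in a LAYOUTGET reply."""
    if not ssv_established:
        # No SSV key was set via SET_SSV: the server MUST use NONE.
        return CapKeySec.NONE
    if not layoutget_privacy:
        # SSV exists but LAYOUTGET had no RPCSEC_GSS privacy: SHOULD use SSV.
        return CapKeySec.SSV
    # Both are available: the server MAY still use SSV (local policy, assumed flag).
    return CapKeySec.SSV if prefer_ssv_with_privacy else CapKeySec.NONE
```

The first branch is the only hard requirement; the second returns the recommended method, and a server with a reason to deviate from the SHOULD would need its own policy there.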
3.1.4. pnfs_osd_raid_algorithm4 3.1.4. pnfs_osd_raid_algorithm4
enum pnfs_osd_raid_algorithm4 { enum pnfs_osd_raid_algorithm4 {
PNFS_OSD_RAID_0 = 1, PNFS_OSD_RAID_0 = 1,
PNFS_OSD_RAID_4 = 2, PNFS_OSD_RAID_4 = 2,
PNFS_OSD_RAID_5 = 3, PNFS_OSD_RAID_5 = 3,
PNFS_OSD_RAID_PQ = 4 /* Reed-Solomon P+Q */ PNFS_OSD_RAID_PQ = 4 /* Reed-Solomon P+Q */
}; };
skipping to change at page 15, line 13 skipping to change at page 16, line 23
object, the result could include different data in the same ranges of object, the result could include different data in the same ranges of
mirrored tuples, or corrupt parity information. It is the mirrored tuples, or corrupt parity information. It is the
responsibility of the metadata server to enforce serialization responsibility of the metadata server to enforce serialization
requirements such as this. For example, the metadata server may do requirements such as this. For example, the metadata server may do
so by not granting overlapping write layouts within mirrored objects. so by not granting overlapping write layouts within mirrored objects.
4. Object-Based Layout Update 4. Object-Based Layout Update
layoutupdate4 is used in the LAYOUTCOMMIT operation to convey updates layoutupdate4 is used in the LAYOUTCOMMIT operation to convey updates
to the layout and additional information to the metadata server. It to the layout and additional information to the metadata server. It
is defined in the NFSv4.1 draft [5] as follows: is defined in the NFSv4.1 draft [6] as follows:
struct layoutupdate4 { struct layoutupdate4 {
layouttype4 lou_type; layouttype4 lou_type;
opaque lou_body<>; opaque lou_body<>;
}; };
The layoutupdate4 type is an opaque value at the generic pNFS client The layoutupdate4 type is an opaque value at the generic pNFS client
level. If the lou_type layout type is LAYOUT4_OSD2_OBJECTS, then the level. If the lou_type layout type is LAYOUT4_OSD2_OBJECTS, then the
lou_body opaque value is defined by the pnfs_osd_layoutupdate4 type. lou_body opaque value is defined by the pnfs_osd_layoutupdate4 type.
skipping to change at page 15, line 50 skipping to change at page 17, line 19
int64_t delta; /* Bytes consumed by write activity */ int64_t delta; /* Bytes consumed by write activity */
case FALSE: case FALSE:
void; void;
}; };
pnfs_osd_deltaspaceused4 is used to convey space utilization pnfs_osd_deltaspaceused4 is used to convey space utilization
information at the time of LAYOUTCOMMIT. For the file system to information at the time of LAYOUTCOMMIT. For the file system to
properly maintain capacity used information, it needs to track how properly maintain capacity used information, it needs to track how
much capacity was consumed by WRITE operations performed by the much capacity was consumed by WRITE operations performed by the
client. In this protocol, the OSD returns the capacity consumed by a client. In this protocol, the OSD returns the capacity consumed by a
write, which can be different because of internal overhead like write, which can be different than the number of bytes written
block-based allocation and indirect blocks, and the client reflects because of internal overhead like block-based allocation and indirect
this back to the pNFS server so it can accurately track quota. The blocks, and the client reflects this back to the pNFS server so it
pNFS server can choose to trust this information coming from the can accurately track quota. The pNFS server can choose to trust this
clients and therefore avoid querying the OSDs at the time of information coming from the clients and therefore avoid querying the
LAYOUTCOMMIT. If the client is unable to obtain this information OSDs at the time of LAYOUTCOMMIT. If the client is unable to obtain
from the OSD, it simply returns invalid delta_space_used. this information from the OSD, it simply returns invalid
delta_space_used.
4.1.2. pnfs_osd_errno4 4.1.2. pnfs_osd_errno4
enum pnfs_osd_errno4 { enum pnfs_osd_errno4 {
PNFS_OSD_NOT_FOUND = 1, PNFS_OSD_ERR_EIO = 1,
PNFS_OSD_NO_SPACE = 2, PNFS_OSD_ERR_NOT_FOUND = 2,
PNFS_OSD_EIO = 3, PNFS_OSD_ERR_NO_SPACE = 3,
PNFS_OSD_BAD_CRED = 4, PNFS_OSD_ERR_BAD_CRED = 4,
PNFS_OSD_NO_ACCESS = 5, PNFS_OSD_ERR_NO_ACCESS = 5,
PNFS_OSD_UNREACHABLE = 6 PNFS_OSD_ERR_UNREACHABLE = 6,
PNFS_OSD_ERR_RESOURCE = 7
}; };
pnfs_osd_errno4 is used to represent error types when read/write pnfs_osd_errno4 is used to represent error types when read/write
errors are reported to the metadata server. errors are reported to the metadata server. The error codes serve as
hints to the metadata server that may help it in diagnosing the exact
o PNFS_OSD_NOT_FOUND indicates the object ID specifics an object that reason for the error and in repairing it.
does not exist on the Object Storage Device.
o PNFS_OSD_NO_SPACE indicates the operation failed because the Object
Storage Device ran out of free capacity during the operation.
o PNFS_OSD_EIO indicates the operation failed because the Object o PNFS_OSD_ERR_EIO indicates the operation failed because the Object
Storage Device experienced a failure trying to access the object. Storage Device experienced a failure trying to access the object.
The most common source of these errors is media errors, but other The most common source of these errors is media errors, but other
internal errors might cause this. In this case, the metadata server internal errors might cause this. In this case, the metadata
should go examine the broken object more closely. server should go examine the broken object more closely, hence it
should be used as the default error code.
o PNFS_OSD_BAD_CRED indicates the security parameters are not valid. o PNFS_OSD_ERR_NOT_FOUND indicates the object ID specifies an object
The primary cause of this is that the capability has expired, or the that does not exist on the Object Storage Device.
security policy tag (i.e., capability version number) has been
changed to revoke capabilities. The client will need to return the
layout and get a new one with fresh capabilities.
o PNFS_OSD_NO_ACCESS indicates the capability does not allow the o PNFS_OSD_ERR_NO_SPACE indicates the operation failed because the
Object Storage Device ran out of free capacity during the
operation.
o PNFS_OSD_ERR_BAD_CRED indicates the security parameters are not
valid. The primary cause of this is that the capability has
expired, or the access policy tag (a.k.a, capability version
number) has been changed to revoke capabilities. The client will
need to return the layout and get a new one with fresh
capabilities.
o PNFS_OSD_ERR_NO_ACCESS indicates the capability does not allow the
requested operation. This should not occur in normal operation requested operation. This should not occur in normal operation
because the metadata server should give out correct capabilities, or because the metadata server should give out correct capabilities,
none at all. or none at all.
o PNFS_OSD_UNREACHABLE indicates the client was unable to contact the o PNFS_OSD_ERR_UNREACHABLE indicates the client did not complete the
Object Storage Device due to a communication failure. I/O operation at the Object Storage Device due to a communication
failure. Whether the I/O operation was executed by the OSD or not
is undetermined.
o PNFS_OSD_ERR_RESOURCE indicates the client did not issue the I/O
operation due to a local problem on the initiator (i.e. client)
side, e.g., when running out of memory. The client MUST guarantee
that the OSD command was never dispatched to the OSD.
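As an illustration of how a metadata server might use these hints, the sketch below mirrors the -04 numbering of pnfs_osd_errno4 and classifies whether the failed command can have reached the OSD. The helper name is hypothetical, and treating every code other than UNREACHABLE and RESOURCE as "reported by the OSD" is an assumption, since PNFS_OSD_ERR_EIO also serves as the default error code.

```python
from enum import IntEnum
from typing import Optional


class OsdErrno(IntEnum):
    """Values as listed in the -04 revision of pnfs_osd_errno4."""
    EIO = 1
    NOT_FOUND = 2
    NO_SPACE = 3
    BAD_CRED = 4
    NO_ACCESS = 5
    UNREACHABLE = 6
    RESOURCE = 7


def command_reached_osd(err: OsdErrno) -> Optional[bool]:
    """False: never dispatched. None: undetermined. True: assumed reported by the OSD."""
    if err is OsdErrno.RESOURCE:
        # The client MUST guarantee the command was never dispatched.
        return False
    if err is OsdErrno.UNREACHABLE:
        # Whether the OSD executed the I/O is undetermined.
        return None
    # Assumption: the remaining codes describe a response from the OSD.
    return True
```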
4.1.3. pnfs_osd_ioerr4 4.1.3. pnfs_osd_ioerr4
struct pnfs_osd_ioerr4 { struct pnfs_osd_ioerr4 {
pnfs_osd_objid4 component; pnfs_osd_objid4 component;
length4 offset; length4 comp_offset;
length4 length; length4 comp_length;
bool iswrite; bool iswrite;
pnfs_osd_errno4 errno; pnfs_osd_errno4 errno;
}; };
The pnfs_osd_ioerr4 structure is used to return error indications for The pnfs_osd_ioerr4 structure is used to return error indications for
objects that generated errors during data transfers. These are hints objects that generated errors during data transfers. These are hints
to the metadata server that there are problems with that object. For to the metadata server that there are problems with that object. For
each error, "component", "offset", and "length" represent the object each error, "component", "comp_offset", and "comp_length" represent
and byte range within the component object in which the error the object and byte range within the component object in which the
occurred. "iswrite" is set to "true" if the failed OSD operation was error occurred. "iswrite" is set to "true" if the failed OSD
data modifying, and "errno" represents the type of error. operation was data modifying, and "errno" represents the type of
error.
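As a rough illustration of how a client might collect these hints before reporting them to the metadata server, the sketch below models the pnfs_osd_ioerr4 fields in Python. The tuple used for "component", the string-valued error enumeration, and the record_failure helper are assumptions made for this example only; the XDR definitions in the draft remain authoritative.

```python
from dataclasses import dataclass
from enum import Enum


class OsdErrno(Enum):
    # Names follow the text; the values are placeholders (assumption),
    # not the draft's XDR numbering.
    NO_SPACE = "PNFS_OSD_ERR_NO_SPACE"
    BAD_CRED = "PNFS_OSD_ERR_BAD_CRED"
    NO_ACCESS = "PNFS_OSD_ERR_NO_ACCESS"
    UNREACHABLE = "PNFS_OSD_ERR_UNREACHABLE"
    RESOURCE = "PNFS_OSD_ERR_RESOURCE"


@dataclass(frozen=True)
class OsdIoErr:
    component: tuple   # stand-in for pnfs_osd_objid4 (treated as opaque here)
    comp_offset: int   # byte offset within the component object
    comp_length: int   # length of the failed byte range
    iswrite: bool      # True if the failed OSD operation modified data
    errno: OsdErrno    # type of error


def record_failure(component, offset, length, iswrite, errno):
    """Build one error hint for a failed component I/O (hypothetical helper)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    return OsdIoErr(component, offset, length, iswrite, errno)


# A failed 4 KiB write at offset 64 KiB of one component object.
err = record_failure(("dev0", 1, 42), 65536, 4096, True, OsdErrno.UNREACHABLE)
```

Keeping the record immutable reflects that these are hints about a past failure, not state the client is expected to update.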
5. Object-Based Creation Layout Hint 5. Object-Based Creation Layout Hint
The layouthint4 type is defined in the NFSv4.1 draft [5] as follows: The layouthint4 type is defined in the NFSv4.1 draft [6] as follows:
struct layouthint4 { struct layouthint4 {
layouttype4 loh_type; layouttype4 loh_type;
opaque loh_body<>; opaque loh_body<>;
}; };
The layouthint4 structure is used by the client to pass in a hint The layouthint4 structure is used by the client to pass in a hint
about the type of layout it would like created for a particular file. about the type of layout it would like created for a particular file.
If the loh_type layout type is LAYOUT4_OSD2_OBJECTS, then the If the loh_type layout type is LAYOUT4_OSD2_OBJECTS, then the
loh_body opaque value is defined by the pnfs_osd_layouthint4 type. loh_body opaque value is defined by the pnfs_osd_layouthint4 type.
skipping to change at page 20, line 17 skipping to change at page 21, line 47
The object-based metadata server should recall outstanding layouts in The object-based metadata server should recall outstanding layouts in
the following cases: the following cases:
o When the file's security policy changes, i.e. ACLs or permission o When the file's security policy changes, i.e. ACLs or permission
mode bits are set. mode bits are set.
o When the file's aggregation map changes, rendering outstanding o When the file's aggregation map changes, rendering outstanding
layouts invalid. layouts invalid.
o When there are sharing conflicts. For example, the server will o When there are sharing conflicts. For example, the server will
issue stripe aligned layout segments for RAID-5 objects. To prevent issue stripe aligned layout segments for RAID-5 objects. To
corruption of the file's parity, multiple clients must not hold valid prevent corruption of the file's parity, multiple clients must not
write layouts for the same stripes. An outstanding RW layout should hold valid write layouts for the same stripes. An outstanding RW
be recalled when a conflicting LAYOUTGET is received from a different layout should be recalled when a conflicting LAYOUTGET is received
client for LAYOUTIOMODE_RW and for a byte-range overlapping with the from a different client for LAYOUTIOMODE4_RW and for a byte-range
outstanding layout segment. overlapping with the outstanding layout segment.
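The sharing-conflict rule can be sketched as a simple byte-range overlap test. This is only an illustration under stated assumptions: the Segment type and layouts_to_recall function are hypothetical names, and stripe alignment of RAID-5 segments is not modeled.

```python
from dataclasses import dataclass

READ, RW = "LAYOUTIOMODE4_READ", "LAYOUTIOMODE4_RW"


@dataclass(frozen=True)
class Segment:
    client: str
    iomode: str
    offset: int
    length: int

    def overlaps(self, other):
        # Two half-open ranges [offset, offset + length) intersect.
        return (self.offset < other.offset + other.length
                and other.offset < self.offset + self.length)


def layouts_to_recall(outstanding, request):
    """Return outstanding RW segments that conflict with an RW LAYOUTGET."""
    if request.iomode != RW:
        return []
    return [seg for seg in outstanding
            if seg.iomode == RW
            and seg.client != request.client
            and seg.overlaps(request)]
```

Only RW segments held by a different client and overlapping the requested range are selected; READ layouts and the requester's own layouts are left alone in this sketch.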
8. Security Considerations 7.1. CB_RECALL_ANY
The metadata server can use the CB_RECALL_ANY callback operation to
notify the client to return some or all of its layouts. The NFSv4.1
draft [6] defines the following types:
const RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8;
const RCA4_TYPE_MASK_OBJ_LAYOUT_MAX = 11;
struct CB_RECALL_ANY4args {
uint32_t craa_objects_to_keep;
bitmap4 craa_type_mask;
};
Typically, CB_RECALL_ANY will be used to recall client state when the
server needs to reclaim resources. The craa_type_mask bitmap
specifies the type of resources that are recalled and the
craa_objects_to_keep value specifies how many of the recalled objects
the client is allowed to keep. The object-based layout type mask
flags are defined as follows. They represent the iomode of the
recalled layouts. In response, the client SHOULD return layouts of
the recalled iomode that it needs the least, keeping at most
craa_objects_to_keep object-based layouts.
const PNFS_OSD_RCA4_TYPE_MASK_READ = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
const PNFS_OSD_RCA4_TYPE_MASK_RW = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN+1;
const PNFS_OSD_RCA4_TYPE_MASK_ANY = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN+2;
The PNFS_OSD_RCA4_TYPE_MASK_READ flag notifies the client to return
layouts of iomode LAYOUTIOMODE4_READ. Similarly, the
PNFS_OSD_RCA4_TYPE_MASK_RW flag notifies the client to return layouts
of iomode LAYOUTIOMODE4_RW. The PNFS_OSD_RCA4_TYPE_MASK_ANY flag
notifies the client to return layouts of either iomode.
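A client's handling of these flags might look like the following sketch. It assumes that each constant is a bit number within craa_type_mask (modeled here as a single integer), that craa_objects_to_keep is counted against layouts of the recalled iomodes, and that "needed least" means least recently used. The function names and the tuple layout are hypothetical.

```python
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8
PNFS_OSD_RCA4_TYPE_MASK_READ = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN
PNFS_OSD_RCA4_TYPE_MASK_RW = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN + 1
PNFS_OSD_RCA4_TYPE_MASK_ANY = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN + 2

READ, RW = "LAYOUTIOMODE4_READ", "LAYOUTIOMODE4_RW"


def recalled_iomodes(type_mask):
    """Map the object-layout bits of craa_type_mask to a set of iomodes."""
    modes = set()
    if type_mask & (1 << PNFS_OSD_RCA4_TYPE_MASK_READ):
        modes.add(READ)
    if type_mask & (1 << PNFS_OSD_RCA4_TYPE_MASK_RW):
        modes.add(RW)
    if type_mask & (1 << PNFS_OSD_RCA4_TYPE_MASK_ANY):
        modes.update((READ, RW))
    return modes


def layouts_to_return(layouts, type_mask, objects_to_keep):
    """Pick layouts to return, least recently used first.

    `layouts` is a list of (name, iomode, last_used) tuples.
    """
    modes = recalled_iomodes(type_mask)
    candidates = sorted((l for l in layouts if l[1] in modes),
                        key=lambda l: l[2])
    excess = max(0, len(candidates) - objects_to_keep)
    return candidates[:excess]
```

A real client would weigh more than recency, for example whether a layout has uncommitted writes, before deciding which layouts to give up.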
8. Client Fencing
In cases where clients are uncommunicative and their lease has
expired, or when clients fail to return recalled layouts in a timely
manner, the server MAY revoke client layouts and/or device address
mappings and reassign these resources to other clients. To avoid
data corruption, the metadata server MUST fence off the revoked
clients from the respective objects as described in Section 9.4.
9. Security Considerations
The pNFS extension partitions the NFSv4 file system protocol into two The pNFS extension partitions the NFSv4 file system protocol into two
parts, the control path and the data path (storage protocol). The parts, the control path and the data path (storage protocol). The
control path contains all the new operations described by this control path contains all the new operations described by this
extension; all existing NFSv4 security mechanisms and features apply extension; all existing NFSv4 security mechanisms and features apply
to the control path. The combination of components in a pNFS system to the control path. The combination of components in a pNFS system
is required to preserve the security properties of NFSv4 with respect is required to preserve the security properties of NFSv4 with respect
to an entity accessing data via a client, including security to an entity accessing data via a client, including security
countermeasures to defend against threats that NFSv4 provides countermeasures to defend against threats that NFSv4 provides
defenses for in environments where these threats are considered defenses for in environments where these threats are considered
significant. significant.
The metadata server enforces the file access-control policy at
LAYOUTGET time. The client should use suitable authorization
credentials for getting the layout for the requested iomode (READ or
RW) and the server verifies the permissions and ACL for these
credentials, possibly returning NFS4ERR_ACCESS if the client is not
allowed the requested iomode. If the LAYOUTGET operation succeeds
the client receives, as part of the layout, a set of object
capabilities allowing it I/O access to the specified objects
corresponding to the requested iomode. When the client performs I/O
operations on behalf of its local users, it MUST authenticate and
authorize each user by issuing the respective OPEN and ACCESS calls
to the metadata server, as it would when holding an NFSv4 data
delegation. If
access is allowed the client uses the corresponding (READ or RW)
capabilities to perform the I/O operations at the object-storage
devices. When the metadata server receives a request to change a
file's permissions or ACL, it SHOULD recall all layouts for that file
and it MUST change the capability version attribute on all objects
comprising the file to implicitly invalidate any outstanding
capabilities before committing to the new permissions and ACL. Doing
this will ensure that clients re-authorize their layouts according to
the modified permissions and ACL by requesting new layouts.
Recalling the layouts in this case is a courtesy of the server, intended
to prevent clients from getting an error on I/Os done after the
capability version changed.
The object storage protocol MUST implement the security aspects The object storage protocol MUST implement the security aspects
described in version 1 of the T10 OSD protocol definition [2]. The described in version 1 of the T10 OSD protocol definition [2]. The
remainder of this section gives an overview of the security mechanism standard defines four security methods: NOSEC, CAPKEY, CMDRSP, and
described in that standard. The goal is to give the reader a basic ALLDATA. To provide a minimum level of security allowing verification
understanding of the object security model. Any discrepancies and enforcement of the server access control policy using the layout
between this text and the actual standard are obviously to be security credentials, the NOSEC security method MUST NOT be used for
resolved in favor of the OSD standard. I/O operations. It MAY only be used to get the System ID attribute
when the metadata server provided only the OSD name with the device
address. The remainder of this section gives an overview of the
security mechanism described in that standard. The goal is to give
the reader a basic understanding of the object security model. Any
discrepancies between this text and the actual standard are obviously
to be resolved in favor of the OSD standard.
8.1. OSD Security Data Types 9.1. OSD Security Data Types
There are three main data types associated with object security: a There are three main data types associated with object security: a
capability, a credential, and security parameters. The capability is capability, a credential, and security parameters. The capability is
a set of fields that specifies an object and what operations can be a set of fields that specifies an object and what operations can be
performed on it. A credential is a signed capability. Only a performed on it. A credential is a signed capability. Only a
security manager that knows the secret device keys can correctly sign security manager that knows the secret device keys can correctly sign
a capability to form a valid credential. In pNFS, the file server a capability to form a valid credential. In pNFS, the file server
acts as the security manager and returns signed capabilities (i.e., acts as the security manager and returns signed capabilities (i.e.,
credentials) to the pNFS client. The security parameters are values credentials) to the pNFS client. The security parameters are values
computed by the issuer of OSD commands (i.e., the client) that prove computed by the issuer of OSD commands (i.e., the client) that prove
they hold valid credentials. The client uses the credential as a they hold valid credentials. The client uses the credential as a
signing key to sign the requests it makes to OSD, and puts the signing key to sign the requests it makes to OSD, and puts the
resulting signatures into the security_parameters field of the OSD resulting signatures into the security_parameters field of the OSD
command. The object storage device uses the secret keys it shares command. The object storage device uses the secret keys it shares
with the security manager to validate the signature values in the with the security manager to validate the signature values in the
security parameters. security parameters.
The security types are opaque to the generic layers of the pNFS The security types are opaque to the generic layers of the pNFS
client. The credential is defined as opaque within the client. The credential contents are defined as opaque within the
pnfs_osd_and_cred type. Instead of repeating the definitions here, pnfs_osd_object_cred4 type. Instead of repeating the definitions
the reader is referred to section 4.9.2.2 of the OSD standard. here, the reader is referred to section 4.9.2.2 of the OSD standard.
8.2. The OSD Security Protocol 9.2. The OSD Security Protocol
The object storage protocol relies on a cryptographically secure The object storage protocol relies on a cryptographically secure
capability to control accesses at the object storage devices. capability to control accesses at the object storage devices.
Capabilities are generated by the metadata server, returned to the Capabilities are generated by the metadata server, returned to the
client, and used by the client as described below to authenticate client, and used by the client as described below to authenticate
their requests to the Object Storage Device (OSD). Capabilities their requests to the Object Storage Device (OSD). Capabilities
therefore achieve the required access and open mode checking. They therefore achieve the required access and open mode checking. They
allow the file server to define and check a policy (e.g., open mode) allow the file server to define and check a policy (e.g., open mode)
and the OSD to enforce that policy without knowing the details (e.g., and the OSD to enforce that policy without knowing the details (e.g.,
user IDs and ACLs). user IDs and ACLs).
skipping to change at page 21, line 45 skipping to change at page 25, line 8
permissions. The server SHOULD recall layouts to allow clients to permissions. The server SHOULD recall layouts to allow clients to
gracefully return their capabilities before the access permissions gracefully return their capabilities before the access permissions
change. change.
Each capability is specific to a particular object, an operation on Each capability is specific to a particular object, an operation on
that object, a byte range within the object (in OSDv2), and has an that object, a byte range within the object (in OSDv2), and has an
explicit expiration time. The capabilities are signed with a secret explicit expiration time. The capabilities are signed with a secret
key that is shared by the object storage devices (OSD) and the key that is shared by the object storage devices (OSD) and the
metadata managers. Clients do not have device keys so they are metadata managers. Clients do not have device keys so they are
unable to forge the signatures in the security parameters. The unable to forge the signatures in the security parameters. The
combination of a capability and its signature is called a combination of a capability, the OSD system id, and a signature is
"credential" in the OSD specification. called a "credential" in the OSD specification.
The details of the security and privacy model for Object Storage are The details of the security and privacy model for Object Storage are
defined in the T10 OSD standard. The following sketch of the defined in the T10 OSD standard. The following sketch of the
algorithm should help the reader understand the basic model. algorithm should help the reader understand the basic model.
LAYOUTGET returns a CapKey, which is also called a credential. It is LAYOUTGET returns a CapKey and a Cap which, together with the OSD
a capability and a signature over that capability. SystemID, are also called a credential. It is a capability and a
signature over that capability and the SystemID. The OSD Standard
refers to the CapKey as the "Credential integrity check value" and to
the ReqMAC as the "Request integrity check value".
CapKey = MAC<SecretKey>(CapArgs) CapKey = MAC<SecretKey>(Cap, SystemID)
Credential = {CapKey, CapArgs} Credential = {Cap, SystemID, CapKey}
The client uses CapKey to sign all the requests it issues for that The client uses CapKey to sign all the requests it issues for that
object using the respective CapArgs. In other words, the CapArgs object using the respective Cap. In other words, the Cap appears in
appears in the request to the storage device, and that request is the request to the storage device, and that request is signed with
signed with the CapKey as follows: the CapKey as follows:
ReqMAC = MAC<CapKey>(Req, Nonceln) ReqMAC = MAC<CapKey>(Req, ReqNonce)
Request = {CapArgs, Req, Nonceln, ReqMAC} Request = {Cap, Req, ReqNonce, ReqMAC}
The following is sent to the OSD: {CapArgs, Req, Nonceln, ReqMAC}. The following is sent to the OSD: {Cap, Req, ReqNonce, ReqMAC}. The
The OSD uses the SecretKey it shares with the metadata server to OSD uses the SecretKey it shares with the metadata server to compare
compare the ReqMAC the client sent with a locally computed value: the ReqMAC the client sent with a locally computed value:
MAC<MAC<SecretKey>(CapArgs)>(Req, Nonceln) LocalCapKey = MAC<SecretKey>(Cap, SystemID)
LocalReqMAC = MAC<LocalCapKey>(Req, ReqNonce)
and if they match the OSD assumes that the capabilities came from an and if they match the OSD assumes that the capabilities came from an
authentic metadata server and allows access to the object, as allowed authentic metadata server and allows access to the object, as allowed
by the CapArgs. Therefore, if the server LAYOUTGET reply, holding by the Cap.
CapKey and CapArgs, is snooped by another client, it can be used to
generate valid OSD requests (within the CapArgs access restriction).
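The two-level signing scheme sketched above can be exercised end to end with a few lines of Python. This is an illustrative sketch only: HMAC-SHA1 stands in for the MAC function, the length-prefixed field framing is an assumption rather than the encoding used by the OSD standard, and all function and dictionary key names are hypothetical.

```python
import hashlib
import hmac


def mac(key, *fields):
    """HMAC-SHA1 over length-prefixed fields (the framing is an assumption)."""
    h = hmac.new(key, digestmod=hashlib.sha1)
    for field in fields:
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


def issue_credential(secret_key, cap, system_id):
    """Metadata server: CapKey = MAC<SecretKey>(Cap, SystemID)."""
    return {"cap": cap, "system_id": system_id,
            "cap_key": mac(secret_key, cap, system_id)}


def sign_request(cred, req, nonce):
    """Client: ReqMAC = MAC<CapKey>(Req, ReqNonce)."""
    return {"cap": cred["cap"], "req": req, "nonce": nonce,
            "req_mac": mac(cred["cap_key"], req, nonce)}


def osd_verify(secret_key, system_id, request):
    """OSD: recompute both MACs from the shared secret and compare."""
    local_cap_key = mac(secret_key, request["cap"], system_id)
    local_req_mac = mac(local_cap_key, request["req"], request["nonce"])
    return hmac.compare_digest(local_req_mac, request["req_mac"])
```

Note that the client never sees the secret key: it holds only the CapKey derived from it, so altering the Cap in a request changes the key the OSD recomputes and the comparison fails.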
To provide the required privacy requirements for the capabilities 9.3. Protocol Privacy Requirements
returned by LAYOUTGET, the GSS-API can be used, e.g. by using a
session key known to the file server and to the client to encrypt the
whole layout or parts of it. Two general ways to provide privacy in
the absence of GSS-API that are independent of NFSv4 are either an
isolated network such as a VLAN or a secure channel provided by
IPsec.
8.3. Revoking capabilities Note that if the server LAYOUTGET reply, holding CapKey and Cap, is
snooped by another client, it can be used to generate valid OSD
requests (within the Cap access restrictions).
To meet the privacy requirements for the capability key
returned by LAYOUTGET, the GSS-API can be used, e.g. by using the
RPCSEC_GSS privacy method to send the LAYOUTGET operation or by using
the SSV key to encrypt the capability_key using the GSS_Wrap()
function. Two general ways to provide privacy in the absence of GSS-
API that are independent of NFSv4 are either an isolated network such
as a VLAN or a secure channel provided by IPsec [8].
9.4. Revoking Capabilities
At any time, the metadata server may invalidate all outstanding At any time, the metadata server may invalidate all outstanding
capabilities on an object by changing its capability version capabilities on an object by changing its POLICY ACCESS TAG
attribute. There is also a "fence bit" attribute that the metadata attribute. The value of the POLICY ACCESS TAG is part of a
server can toggle to temporarily block access without permanently capability, and it must match the state of the object attribute. If
revoking capabilities. The value of the fence bit and the capability they do not match, the OSD rejects accesses to the object with the
version are part of a capability, and they must match the state of sense key set to ILLEGAL REQUEST and an additional sense code set to
the attributes. If they do not match, the OSD rejects accesses to INVALID FIELD IN CDB. When a client attempts to use a capability and
the object. When a client attempts to use a capability and discovers is rejected this way, it should issue a LAYOUTCOMMIT for the object
a capability version mismatch, it should issue a LAYOUTRETURN for the and specify PNFS_OSD_ERR_BAD_CRED in the ioerr parameter. The client may
object and specify PNFS_OSD_BAD_CRED in the pnfs_osd_ioerr parameter. elect to issue a compound LAYOUTRETURN/LAYOUTGET (or LAYOUTCOMMIT/
The client may elect to issue a compound LAYOUTRETURN/LAYOUTGET (or LAYOUTRETURN/LAYOUTGET) to attempt to fetch a refreshed set of
LAYOUTCOMMIT/LAYOUTRETURN/LAYOUTGET) to attempt to fetch a refreshed capabilities.
set of capabilities.
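The revocation mechanism can be pictured with a toy model in which the OSD compares a tag carried in the capability against the object's current attribute. The class, the dictionary-based capability, and the integer tag are assumptions for illustration; real OSDs report the mismatch through SCSI sense data, and the client then reports the error to the metadata server.

```python
class OsdObject:
    """Toy object holding only the attribute relevant to revocation."""

    def __init__(self, policy_access_tag=1):
        self.policy_access_tag = policy_access_tag

    def check(self, capability):
        """Accept a capability only if its tag matches the object attribute."""
        return capability["policy_access_tag"] == self.policy_access_tag


def revoke_all(obj):
    """Metadata server side: invalidate every outstanding capability."""
    obj.policy_access_tag += 1


def client_io(obj, capability):
    """Client side: report a bad credential when the OSD rejects the tag."""
    if obj.check(capability):
        return "OK"
    return "PNFS_OSD_ERR_BAD_CRED"
```

Because the tag lives on the object, a single attribute update invalidates all outstanding capabilities at once, without the server having to track which clients hold them.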
The metadata server may elect to change the capability version on an The metadata server may elect to change the access policy tag on an
object at any time, for any reason (with the understanding that there object at any time, for any reason (with the understanding that there
is likely an associated performance penalty, especially if there are is likely an associated performance penalty, especially if there are
outstanding layouts for this object). The metadata server MUST outstanding layouts for this object). The metadata server MUST
revoke outstanding capabilities when any one of the following occurs: revoke outstanding capabilities when any one of the following occurs:
(1) the permissions on the object change, (2) a conflicting mandatory
byte-range lock is granted. o the permissions on the object change,
o a conflicting mandatory byte-range lock is granted, or
o a layout is revoked and reassigned to another client.
A pNFS client will typically hold one layout for each byte range for A pNFS client will typically hold one layout for each byte range for
either READ or READ/WRITE. It is the pNFS client's responsibility to either READ or READ/WRITE. The client's credentials are checked by
enforce access control among multiple users accessing the same file. the metadata server at LAYOUTGET time and it is the client's
It is neither required nor expected that the pNFS client will obtain responsibility to enforce access control among multiple users
a separate layout for each user accessing a shared object. The accessing the same file. It is neither required nor expected that
client SHOULD use ACCESS calls to check user permissions when the pNFS client will obtain a separate layout for each user accessing
performing I/O so that the server's access control policies are a shared object. The client SHOULD use OPEN and ACCESS calls to
correctly enforced. The result of the ACCESS operation may be cached check user permissions when performing I/O so that the server's
indefinitely, as the server is expected to recall layouts when the access control policies are correctly enforced. The result of the
file's access permissions or ACL change. ACCESS operation may be cached while the client holds a valid layout
as the server is expected to recall layouts when the file's access
permissions or ACL change.
9. IANA Considerations 10. IANA Considerations
As described in the NFSv4.1 draft [5], new layout type numbers will As described in the NFSv4.1 draft [6], new layout type numbers will
be requested from IANA. This document defines the protocol be requested from IANA. This document defines the protocol
associated with the existing layout type number, associated with the existing layout type number,
LAYOUT4_OSD2_OBJECTS, and it requires no further actions for IANA. LAYOUT4_OSD2_OBJECTS, and it requires no further actions for IANA.
10. References 11. References
10.1. Normative References 11.1. Normative References
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", RFC 2119, March 1997. Levels", RFC 2119, March 1997.
[2] Weber, R., "SCSI Object-Based Storage Device Commands", [2] Weber, R., "SCSI Object-Based Storage Device Commands",
July 2004, <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. July 2004, <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>.
[3] Eisler, M., "XDR: External Data Representation Standard", [3] Linn, J., "Generic Security Service Application Program
Interface Version 2, Update 1", RFC 2743, January 2000.
[4] Eisler, M., "XDR: External Data Representation Standard",
STD 67, RFC 4506, May 2006. STD 67, RFC 4506, May 2006.
[4] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, [5] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) C., Eisler, M., and D. Noveck, "Network File System (NFS)
version 4 Protocol", RFC 3530, April 2003. version 4 Protocol", RFC 3530, April 2003.
10.2. Informative References 11.2. Informative References
[5] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1", [6] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1",
March 2007, <http://www.ietf.org/internet-drafts/ March 2007, <http://www.ietf.org/internet-drafts/
draft-ietf-nfsv4-minorversion1-10.txt>. draft-ietf-nfsv4-minorversion1-13.txt>.
[6] Weber, R., "SCSI Object-Based Storage Device Commands -2 [7] Weber, R., "SCSI Object-Based Storage Device Commands -2
(OSD-2)", January 2007, (OSD-2)", January 2007,
<http://www.t10.org/ftp/t10/drafts/osd2/osd2r01.pdf>. <http://www.t10.org/ftp/t10/drafts/osd2/osd2r02.pdf>.
[8] Kent, S. and K. Seo, "Security Architecture for the Internet
Protocol", RFC 4301, December 2005.
Appendix A. Acknowledgments Appendix A. Acknowledgments
Todd Pisek was a co-editor of the initial drafts for this document. Todd Pisek was a co-editor of the initial drafts for this document.
Authors' Addresses Authors' Addresses
Benny Halevy Benny Halevy
Panasas, Inc. Panasas, Inc.
1501 Reedsdale St. Suite 400 1501 Reedsdale St. Suite 400
 End of changes. 64 change blocks. 
182 lines changed or deleted 367 lines changed or added
