draft-ietf-nfsv4-scsi-layout-09.txt   draft-ietf-nfsv4-scsi-layout-10.txt 
NFSv4 C. Hellwig NFSv4 C. Hellwig
Internet-Draft Internet-Draft
Intended status: Standards Track September 06, 2016 Intended status: Standards Track December 05, 2016
Expires: March 10, 2017 Expires: June 8, 2017
Parallel NFS (pNFS) SCSI Layout Parallel NFS (pNFS) SCSI Layout
draft-ietf-nfsv4-scsi-layout-09.txt draft-ietf-nfsv4-scsi-layout-10.txt
Abstract Abstract
The Parallel Network File System (pNFS) allows a separation between The Parallel Network File System (pNFS) allows a separation between
the metadata (onto a metadata server) and data (onto a storage the metadata (onto a metadata server) and data (onto a storage
device) for a file. The SCSI Layout Type is defined in this document device) for a file. The SCSI Layout Type is defined in this document
as an extension to pNFS to allow the use of SCSI-based block storage as an extension to pNFS to allow the use of SCSI-based block storage
devices. devices.
Status of This Memo Status of This Memo
skipping to change at page 1, line 34 skipping to change at page 1, line 34
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 10, 2017. This Internet-Draft will expire on June 8, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 42 skipping to change at page 2, line 42
2.8. Volatile write caches . . . . . . . . . . . . . . . . . . 23 2.8. Volatile write caches . . . . . . . . . . . . . . . . . . 23
3. Enforcing NFSv4 Semantics . . . . . . . . . . . . . . . . . . 24 3. Enforcing NFSv4 Semantics . . . . . . . . . . . . . . . . . . 24
3.1. Use of Open Stateids . . . . . . . . . . . . . . . . . . 24 3.1. Use of Open Stateids . . . . . . . . . . . . . . . . . . 24
3.2. Enforcing Security Restrictions . . . . . . . . . . . . . 25 3.2. Enforcing Security Restrictions . . . . . . . . . . . . . 25
3.3. Enforcing Locking Restrictions . . . . . . . . . . . . . 25 3.3. Enforcing Locking Restrictions . . . . . . . . . . . . . 25
4. Security Considerations . . . . . . . . . . . . . . . . . . . 26 4. Security Considerations . . . . . . . . . . . . . . . . . . . 26
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27
6. Normative References . . . . . . . . . . . . . . . . . . . . 27 6. Normative References . . . . . . . . . . . . . . . . . . . . 27
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 28 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 28
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 28 Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 28
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 28 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 29
1. Introduction 1. Introduction
Figure 1 shows the overall architecture of a Parallel NFS (pNFS) Figure 1 shows the overall architecture of a Parallel NFS (pNFS)
system: system:
+-----------+ +-----------+
|+-----------+ +-----------+ |+-----------+ +-----------+
||+-----------+ | | ||+-----------+ | |
||| | NFSv4.1 + pNFS | | ||| | NFSv4.1 + pNFS | |
skipping to change at page 9, line 23 skipping to change at page 9, line 23
combinations of "sbv_code_set" and "sbv_designator_type" are valid, combinations of "sbv_code_set" and "sbv_designator_type" are valid,
please refer to [SPC4] for details, and note that ASCII MAY be used please refer to [SPC4] for details, and note that ASCII MAY be used
as the code set for UTF-8 text that contains only printable ASCII as the code set for UTF-8 text that contains only printable ASCII
characters. Note that a Device Identification VPD page MAY contain characters. Note that a Device Identification VPD page MAY contain
multiple descriptors with the same association, code set and multiple descriptors with the same association, code set and
designator type. NFS clients thus MUST check all the descriptors for designator type. NFS clients thus MUST check all the descriptors for
a possible match to "sbv_code_set", "sbv_designator_type" and a possible match to "sbv_code_set", "sbv_designator_type" and
"sbv_designator". "sbv_designator".
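The matching rule above can be illustrated with a minimal Python sketch. It assumes the client already holds the raw bytes of the Device Identification VPD page (page code 0x83) and that each designation descriptor uses the 4-byte header laid out in [SPC4] (code set in the low nibble of byte 0, designator type in the low nibble of byte 1, designator length in byte 3); verify this layout against [SPC4] before relying on it. The names "iter_designators" and "volume_matches" are hypothetical, and the sketch does not filter on the association field.

```python
from typing import Iterator, NamedTuple


class Designator(NamedTuple):
    code_set: int
    designator_type: int
    designator: bytes


def iter_designators(vpd_page: bytes) -> Iterator[Designator]:
    """Yield every designation descriptor found in a VPD page 0x83 buffer."""
    if len(vpd_page) < 4 or vpd_page[1] != 0x83:
        raise ValueError("not a Device Identification VPD page")
    page_length = int.from_bytes(vpd_page[2:4], "big")
    end = min(4 + page_length, len(vpd_page))
    pos = 4
    while pos + 4 <= end:
        length = vpd_page[pos + 3]
        body = vpd_page[pos + 4 : pos + 4 + length]
        if len(body) < length:
            break  # truncated descriptor: stop rather than guess
        yield Designator(
            code_set=vpd_page[pos] & 0x0F,
            designator_type=vpd_page[pos + 1] & 0x0F,
            designator=bytes(body),
        )
        pos += 4 + length


def volume_matches(vpd_page: bytes, sbv_code_set: int,
                   sbv_designator_type: int, sbv_designator: bytes) -> bool:
    """Check ALL descriptors, since several may share code set and type."""
    return any(
        d.code_set == sbv_code_set
        and d.designator_type == sbv_designator_type
        and d.designator == sbv_designator
        for d in iter_designators(vpd_page)
    )
```

Stopping at the first descriptor with a matching code set and designator type would be incorrect here, because a later descriptor with the same pair may carry the designator the server returned.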
Storage devices such as storage arrays can have multiple physical Storage devices such as storage arrays can have multiple physical
network ports that need not be connected to a common network, network interfaces that need not be connected to a common network,
resulting in a pNFS client having simultaneous multipath access to resulting in a pNFS client having simultaneous multipath access to
the same storage volumes via different ports on different networks. the same storage volumes via different ports on different networks.
Selection of one or multiple ports to access the storage device is Selection of one or multiple ports to access the storage device is
left up to the client. left up to the client.
Additionally the server returns a Persistent Reservation key in the Additionally the server returns a Persistent Reservation key in the
"sbv_pr_key" field. See Section 2.4.10 for more details on the use "sbv_pr_key" field. See Section 2.4.10 for more details on the use
of Persistent Reservations. of Persistent Reservations.
2.3.2. Volume Topology 2.3.2. Volume Topology
skipping to change at page 11, line 49 skipping to change at page 11, line 49
element of the array. Concat, slice, and stripe volumes MUST refer element of the array. Concat, slice, and stripe volumes MUST refer
to volumes defined by lower indexed elements of the array. to volumes defined by lower indexed elements of the array.
The "pnfs_scsi_device_addr4" data structure is returned by the server The "pnfs_scsi_device_addr4" data structure is returned by the server
as the storage-protocol-specific opaque field da_addr_body in the as the storage-protocol-specific opaque field da_addr_body in the
"device_addr4" structure by a successful GETDEVICEINFO operation "device_addr4" structure by a successful GETDEVICEINFO operation
[RFC5661]. [RFC5661].
As noted above, all device_addr4 structures eventually resolve to a As noted above, all device_addr4 structures eventually resolve to a
set of volumes of type PNFS_SCSI_VOLUME_BASE. Complicated volume set of volumes of type PNFS_SCSI_VOLUME_BASE. Complicated volume
hierarchies MAY be composed of dozens of volumes each with several hierarchies may be composed of dozens of volumes each with several
components; thus, the device address MAY require several kilobytes. components; thus, the device address may require several kilobytes.
The client SHOULD be prepared to allocate a large buffer to contain The client SHOULD be prepared to allocate a large buffer to contain
the result. In the case of the server returning NFS4ERR_TOOSMALL, the result. In the case of the server returning NFS4ERR_TOOSMALL,
the client SHOULD allocate a buffer of at least gdir_mincount_bytes the client SHOULD allocate a buffer of at least gdir_mincount_bytes
to contain the expected result and retry the GETDEVICEINFO request. to contain the expected result and retry the GETDEVICEINFO request.
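The buffer-sizing behavior described above might look like the following Python sketch. The "send" callable, the "GetDeviceInfoResult" type, and its "mincount" field (standing in for the minimum size the server reports with NFS4ERR_TOOSMALL) are all hypothetical stand-ins for a real client's RPC layer, and the 64 KiB starting size is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Callable

NFS4_OK = "NFS4_OK"
NFS4ERR_TOOSMALL = "NFS4ERR_TOOSMALL"


@dataclass
class GetDeviceInfoResult:
    status: str
    da_addr_body: bytes = b""
    mincount: int = 0  # minimum buffer size reported on NFS4ERR_TOOSMALL


def fetch_device_addr(send: Callable[[int], GetDeviceInfoResult],
                      initial_maxcount: int = 64 * 1024,
                      max_attempts: int = 3) -> bytes:
    """Issue GETDEVICEINFO, growing the reply buffer if the server asks."""
    maxcount = initial_maxcount
    for _ in range(max_attempts):
        res = send(maxcount)
        if res.status == NFS4_OK:
            return res.da_addr_body
        if res.status != NFS4ERR_TOOSMALL:
            raise RuntimeError(f"GETDEVICEINFO failed: {res.status}")
        # Allocate at least what the server asked for; doubling is a
        # local choice that guarantees progress if the hint is unusable.
        maxcount = max(res.mincount, maxcount * 2)
    raise RuntimeError("GETDEVICEINFO reply still too large after retries")
```

Starting with a generous buffer keeps the common case to a single round trip, while the bounded retry loop handles unusually deep volume hierarchies.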
2.4. Data Structures: Extents and Extent Lists 2.4. Data Structures: Extents and Extent Lists
A pNFS SCSI layout is a list of extents within a flat array of data A pNFS SCSI layout is a list of extents within a flat array of data
blocks in a volume. The details of the volume topology can be blocks in a volume. The details of the volume topology can be
determined by using the GETDEVICEINFO operation. The SCSI layout determined by using the GETDEVICEINFO operation. The SCSI layout
skipping to change at page 13, line 47 skipping to change at page 13, line 47
LU. The se_file_offset, se_length, and se_state fields for an extent LU. The se_file_offset, se_length, and se_state fields for an extent
returned from the server are valid for all extents. In contrast, the returned from the server are valid for all extents. In contrast, the
interpretation of the se_storage_offset field depends on the value of interpretation of the se_storage_offset field depends on the value of
se_state as follows (in increasing order): se_state as follows (in increasing order):
PNFS_SCSI_READ_WRITE_DATA means that se_storage_offset is valid, and PNFS_SCSI_READ_WRITE_DATA means that se_storage_offset is valid, and
points to valid/initialized data that can be read and written. points to valid/initialized data that can be read and written.
PNFS_SCSI_READ_DATA means that se_storage_offset is valid and points PNFS_SCSI_READ_DATA means that se_storage_offset is valid and points
to valid/initialized data that can only be read. Write operations to valid/initialized data that can only be read. Write operations
are prohibited; the client MAY need to request a read-write are prohibited.
layout.
PNFS_SCSI_INVALID_DATA means that se_storage_offset is valid, but PNFS_SCSI_INVALID_DATA means that se_storage_offset is valid, but
points to invalid un-initialized data. This data MUST NOT be read points to invalid un-initialized data. This data MUST NOT be read
from the disk until it has been initialized. A read request for a from the disk until it has been initialized. A read request for a
PNFS_SCSI_INVALID_DATA extent MUST fill the user buffer with PNFS_SCSI_INVALID_DATA extent MUST fill the user buffer with
zeros, unless the extent is covered by a PNFS_SCSI_READ_DATA zeros, unless the extent is covered by a PNFS_SCSI_READ_DATA
extent of a copy-on-write file system. Write requests MUST write extent of a copy-on-write file system. Write requests MUST write
whole server-sized blocks to the disk; bytes not initialized by whole server-sized blocks to the disk; bytes not initialized by
the user MUST be set to zero. Any write to storage in a the user MUST be set to zero. Any write to storage in a
PNFS_SCSI_INVALID_DATA extent changes the written portion of the PNFS_SCSI_INVALID_DATA extent changes the written portion of the
skipping to change at page 23, line 19 skipping to change at page 23, line 19
conditions that are unlikely to be resolved soon. conditions that are unlikely to be resolved soon.
The error NFS4ERR_RECALLCONFLICT indicates that the server has The error NFS4ERR_RECALLCONFLICT indicates that the server has
recently issued a CB_LAYOUTRECALL to the requesting client, making it recently issued a CB_LAYOUTRECALL to the requesting client, making it
necessary for the client to respond to the recall before processing necessary for the client to respond to the recall before processing
the layout request. A client can wait for that recall to be received the layout request. A client can wait for that recall to be received
and processed or it can retry as for NFS4ERR_TRYLATER, as described and processed or it can retry as for NFS4ERR_TRYLATER, as described
below. below.
The error NFS4ERR_TRYLATER is used to indicate that the server cannot The error NFS4ERR_TRYLATER is used to indicate that the server cannot
immediately grant the layout to the client. This MAY be due to immediately grant the layout to the client. This may be due to
constraints on writable sharing of blocks by multiple clients or to a constraints on writable sharing of blocks by multiple clients or to a
conflict with a recallable lock (e.g. a delegation). In either case, conflict with a recallable lock (e.g. a delegation). In either case,
a reasonable approach for the client is to wait several milliseconds a reasonable approach for the client is to wait several milliseconds
and retry the request. The client SHOULD track the number of and retry the request. The client SHOULD track the number of
retries, and if forward progress is not made, the client SHOULD retries, and if forward progress is not made, the client SHOULD
abandon the attempt to get a layout and perform READ and WRITE abandon the attempt to get a layout and perform READ and WRITE
operations by sending them to the server. operations by sending them to the server.
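The retry-then-fall-back behavior described for NFS4ERR_TRYLATER can be sketched in Python as follows. The "layoutget" callable, the retry limit of 5, and the 5 ms delay are illustrative assumptions rather than values taken from the specification; a real client would plug in its own LAYOUTGET path and policy.

```python
import time
from typing import Any, Callable, Optional, Tuple


def get_layout_or_fall_back(
    layoutget: Callable[[], Tuple[str, Any]],
    max_retries: int = 5,
    delay_s: float = 0.005,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Return a layout, or None to signal I/O through the metadata server."""
    for attempt in range(max_retries + 1):
        status, layout = layoutget()
        if status == "NFS4_OK":
            return layout
        if status != "NFS4ERR_TRYLATER":
            raise RuntimeError(f"LAYOUTGET failed: {status}")
        if attempt < max_retries:
            sleep(delay_s)  # wait several milliseconds before retrying
    # No forward progress: give up on the layout. The caller then sends
    # READ and WRITE operations to the server instead.
    return None
```

Returning None rather than raising keeps the fallback on the normal path, since abandoning the layout is an expected outcome and not an error.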
The error NFS4ERR_LAYOUTUNAVAILABLE MAY be returned by the server if The error NFS4ERR_LAYOUTUNAVAILABLE MAY be returned by the server if
layouts are not supported for the requested file or its containing layouts are not supported for the requested file or its containing
skipping to change at page 27, line 5 skipping to change at page 27, line 5
attached SCSI ([SAS3]) that provide essentially no security attached SCSI ([SAS3]) that provide essentially no security
functionality. At the other extreme, pNFS may be used with storage functionality. At the other extreme, pNFS may be used with storage
protocols such as iSCSI ([RFC7143]) that can provide significant protocols such as iSCSI ([RFC7143]) that can provide significant
security functionality. It is the responsibility of those security functionality. It is the responsibility of those
administering and deploying pNFS with a SCSI storage access protocol administering and deploying pNFS with a SCSI storage access protocol
to ensure that appropriate protection is provided to that protocol to ensure that appropriate protection is provided to that protocol
(physical security is a common means for protocols not based on IP). (physical security is a common means for protocols not based on IP).
In environments where the security requirements for the storage In environments where the security requirements for the storage
protocol cannot be met, pNFS SCSI layouts SHOULD NOT be used. protocol cannot be met, pNFS SCSI layouts SHOULD NOT be used.
When using IP-based storage protocols such as iSCSI, IPSEC should be
used as outlined in [RFC3723] and updated in [RFC7146].
When security is available for a storage protocol, it is generally at When security is available for a storage protocol, it is generally at
a different granularity and with a different notion of identity than a different granularity and with a different notion of identity than
NFSv4 (e.g., NFSv4 controls user access to files, iSCSI controls NFSv4 (e.g., NFSv4 controls user access to files, iSCSI controls
initiator access to volumes). The responsibility for enforcing initiator access to volumes). The responsibility for enforcing
appropriate correspondences between these security layers is placed appropriate correspondences between these security layers is placed
upon the pNFS client. As with the issues in the first paragraph of upon the pNFS client. As with the issues in the first paragraph of
this section, in environments where the security requirements are this section, in environments where the security requirements are
such that client-side protection from access to storage outside of such that client-side protection from access to storage outside of
the layout is not sufficient, pNFS SCSI layouts SHOULD NOT be used. the layout is not sufficient, pNFS SCSI layouts SHOULD NOT be used.
skipping to change at page 27, line 31 skipping to change at page 27, line 34
6. Normative References 6. Normative References
[LEGAL] IETF Trust, "Legal Provisions Relating to IETF Documents", [LEGAL] IETF Trust, "Legal Provisions Relating to IETF Documents",
November 2008, <http://trustee.ietf.org/docs/ November 2008, <http://trustee.ietf.org/docs/
IETF-Trust-License-Policy.pdf>. IETF-Trust-License-Policy.pdf>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", March 1997. Requirement Levels", March 1997.
[RFC3723] Aboba, B., Tseng, J., Walker, J., Rangan, V., and F.
Travostino, "Securing Block Storage Protocols over IP",
RFC 3723, Apr 2004.
[RFC4506] Eisler, M., "XDR: External Data Representation Standard", [RFC4506] Eisler, M., "XDR: External Data Representation Standard",
STD 67, RFC 4506, May 2006. STD 67, RFC 4506, May 2006.
[RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
Protocol", RFC 5661, January 2010. Protocol", RFC 5661, January 2010.
[RFC5662] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [RFC5662] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
External Data Representation Standard (XDR) Description", External Data Representation Standard (XDR) Description",
skipping to change at page 28, line 5 skipping to change at page 28, line 12
"Parallel NFS (pNFS) Block/Volume Layout", RFC 5663, "Parallel NFS (pNFS) Block/Volume Layout", RFC 5663,
January 2010. January 2010.
[RFC6688] Black, D., Ed., Glasgow, J., and S. Faibish, "Parallel NFS [RFC6688] Black, D., Ed., Glasgow, J., and S. Faibish, "Parallel NFS
(pNFS) Block Disk Protection", RFC 6688, July 2012. (pNFS) Block Disk Protection", RFC 6688, July 2012.
[RFC7143] Chadalapaka, M., Meth, K., and D. Black, "Internet Small [RFC7143] Chadalapaka, M., Meth, K., and D. Black, "Internet Small
Computer System Interface (iSCSI) Protocol Computer System Interface (iSCSI) Protocol
(Consolidated)", RFC 7143, April 2014. (Consolidated)", RFC 7143, April 2014.
[RFC7146] Black, D. and P. Koning, "Securing Block Storage Protocols
over IP: RFC 3723 Requirements Update for IPsec v3",
RFC 7146, April 2014.
[SAM-5] INCITS Technical Committee T10, "SCSI Architecture Model - [SAM-5] INCITS Technical Committee T10, "SCSI Architecture Model -
5 (SAM-5)", ANSI INCITS 515-XXXXX, 2016. 5 (SAM-5)", ANSI INCITS 515-2016, 2016.
[SAS3] INCITS Technical Committee T10, "Serial Attached SCSI-3", [SAS3] INCITS Technical Committee T10, "Serial Attached SCSI-3",
ANSI INCITS 519-2014, ISO/IEC 14776-154, 2014. ANSI INCITS 519-2014, ISO/IEC 14776-154, 2014.
[SBC3] INCITS Technical Committee T10, "SCSI Block Commands-3", [SBC3] INCITS Technical Committee T10, "SCSI Block Commands-3",
ANSI INCITS 514-2014, ISO/IEC 14776-323, 2014. ANSI INCITS 514-2014, ISO/IEC 14776-323, 2014.
[SPC4] INCITS Technical Committee T10, "SCSI Primary Commands-4", [SPC4] INCITS Technical Committee T10, "SCSI Primary Commands-4",
ANSI INCITS 513-2015, 2015. ANSI INCITS 513-2015, 2015.
 End of changes. 12 change blocks. 
12 lines changed or deleted 22 lines changed or added
