draft-ietf-nfsv4-minorversion1-24.txt   draft-ietf-nfsv4-minorversion1-25.txt 
NFSv4 S. Shepler NFSv4 S. Shepler
Internet-Draft M. Eisler Internet-Draft M. Eisler
Intended status: Standards Track D. Noveck Intended status: Standards Track D. Noveck
Expires: February 7, 2009 Editors Expires: February 20, 2009 Editors
Aug 06, 2008 August 19, 2008
NFS Version 4 Minor Version 1 NFS Version 4 Minor Version 1
draft-ietf-nfsv4-minorversion1-24.txt draft-ietf-nfsv4-minorversion1-25.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on February 7, 2009. This Internet-Draft will expire on February 20, 2009.
Abstract Abstract
This Internet-Draft describes NFS version 4 minor version one, This Internet-Draft describes NFS version 4 minor version one,
including features retained from the base protocol and protocol including features retained from the base protocol and protocol
extensions made subsequently. Major extensions introduced in NFS extensions made subsequently. Major extensions introduced in NFS
version 4 minor version one include: Sessions, Directory Delegations, version 4 minor version one include: Sessions, Directory Delegations,
and parallel NFS (pNFS). and parallel NFS (pNFS).
Requirements Language Requirements Language
skipping to change at page 2, line 18 skipping to change at page 2, line 18
1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11 1.1. The NFS Version 4 Minor Version 1 Protocol . . . . . . . 11
1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11 1.2. Scope of this Document . . . . . . . . . . . . . . . . . 11
1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11 1.3. NFSv4 Goals . . . . . . . . . . . . . . . . . . . . . . 11
1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12 1.4. NFSv4.1 Goals . . . . . . . . . . . . . . . . . . . . . 12
1.5. General Definitions . . . . . . . . . . . . . . . . . . 12 1.5. General Definitions . . . . . . . . . . . . . . . . . . 12
1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15 1.6. Overview of NFSv4.1 Features . . . . . . . . . . . . . . 15
1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15 1.6.1. RPC and Security . . . . . . . . . . . . . . . . . . 15
1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 15 1.6.2. Protocol Structure . . . . . . . . . . . . . . . . . 15
1.6.3. File System Model . . . . . . . . . . . . . . . . . 16 1.6.3. File System Model . . . . . . . . . . . . . . . . . 16
1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18 1.6.4. Locking Facilities . . . . . . . . . . . . . . . . . 18
1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 18 1.7. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 19
2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 19 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 20
2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 20
2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 20
2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 23
2.4. Client Identifiers and Client Owners . . . . . . . . . . 24 2.4. Client Identifiers and Client Owners . . . . . . . . . . 24
2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 27 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 27
2.4.2. Server Release of Client ID . . . . . . . . . . . . 28 2.4.2. Server Release of Client ID . . . . . . . . . . . . 28
2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28 2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 28
2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 29 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 29
2.6. Security Service Negotiation . . . . . . . . . . . . . . 30 2.6. Security Service Negotiation . . . . . . . . . . . . . . 30
2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 30 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 30
2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 30 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 31
2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 31
2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 35 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 35
2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 38
2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 38
2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 38
2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 38
2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 38 2.9. Transport Layers . . . . . . . . . . . . . . . . . . . . 39
2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 38 2.9.1. REQUIRED and RECOMMENDED Properties of Transports . 39
2.9.2. Client and Server Transport Behavior . . . . . . . . 39 2.9.2. Client and Server Transport Behavior . . . . . . . . 39
2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 41
2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 41
2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 41
2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 42
2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 44
2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 45 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 45
2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 48 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 48
2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 61 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 61
2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 63 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 64
2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 69 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 69
2.10.9. Session Mechanics - Steady State . . . . . . . . . . 73 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 73
2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 75 2.10.10. Session Inactivity Timer . . . . . . . . . . . . . . 75
2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 75 2.10.11. Session Mechanics - Recovery . . . . . . . . . . . . 75
2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 78 2.10.12. Parallel NFS and Sessions . . . . . . . . . . . . . 79
3. Protocol Constants and Data Types . . . . . . . . . . . . . . 78 3. Protocol Constants and Data Types . . . . . . . . . . . . . . 79
3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 79 3.1. Basic Constants . . . . . . . . . . . . . . . . . . . . 79
3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 79 3.2. Basic Data Types . . . . . . . . . . . . . . . . . . . . 80
3.3. Structured Data Types . . . . . . . . . . . . . . . . . 81 3.3. Structured Data Types . . . . . . . . . . . . . . . . . 82
4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 90 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 90 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 90
4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 91 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 91
4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 91 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 91
4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 91 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 91
4.2.1. General Properties of a Filehandle . . . . . . . . . 92 4.2.1. General Properties of a Filehandle . . . . . . . . . 92
4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 93 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 93
4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 93 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 93
4.3. One Method of Constructing a Volatile Filehandle . . . . 94 4.3. One Method of Constructing a Volatile Filehandle . . . . 94
4.4. Client Recovery from Filehandle Expiration . . . . . . . 95 4.4. Client Recovery from Filehandle Expiration . . . . . . . 95
skipping to change at page 4, line 43 skipping to change at page 4, line 43
9. File Locking and Share Reservations . . . . . . . . . . . . . 174 9. File Locking and Share Reservations . . . . . . . . . . . . . 174
9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 174 9.1. Opens and Byte-Range Locks . . . . . . . . . . . . . . . 174
9.1.1. State-owner Definition . . . . . . . . . . . . . . . 174 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 174
9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 175 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 175
9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 178 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 178
9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 178 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 178
9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 179 9.4. Stateid Seqid Values and Byte-Range Locks . . . . . . . 179
9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 179 9.5. Issues with Multiple Open-Owners . . . . . . . . . . . . 179
9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 180 9.6. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 180
9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 181 9.7. Share Reservations . . . . . . . . . . . . . . . . . . . 181
9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 181 9.8. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 182
9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 182 9.9. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 182
9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 183 9.10. Parallel OPENs . . . . . . . . . . . . . . . . . . . . . 183
9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 184 9.11. Reclaim of Open and Byte-Range Locks . . . . . . . . . . 184
10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 184 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 184
10.1. Performance Challenges for Client-Side Caching . . . . . 185 10.1. Performance Challenges for Client-Side Caching . . . . . 185
10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 186 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 186
10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 188 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 188
10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 190 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 190
10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 190 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 190
10.3.2. Data Caching and File Locking . . . . . . . . . . . 191 10.3.2. Data Caching and File Locking . . . . . . . . . . . 191
skipping to change at page 10, line 6 skipping to change at page 10, line 6
Delegation Wants . . . . . . . . . . . . . . . . . . . . 576 Delegation Wants . . . . . . . . . . . . . . . . . . . . 576
20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible
lock availability . . . . . . . . . . . . . . . . . . . 577 lock availability . . . . . . . . . . . . . . . . . . . 577
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify device ID
changes . . . . . . . . . . . . . . . . . . . . . . . . 579 changes . . . . . . . . . . . . . . . . . . . . . . . . 579
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback
Operation . . . . . . . . . . . . . . . . . . . . . . . 581 Operation . . . . . . . . . . . . . . . . . . . . . . . 581
21. Security Considerations . . . . . . . . . . . . . . . . . . . 581 21. Security Considerations . . . . . . . . . . . . . . . . . . . 581
22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 583 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 583
22.1. Named Attribute Definitions . . . . . . . . . . . . . . 583 22.1. Named Attribute Definitions . . . . . . . . . . . . . . 583
22.2. ONC RPC Network Identifiers (netids) . . . . . . . . . . 583 22.1.1. Initial Registry . . . . . . . . . . . . . . . . . . 584
22.3. Defining New Notifications . . . . . . . . . . . . . . . 584 22.1.2. Updating Registrations . . . . . . . . . . . . . . . 584
22.4. Defining New Layout Types . . . . . . . . . . . . . . . 584 22.2. Device ID Notifications . . . . . . . . . . . . . . . . 584
22.5. Path Variable Definitions . . . . . . . . . . . . . . . 586 22.2.1. Initial Registry . . . . . . . . . . . . . . . . . . 585
22.5.1. Path Variable Values . . . . . . . . . . . . . . . . 586 22.2.2. Updating Registrations . . . . . . . . . . . . . . . 585
22.5.2. Path Variable Names . . . . . . . . . . . . . . . . 586 22.3. Object Recall Types . . . . . . . . . . . . . . . . . . 585
23. References . . . . . . . . . . . . . . . . . . . . . . . . . 586 22.3.1. Initial Registry . . . . . . . . . . . . . . . . . . 587
23.1. Normative References . . . . . . . . . . . . . . . . . . 586 22.3.2. Updating Registrations . . . . . . . . . . . . . . . 587
23.2. Informative References . . . . . . . . . . . . . . . . . 588 22.4. Layout Types . . . . . . . . . . . . . . . . . . . . . . 587
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 590 22.4.1. Initial Registry . . . . . . . . . . . . . . . . . . 588
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 592 22.4.2. Updating Registrations . . . . . . . . . . . . . . . 588
Intellectual Property and Copyright Statements . . . . . . . . . 593 22.4.3. Guidelines for Writing Layout Type Specifications . 588
22.5. Path Variable Definitions . . . . . . . . . . . . . . . 590
22.5.1. Path Variables Registry . . . . . . . . . . . . . . 590
22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable . . . . 592
22.5.3. Values for the ${ietf.org:OS_TYPE} Variable . . . . 592
23. References . . . . . . . . . . . . . . . . . . . . . . . . . 593
23.1. Normative References . . . . . . . . . . . . . . . . . . 593
23.2. Informative References . . . . . . . . . . . . . . . . . 595
Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 596
Appendix B. RFC Editor Notes . . . . . . . . . . . . . . . . . . 598
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 599
Intellectual Property and Copyright Statements . . . . . . . . . 600
1. Introduction 1. Introduction
1.1. The NFS Version 4 Minor Version 1 Protocol 1.1. The NFS Version 4 Minor Version 1 Protocol
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0 is described in [21]. It generally follows the version, NFSv4.0 is described in [20]. It generally follows the
guidelines for minor versioning model listed in Section 10 of RFC guidelines for minor versioning model listed in Section 10 of RFC
3530. However, it diverges from guidelines 11 ("a client and server 3530. However, it diverges from guidelines 11 ("a client and server
that supports minor version X must support minor versions 0 through that supports minor version X must support minor versions 0 through
X-1"), and 12 ("no features may be introduced as mandatory in a minor X-1"), and 12 ("no features may be introduced as mandatory in a minor
version"). These divergences are due to the introduction of the version"). These divergences are due to the introduction of the
sessions model for managing non-idempotent operations and the sessions model for managing non-idempotent operations and the
RECLAIM_COMPLETE operation. These two new features are RECLAIM_COMPLETE operation. These two new features are
infrastructural in nature and simplify implementation of existing and infrastructural in nature and simplify implementation of existing and
other new features. Making them anything but REQUIRED would add other new features. Making them anything but REQUIRED would add
undue complexity to protocol definition and implementation. NFSv4.1 undue complexity to protocol definition and implementation. NFSv4.1
skipping to change at page 11, line 45 skipping to change at page 11, line 45
o describe the NFSv4.0 protocol, except where needed to contrast o describe the NFSv4.0 protocol, except where needed to contrast
with NFSv4.1. with NFSv4.1.
o modify the specification of the NFSv4.0 protocol. o modify the specification of the NFSv4.0 protocol.
o clarify the NFSv4.0 protocol. o clarify the NFSv4.0 protocol.
1.3. NFSv4 Goals 1.3. NFSv4 Goals
The NFSv4 protocol is a further revision of the NFS protocol defined The NFSv4 protocol is a further revision of the NFS protocol defined
already by NFSv3 [22]. It retains the essential characteristics of already by NFSv3 [21]. It retains the essential characteristics of
previous versions: easy recovery; independence of transport previous versions: easy recovery; independence of transport
protocols, operating systems and file systems; simplicity; and good protocols, operating systems and file systems; simplicity; and good
performance. NFSv4 has the following goals: performance. NFSv4 has the following goals:
o Improved access and good performance on the Internet. o Improved access and good performance on the Internet.
The protocol is designed to transit firewalls easily, perform well The protocol is designed to transit firewalls easily, perform well
where latency is high and bandwidth is low, and scale to very where latency is high and bandwidth is low, and scale to very
large numbers of clients per server. large numbers of clients per server.
skipping to change at page 13, line 33 skipping to change at page 13, line 33
node. node.
Client ID A 64-bit quantity used as a unique, short-hand reference Client ID A 64-bit quantity used as a unique, short-hand reference
to a client supplied Verifier and client owner. The server is to a client supplied Verifier and client owner. The server is
responsible for supplying the client ID. responsible for supplying the client ID.
Client Owner The client owner is a unique string, opaque to the Client Owner The client owner is a unique string, opaque to the
server, which identifies a client. Multiple network connections server, which identifies a client. Multiple network connections
and source network addresses originating from those connections and source network addresses originating from those connections
may share a client owner. The server is expected to treat may share a client owner. The server is expected to treat
requests from connnections with the same client owner as coming requests from connections with the same client owner as coming
from the same client. from the same client.
File System The collection of objects on a server (as identified by File System The collection of objects on a server (as identified by
the major identifier of a Server Owner, which is defined later in the major identifier of a Server Owner, which is defined later in
this section), that share the same fsid attribute (see this section), that share the same fsid attribute (see
Section 5.8.1.9). Section 5.8.1.9).
Lease An interval of time defined by the server for which the client Lease An interval of time defined by the server for which the client
is irrevocably granted a lock. At the end of a lease period the is irrevocably granted a lock. At the end of a lease period the
lock may be revoked if the lease has not been extended. The lock lock may be revoked if the lease has not been extended. The lock
skipping to change at page 14, line 20 skipping to change at page 14, line 20
client access to a set of file systems and is identified by a client access to a set of file systems and is identified by a
Server owner. A server can span multiple network addresses. Server owner. A server can span multiple network addresses.
Server Owner The "Server Owner" identifies the server to the client. Server Owner The "Server Owner" identifies the server to the client.
The server owner consists of a major and minor identifier. When The server owner consists of a major and minor identifier. When
the client has two connections each to a peer with the same major the client has two connections each to a peer with the same major
identifier, the client assumes both peers are the same server (the identifier, the client assumes both peers are the same server (the
server namespace is the same via each connection), and assumes that server namespace is the same via each connection), and assumes that
lock state is sharable across both connections. When each peer lock state is sharable across both connections. When each peer
has both the same major and minor identifier, the client assumes has both the same major and minor identifier, the client assumes
each connection might be associatable with the same session. each connection might be associable with the same session.
Stable Storage NFSv4.1 servers must be able to recover without data Stable Storage NFSv4.1 servers must be able to recover without data
loss from multiple power failures (including cascading power loss from multiple power failures (including cascading power
failures, that is, several power failures in quick succession), failures, that is, several power failures in quick succession),
operating system failures, and hardware failure of components operating system failures, and hardware failure of components
other than the storage medium itself (for example, disk, other than the storage medium itself (for example, disk,
nonvolatile RAM). nonvolatile RAM).
Some examples of stable storage that are allowable for an NFS Some examples of stable storage that are allowable for an NFS
server include: server include:
skipping to change at page 17, line 9 skipping to change at page 17, line 9
which are then used to identify objects in subsequent operations. which are then used to identify objects in subsequent operations.
The NFSv4.1 protocol provides support for persistent filehandles, The NFSv4.1 protocol provides support for persistent filehandles,
guaranteed to be valid for the lifetime of the file system object guaranteed to be valid for the lifetime of the file system object
designated. In addition it provides support to servers to provide designated. In addition it provides support to servers to provide
filehandles with more limited validity guarantees, called volatile filehandles with more limited validity guarantees, called volatile
filehandles. filehandles.
1.6.3.2. File Attributes 1.6.3.2. File Attributes
The NFSv4.1 protocol has a rich and extensible attribute structure, The NFSv4.1 protocol has a rich and extensible file object attribute
which is divided into REQUIRED, RECOMMENDED, and named attributes. structure, which is divided into REQUIRED, RECOMMENDED, and named
attributes (see Section 5).
The acl, sacl, and dacl attributes compose a set of RECOMMENDED file Several (but not all) of the REQUIRED attributes are derived from the
attributes that make up the Access Control List (ACL) of a file attributes of NFSv3 (see definition of the fattr3 data type in [21]).
(Section 6). These attributes provide for directory and file access An example of a REQUIRED attribute is the file object's type
control beyond the model used in NFSv3. The ACL definition allows (Section 5.8.1.2) so that regular files can be distinguished from
for specification of specific sets of permissions for individual directories (also known as folders in some operating environments)
users and groups. In addition, ACL inheritance allows propagation of and other types of objects. REQUIRED attributes are discussed in
Section 5.1.
An example of three RECOMMENDED attributes are acl, sacl, and dacl.
These attributes define an Access Control List (ACL) on a file object
(Section 6). An ACL provides directory and file access control
beyond the model used in NFSv3. The ACL definition allows for
specification of specific sets of permissions for individual users
and groups. In addition, ACL inheritance allows propagation of
access permissions and restriction down a directory tree as file access permissions and restriction down a directory tree as file
system objects are created. system objects are created. RECOMMENDED attributes are discussed in
Section 5.2.
A named attribute is an opaque byte stream that is associated with a A named attribute is an opaque byte stream that is associated with a
directory or file and referred to by a string name. Named attributes directory or file and referred to by a string name. Named attributes
are meant to be used by client applications as a method to associate are meant to be used by client applications as a method to associate
application-specific data with a regular file or directory. NFSv4.1 application-specific data with a regular file or directory. NFSv4.1
modifies named attributes relative to NFSv4.0 by tightening the modifies named attributes relative to NFSv4.0 by tightening the
allowed operations in order to prevent the development of non- allowed operations in order to prevent the development of non-
interoperable implementation. See Section 5.3 for details. interoperable implementations. Named attributes are discussed in
Section 5.3.
1.6.3.3. Multi-server Namespace 1.6.3.3. Multi-server Namespace
NFSv4.1 contains a number of features to allow implementation of NFSv4.1 contains a number of features to allow implementation of
namespaces that cross server boundaries and that allow and facilitate namespaces that cross server boundaries and that allow and facilitate
a non-disruptive transfer of support for individual file systems a non-disruptive transfer of support for individual file systems
between servers. They are all based upon attributes that allow one between servers. They are all based upon attributes that allow one
file system to specify alternate or new locations for that file file system to specify alternate or new locations for that file
system. system.
skipping to change at page 21, line 28 skipping to change at page 21, line 35
Although GSS-API has an authentication service distinct from its Although GSS-API has an authentication service distinct from its
privacy and integrity services, GSS-API's authentication service is privacy and integrity services, GSS-API's authentication service is
not used for RPCSEC_GSS's authentication service. Instead, each RPC not used for RPCSEC_GSS's authentication service. Instead, each RPC
request and response header is integrity protected with the GSS-API request and response header is integrity protected with the GSS-API
integrity service, and this allows RPCSEC_GSS to offer per-RPC integrity service, and this allows RPCSEC_GSS to offer per-RPC
authentication and identity. See [4] for more information. authentication and identity. See [4] for more information.
NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and NFSv4.1 clients and servers MUST support RPCSEC_GSS's integrity and
authentication service. NFSv4.1 servers MUST support RPCSEC_GSS's authentication service. NFSv4.1 servers MUST support RPCSEC_GSS's
privacy service. privacy service. NFSv4.1 clients SHOULD support RPCSEC_GSS's privacy
service.
2.2.1.1.1.2. Security mechanisms for NFSv4.1 2.2.1.1.1.2. Security mechanisms for NFSv4.1
RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide RPCSEC_GSS, via GSS-API, normalizes access to mechanisms that provide
security services. Therefore NFSv4.1 clients and servers MUST security services. Therefore NFSv4.1 clients and servers MUST
support three security mechanisms: Kerberos V5, SPKM-3, and LIPKEY. support three security mechanisms: Kerberos V5, SPKM-3, and LIPKEY.
The use of RPCSEC_GSS requires selection of: mechanism, quality of The use of RPCSEC_GSS requires selection of: mechanism, quality of
protection (QOP), and service (authentication, integrity, privacy). protection (QOP), and service (authentication, integrity, privacy).
For the mandated security mechanisms, NFSv4.1 specifies that a QOP of For the mandated security mechanisms, NFSv4.1 specifies that a QOP of
skipping to change at page 22, line 24 skipping to change at page 22, line 31
------------------------------------------------------------------ ------------------------------------------------------------------
390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes
390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes
Note that the number and name of the pseudo flavor is presented here Note that the number and name of the pseudo flavor is presented here
as a mapping aid to the implementor. Because the NFSv4.1 protocol as a mapping aid to the implementor. Because the NFSv4.1 protocol
includes a method to negotiate security and it understands the GSS- includes a method to negotiate security and it understands the GSS-
API mechanism, the pseudo flavor is not needed. The pseudo flavor is API mechanism, the pseudo flavor is not needed. The pseudo flavor is
needed for the NFSv3 since the security negotiation is done via the needed for the NFSv3 since the security negotiation is done via the
MOUNT protocol as described in [23]. MOUNT protocol as described in [22].
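As a mapping aid only, the three Kerberos V5 rows shown above can be held in a small lookup table. This is a minimal sketch, not part of either draft: the names PSEUDO_FLAVORS and flavor_for are hypothetical, and columns 5 and 6 are omitted because their headings are not visible in this excerpt.

```python
from typing import Dict, Optional, Tuple

KRB5_OID = "1.2.840.113554.1.2.2"  # mechanism OID from column 3 of the table

# pseudo flavor number -> (name, mechanism OID, RPCSEC_GSS service)
# Hypothetical container; only the rows visible in the excerpt are listed.
PSEUDO_FLAVORS: Dict[int, Tuple[str, str, str]] = {
    390003: ("krb5", KRB5_OID, "rpc_gss_svc_none"),
    390004: ("krb5i", KRB5_OID, "rpc_gss_svc_integrity"),
    390005: ("krb5p", KRB5_OID, "rpc_gss_svc_privacy"),
}


def flavor_for(oid: str, service: str) -> Optional[int]:
    """Return the pseudo flavor number for a (mechanism, service) pair,
    or None when the pair is not in the table."""
    for number, (_name, flavor_oid, flavor_service) in PSEUDO_FLAVORS.items():
        if flavor_oid == oid and flavor_service == service:
            return number
    return None
```

An NFSv4.1 implementation negotiates on the mechanism and service directly, so a table like this would matter mainly when sharing code with an NFSv3 MOUNT implementation.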
2.2.1.1.1.2.2. LIPKEY 2.2.1.1.1.2.2. LIPKEY
The LIPKEY V5 GSS-API mechanism as described in [6] MUST be The LIPKEY V5 GSS-API mechanism as described in [6] MUST be
implemented with the RPCSEC_GSS services as specified in the implemented with the RPCSEC_GSS services as specified in the
following table: following table:
1 2 3 4 5 6 1 2 3 4 5 6
------------------------------------------------------------------ ------------------------------------------------------------------
390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes 390006 lipkey 1.3.6.1.5.5.9 rpc_gss_svc_none yes yes
skipping to change at page 23, line 49 skipping to change at page 24, line 6
With the use of the COMPOUND procedure, the client is able to build With the use of the COMPOUND procedure, the client is able to build
simple or complex requests. These COMPOUND requests allow for a simple or complex requests. These COMPOUND requests allow for a
reduction in the number of RPCs needed for logical file system reduction in the number of RPCs needed for logical file system
operations. For example, multi-component lookup requests can be operations. For example, multi-component lookup requests can be
constructed by combining multiple LOOKUP operations. Those can be constructed by combining multiple LOOKUP operations. Those can be
further combined with operations such as GETATTR, READDIR, or OPEN further combined with operations such as GETATTR, READDIR, or OPEN
plus READ to do more complicated sets of operation without incurring plus READ to do more complicated sets of operation without incurring
additional latency. additional latency.
NFSv4.1 also contains a considerable set of callback operations in NFSv4.1 also contains a considerable set of callback operations in
which the server makes an RPC directed at the client. Callback RPC's which the server makes an RPC directed at the client. Callback RPCs
have a similar structure to that of the normal server requests. In have a similar structure to that of the normal server requests. In
all minor versions of the NFSv4 protocol there are two callback RPC all minor versions of the NFSv4 protocol there are two callback RPC
procedures, CB_NULL and CB_COMPOUND. The CB_COMPOUND procedure is procedures, CB_NULL and CB_COMPOUND. The CB_COMPOUND procedure is
defined in an analogous fashion to that of COMPOUND with its own set defined in an analogous fashion to that of COMPOUND with its own set
of callback operations. of callback operations.
The addition of new server and callback operations within the The addition of new server and callback operations within the
COMPOUND and CB_COMPOUND request framework provides a means of COMPOUND and CB_COMPOUND request framework provides a means of
extending the protocol in subsequent minor versions. extending the protocol in subsequent minor versions.
skipping to change at page 25, line 47 skipping to change at page 26, line 5
same string. The implementor is cautioned from an approach that same string. The implementor is cautioned from an approach that
requires the string to be recorded in a local file because this requires the string to be recorded in a local file because this
precludes the use of the implementation in an environment where precludes the use of the implementation in an environment where
there is no local disk and all file access is from an NFSv4.1 there is no local disk and all file access is from an NFSv4.1
server. server.
o The string should be the same for each server network address that o The string should be the same for each server network address that
the client accesses. This way, if a server has multiple the client accesses. This way, if a server has multiple
interfaces, the client can trunk traffic over multiple network interfaces, the client can trunk traffic over multiple network
paths as described in Section 2.10.4. (Note: the precise opposite paths as described in Section 2.10.4. (Note: the precise opposite
was advised in the NFSv4.0 specification [21].) was advised in the NFSv4.0 specification [20].)
o The algorithm for generating the string should not assume that the o The algorithm for generating the string should not assume that the
client's network address will not change, unless the client client's network address will not change, unless the client
implementation knows it is using statically assigned network implementation knows it is using statically assigned network
addresses. This includes changes between client incarnations and addresses. This includes changes between client incarnations and
even changes while the client is still running in its current even changes while the client is still running in its current
incarnation. Thus with dynamic address assignment, if the client incarnation. Thus with dynamic address assignment, if the client
includes just the client's network address in the co_ownerid includes just the client's network address in the co_ownerid
string, there is a real risk that after the client gives up the string, there is a real risk that after the client gives up the
network address, another client, using a similar algorithm for network address, another client, using a similar algorithm for
skipping to change at page 27, line 47 skipping to change at page 28, line 4
See the descriptions of EXCHANGE_ID (Section 18.35) and See the descriptions of EXCHANGE_ID (Section 18.35) and
CREATE_SESSION (Section 18.36) for a complete specification of these CREATE_SESSION (Section 18.36) for a complete specification of these
operations. operations.
2.4.1. Upgrade from NFSv4.0 to NFSv4.1 2.4.1. Upgrade from NFSv4.0 to NFSv4.1
To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established
using the SETCLIENTID operation of NFSv4.0. A server that does so using the SETCLIENTID operation of NFSv4.0. A server that does so
will allow an upgraded client to avoid waiting until the lease (i.e. will allow an upgraded client to avoid waiting until the lease (i.e.
the lease established by the NFSv4.0 instance client) expires. This the lease established by the NFSv4.0 instance client) expires. This
requires the client_owner4 be constructed the same way as the requires the client_owner4 be constructed the same way as the
nfs_client_id4. If the latter's contents included the server's nfs_client_id4. If the latter's contents included the server's
network address (per the recommendations of the NFSv4.0 specification network address (per the recommendations of the NFSv4.0 specification
[21]), and the NFSv4.1 client does not wish to use a client ID that [20]), and the NFSv4.1 client does not wish to use a client ID that
prevents trunking, it should send two EXCHANGE_ID operations. The prevents trunking, it should send two EXCHANGE_ID operations. The
first EXCHANGE_ID will have a client_owner4 equal to the first EXCHANGE_ID will have a client_owner4 equal to the
nfs_client_id4. This will clear the state created by the NFSv4.0 nfs_client_id4. This will clear the state created by the NFSv4.0
client. The second EXCHANGE_ID will not have the server's network client. The second EXCHANGE_ID will not have the server's network
address. The state created for the second EXCHANGE_ID will not have address. The state created for the second EXCHANGE_ID will not have
to wait for lease expiration, because there will be no state to to wait for lease expiration, because there will be no state to
expire. expire.
2.4.2. Server Release of Client ID 2.4.2. Server Release of Client ID
skipping to change at page 35, line 24 skipping to change at page 35, line 34
operation will fail with NFS4ERR_WRONGSEC. After a SECINFO_NO_NAME operation will fail with NFS4ERR_WRONGSEC. After a SECINFO_NO_NAME
request, the client sends SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH, request, the client sends SEQUENCE, PUTFH bFH, SAVEFH, PUTFH aFH,
RENAME "c" "d", using credentials acceptable to aFH's security RENAME "c" "d", using credentials acceptable to aFH's security
policy, but not bFH's policy. The server returns NFS4ERR_WRONGSEC on policy, but not bFH's policy. The server returns NFS4ERR_WRONGSEC on
the RENAME operation. the RENAME operation.
To prevent a client from an endless sequence of a request containing To prevent a client from an endless sequence of a request containing
LINK or RENAME, followed by a request containing SECINFO_NO_NAME, the LINK or RENAME, followed by a request containing SECINFO_NO_NAME, the
server MUST detect when the security policies of the current and server MUST detect when the security policies of the current and
saved filehandles have no mutually acceptable security tuple, and saved filehandles have no mutually acceptable security tuple, and
MUST NOT NFS4ERR_WRONGSEC in that situation. Instead the server MUST MUST NOT return NFS4ERR_WRONGSEC in that situation. Instead the
return NFS4ERR_XDEV. server MUST return NFS4ERR_XDEV.
Thus while a server MAY return NFS4ERR_WRONGSEC from LINK and RENAME, Thus while a server MAY return NFS4ERR_WRONGSEC from LINK and RENAME,
the server implementor may reasonably decide the consequences are not the server implementor may reasonably decide the consequences are not
worth the security benefits, and so allow the security policy of the worth the security benefits, and so allow the security policy of the
current filehandle to override that of the saved filehandle. current filehandle to override that of the saved filehandle.
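The error selection described above for LINK and RENAME can be sketched as follows. This is an illustration under stated assumptions, not text from either draft: a security tuple is modeled as (mechanism OID, QOP, service), each policy is a set of such tuples, the function name is hypothetical, and the server is assumed to enforce the saved filehandle's policy rather than let the current filehandle's policy override it.

```python
from typing import FrozenSet, Iterable, Tuple

SecurityTuple = Tuple[str, int, str]  # (mechanism OID, QOP, service)


def link_rename_security_status(
    current_policy: Iterable[SecurityTuple],
    saved_policy: Iterable[SecurityTuple],
    request_tuple: SecurityTuple,
) -> str:
    """Pick the status a policy-enforcing server returns (hypothetical helper)."""
    current: FrozenSet[SecurityTuple] = frozenset(current_policy)
    saved: FrozenSet[SecurityTuple] = frozenset(saved_policy)
    # No mutually acceptable tuple: NFS4ERR_WRONGSEC would send the client
    # into an endless SECINFO_NO_NAME / retry loop, so NFS4ERR_XDEV is used.
    if not (current & saved):
        return "NFS4ERR_XDEV"
    # A common tuple exists; the request must actually be using one of them.
    if request_tuple in current and request_tuple in saved:
        return "NFS4_OK"
    return "NFS4ERR_WRONGSEC"
```

The empty-intersection check comes first so that a client never receives NFS4ERR_WRONGSEC when no credential choice could satisfy both filehandles.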
2.7. Minor Versioning 2.7. Minor Versioning
To address the requirement of an NFS protocol that can evolve as the To address the requirement of an NFS protocol that can evolve as the
need arises, the NFSv4.1 protocol contains the rules and framework to need arises, the NFSv4.1 protocol contains the rules and framework to
allow for future minor changes or versioning. allow for future minor changes or versioning.
The base assumption with respect to minor versioning is that any The base assumption with respect to minor versioning is that any
future accepted minor version must follow the IETF process and be future accepted minor version must follow the IETF process and be
documented in a standards track RFC. Therefore, each minor version documented in a standards track RFC. Therefore, each minor version
number will correspond to an RFC. Minor version zero of the NFSv4 number will correspond to one or more new RFCs. Minor version zero
protocol is represented by [21], and minor version one is represented of the NFSv4 protocol is represented by [20], and minor version one
by this document [[Comment.1: RFC Editor: change "document" to "RFC" is represented by this document [[Comment.1: RFC Editor: change
when we publish]]. The COMPOUND and CB_COMPOUND procedures support "document" to "RFC" when we publish]]. The COMPOUND and CB_COMPOUND
the encoding of the minor version being requested by the client. procedures support the encoding of the minor version being requested
by the client.
The following items represent the basic rules for the development of The following items represent the basic rules for the development of
minor versions. Note that a future minor version may decide to minor versions. Note that a future minor version may decide to
modify or add to the following rules as part of the minor version modify or add to the following rules as part of the minor version
definition. definition.
1. Procedures are not added or deleted 1. Procedures are not added or deleted
To maintain the general RPC model, NFSv4 minor versions will not To maintain the general RPC model, NFSv4 minor versions will not
add to or delete procedures from the NFS program. add to or delete procedures from the NFS program.
skipping to change at page 38, line 48 skipping to change at page 39, line 12
NFSv4.1 provides alarm control on a per file object basis, via the NFSv4.1 provides alarm control on a per file object basis, via the
acl and sacl attributes as described in Section 6. Alarms may serve acl and sacl attributes as described in Section 6. Alarms may serve
as the basis for intrusion detection. It is outside the scope of as the basis for intrusion detection. It is outside the scope of
this specification to specify heuristics for detecting intrusion via this specification to specify heuristics for detecting intrusion via
alarms. alarms.
2.9. Transport Layers 2.9. Transport Layers
2.9.1. REQUIRED and RECOMMENDED Properties of Transports 2.9.1. REQUIRED and RECOMMENDED Properties of Transports
NFSv4.1 works over RDMA and non-RDMA_based transports with the NFSv4.1 works over RDMA and non-RDMA-based transports with the
following attributes: following attributes:
o The transport supports reliable delivery of data, which NFSv4.1 o The transport supports reliable delivery of data, which NFSv4.1
requires but neither NFSv4.1 nor RPC has facilities for ensuring. requires but neither NFSv4.1 nor RPC has facilities for ensuring.
[23]
[24]
o The transport delivers data in the order it was sent. Ordered o The transport delivers data in the order it was sent. Ordered
delivery simplifies detection of transmit errors, and simplifies delivery simplifies detection of transmit errors, and simplifies
the sending of arbitrary sized requests and responses, via the the sending of arbitrary sized requests and responses, via the
record marking protocol [3]. record marking protocol [3].
Where an NFSv4.1 implementation supports operation over the IP Where an NFSv4.1 implementation supports operation over the IP
network protocol, any transport used between NFS and IP MUST be among network protocol, any transport used between NFS and IP MUST be among
the IETF-approved congestion control transport protocols. At the the IETF-approved congestion control transport protocols. At the
time this document was written, the only two transports that had the time this document was written, the only two transports that had the
above attributes were TCP and SCTP. To enhance the possibilities for above attributes were TCP and SCTP. To enhance the possibilities for
interoperability, an NFSv4.1 implementation MUST support operation interoperability, an NFSv4.1 implementation MUST support operation
over the TCP transport protocol. over the TCP transport protocol.
Even if NFSv4.1 is used over a non-IP network protocol, it is Even if NFSv4.1 is used over a non-IP network protocol, it is
RECOMMENDED that the transport support congestion control. RECOMMENDED that the transport support congestion control.
It is permissible for a connectionless transport to be used under It is permissible for a connectionless transport to be used under
NFSv4.1, however reliable and in-order delivery of data by the NFSv4.1, however reliable and in-order delivery of data combined with
connectionless transport is REQUIRED. NFSv4.1 assumes that a client congestion control by the connectionless transport is REQUIRED.
transport address and server transport address used to send data over NFSv4.1 assumes that a client transport address and server transport
a transport together constitute a connection, even if the underlying address used to send data over a transport together constitute a
transport eschews the concept of a connection. connection, even if the underlying transport eschews the concept of a
connection.
2.9.2. Client and Server Transport Behavior 2.9.2. Client and Server Transport Behavior
If a connection-oriented transport (e.g. TCP) is used, the client If a connection-oriented transport (e.g. TCP) is used, the client
and server SHOULD use long lived connections for at least three and server SHOULD use long lived connections for at least three
reasons: reasons:
1. This will prevent the weakening of the transport's congestion 1. This will prevent the weakening of the transport's congestion
control mechanisms via short lived connections. control mechanisms via short lived connections.
skipping to change at page 41, line 8 skipping to change at page 41, line 21
contents must not be blindly used when replies are sent from it, contents must not be blindly used when replies are sent from it,
and credit information appropriate to the channel must be and credit information appropriate to the channel must be
refreshed by the RPC layer. refreshed by the RPC layer.
In addition, as described in Section 2.10.5.2, while a session is In addition, as described in Section 2.10.5.2, while a session is
active, the NFSv4.1 requester MUST NOT stop waiting for a reply. active, the NFSv4.1 requester MUST NOT stop waiting for a reply.
2.9.3. Ports 2.9.3. Ports
Historically, NFSv3 servers have listened over TCP port 2049. The Historically, NFSv3 servers have listened over TCP port 2049. The
registered port 2049 [25] for the NFS protocol should be the default registered port 2049 [24] for the NFS protocol should be the default
configuration. NFSv4.1 clients SHOULD NOT use the RPC binding configuration. NFSv4.1 clients SHOULD NOT use the RPC binding
protocols as described in [26]. protocols as described in [25].
2.10. Session 2.10. Session
2.10.1. Motivation and Overview 2.10.1. Motivation and Overview
Previous versions and minor versions of NFS have suffered from the Previous versions and minor versions of NFS have suffered from the
following: following:
o Lack of support for Exactly Once Semantics (EOS). This includes o Lack of support for Exactly Once Semantics (EOS). This includes
lack of support for EOS through server failure and recovery. lack of support for EOS through server failure and recovery.
skipping to change at page 43, line 15 skipping to change at page 43, line 28
associates all other operations in the COMPOUND procedure with a associates all other operations in the COMPOUND procedure with a
particular session. SEQUENCE also contains required information for particular session. SEQUENCE also contains required information for
maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1 maintaining EOS (see Section 2.10.5). Session-enabled NFSv4.1
COMPOUND requests thus have the form: COMPOUND requests thus have the form:
+-----+--------------+-----------+------------+-----------+---- +-----+--------------+-----------+------------+-----------+----
| tag | minorversion | numops |SEQUENCE op | op + args | ... | tag | minorversion | numops |SEQUENCE op | op + args | ...
| | (== 1) | (limited) | + args | | | | (== 1) | (limited) | + args | |
+-----+--------------+-----------+------------+-----------+---- +-----+--------------+-----------+------------+-----------+----
and the reply's structure is: and the replys have the form:
+------------+-----+--------+-------------------------------+--// +------------+-----+--------+-------------------------------+--//
|last status | tag | numres |status + SEQUENCE op + results | // |last status | tag | numres |status + SEQUENCE op + results | //
+------------+-----+--------+-------------------------------+--// +------------+-----+--------+-------------------------------+--//
//-----------------------+---- //-----------------------+----
// status + op + results | ... // status + op + results | ...
//-----------------------+---- //-----------------------+----
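The request layout in the first diagram can be modeled with a small data structure. This is a rough sketch, not an XDR encoding: the class and function names are hypothetical, operations are reduced to (name, arguments) pairs, and max_ops stands in for whatever limit on numops applies to the session.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Operation = Tuple[str, Dict[str, object]]  # (operation name, arguments)


@dataclass
class CompoundRequest:
    """Hypothetical in-memory view of a COMPOUND request."""
    tag: str
    minorversion: int
    ops: List[Operation] = field(default_factory=list)

    @property
    def numops(self) -> int:
        return len(self.ops)


def has_session_form(req: CompoundRequest, max_ops: int) -> bool:
    """True when the request matches the session-enabled layout shown above:
    minorversion 1, a bounded operation count, and SEQUENCE as the first op."""
    return (
        req.minorversion == 1
        and 1 <= req.numops <= max_ops
        and req.ops[0][0] == "SEQUENCE"
    )
```

The empty-request case is rejected by the count check before the first operation is inspected.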
A CB_COMPOUND procedure request and reply has a similar form to A CB_COMPOUND procedure request and reply has a similar form to
COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE COMPOUND, but instead of a SEQUENCE operation, there is a CB_SEQUENCE
skipping to change at page 44, line 32 skipping to change at page 44, line 45
of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support of NFSv4.1 require a backchannel. NFSv4.1 servers MUST support
backchannels. backchannels.
Each session has resources for each channel, including separate reply Each session has resources for each channel, including separate reply
caches (see Section 2.10.5.1). Note that even the backchannel caches (see Section 2.10.5.1). Note that even the backchannel
requires a reply cache because some callback operations are requires a reply cache because some callback operations are
nonidempotent. nonidempotent.
2.10.3.1. Association of Connections, Channels, and Sessions 2.10.3.1. Association of Connections, Channels, and Sessions
Each channel is associated with zero or more transport connections. Each channel is associated with zero or more transport connections
A connection can be associated with one channel or both channels of a (whether of the same transport protocol or different transport
session; the client and server negotiate whether a connection will protocols). A connection can be associated with one channel or both
carry traffic for one channel or both channels via the CREATE_SESSION channels of a session; the client and server negotiate whether a
(Section 18.36) and the BIND_CONN_TO_SESSION (Section 18.34) connection will carry traffic for one channel or both channels via
operations. When a session is created via CREATE_SESSION, the the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION
connection that transported the CREATE_SESSION request is (Section 18.34) operations. When a session is created via
automatically associated with the fore channel, and optionally the CREATE_SESSION, the connection that transported the CREATE_SESSION
backchannel. If the client specifies no state protection request is automatically associated with the fore channel, and
(Section 18.35) when the session is created, then when SEQUENCE is optionally the backchannel. If the client specifies no state
transmitted on a different connection, the connection is protection (Section 18.35) when the session is created, then when
SEQUENCE is transmitted on a different connection, the connection is
automatically associated with the fore channel of the session automatically associated with the fore channel of the session
specified in the SEQUENCE operation. specified in the SEQUENCE operation.
A connection's association with a session is not exclusive. A A connection's association with a session is not exclusive. A
connection associated with the channel(s) of one session may be connection associated with the channel(s) of one session may be
simultaneously associated with the channel(s) of other sessions simultaneously associated with the channel(s) of other sessions
including sessions associated with other client IDs. including sessions associated with other client IDs.
It is permissible for connections of multiple transport types to be It is permissible for connections of multiple transport types to be
associated with the same channel. For example both a TCP and RDMA associated with the same channel. For example both a TCP and RDMA
skipping to change at page 45, line 22 skipping to change at page 45, line 37
It is permissible for a connection of one type of transport to be It is permissible for a connection of one type of transport to be
associated with the fore channel, and a connection of a different associated with the fore channel, and a connection of a different
type to be associated with the backchannel. type to be associated with the backchannel.
2.10.4. Trunking 2.10.4. Trunking
Trunking is the use of multiple connections between a client and Trunking is the use of multiple connections between a client and
server in order to increase the speed of data transfer. NFSv4.1 server in order to increase the speed of data transfer. NFSv4.1
supports two types of trunking: session trunking and client ID supports two types of trunking: session trunking and client ID
trunking. NFSv4.1 servers MUST support trunking. trunking. NFSv4.1 repliers and requesters MUST support session
trunking. NFSv4.1 servers MAY support client ID trunking. NFSv4.1
clients MUST support client ID trunking.
Session trunking is essentially the association of multiple Session trunking is essentially the association of multiple
connections, each with potentially different target and/or source connections, each with potentially different target and/or source
network addresses, to the same session. network addresses, to the same session.
Client ID trunking is the association of multiple sessions to the Client ID trunking is the association of multiple sessions to the
same client ID, major server owner ID (Section 2.5), and server scope same client ID, major server owner ID (Section 2.5), and server scope
(Section 11.7.7). When two servers return the same major server (Section 11.7.7). When two servers return the same major server
owner and server scope it means the two servers are cooperating on owner and server scope it means the two servers are cooperating on
locking state management which is a prerequisite for client ID locking state management which is a prerequisite for client ID
skipping to change at page 51, line 6 skipping to change at page 51, line 21
o A new request, in which the sequence ID is one greater than that o A new request, in which the sequence ID is one greater than that
previously seen in the slot (accounting for sequence wraparound). previously seen in the slot (accounting for sequence wraparound).
The replier proceeds to execute the new request, and the replier The replier proceeds to execute the new request, and the replier
MUST increase the slot's sequence ID by one. MUST increase the slot's sequence ID by one.
o A retransmitted request, in which the sequence ID is equal to that o A retransmitted request, in which the sequence ID is equal to that
currently recorded in the slot. If the original request has currently recorded in the slot. If the original request has
executed to completion, the replier returns the cached reply. See executed to completion, the replier returns the cached reply. See
Section 2.10.5.2 for direction on how the replier deals with Section 2.10.5.2 for direction on how the replier deals with
retries of requests that are stll in progress. retries of requests that are still in progress.
o A misordered retry, in which the sequence ID is less than o A misordered retry, in which the sequence ID is less than
(accounting for sequence wraparound) that previously seen in the (accounting for sequence wraparound) that previously seen in the
slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the slot. The replier MUST return NFS4ERR_SEQ_MISORDERED (as the
result from SEQUENCE or CB_SEQUENCE). result from SEQUENCE or CB_SEQUENCE).
o A misordered new request, in which the sequence ID is two or more o A misordered new request, in which the sequence ID is two or more
than (accounting for sequence wraparound) than that previously than (accounting for sequence wraparound) than that previously
seen in the slot. Note that because the sequence ID must seen in the slot. Note that because the sequence ID must
wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered wraparound to zero (0) once it reaches 0xFFFFFFFF, a misordered
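The slot cases listed in this hunk can be sketched with modular arithmetic on the 32-bit sequence ID. This is a minimal illustration with hypothetical names. Because the last bullet is cut off in this excerpt, the sketch assumes that a misordered new request is handled like a misordered retry, since under wraparound a lower sequence ID cannot be told apart from one that is two or more higher.

```python
from enum import Enum

SEQ_MODULUS = 2 ** 32  # sequence IDs wrap from 0xFFFFFFFF back to 0


class SlotAction(Enum):
    EXECUTE_NEW = "execute the new request"
    REPLAY_CACHED = "return the cached reply"
    SEQ_MISORDERED = "NFS4ERR_SEQ_MISORDERED"


def classify_request(slot_seqid: int, request_seqid: int) -> SlotAction:
    """Compare a request's sequence ID with the one recorded in the slot."""
    delta = (request_seqid - slot_seqid) % SEQ_MODULUS
    if delta == 1:
        return SlotAction.EXECUTE_NEW      # one greater, allowing for wraparound
    if delta == 0:
        return SlotAction.REPLAY_CACHED    # retransmission of the last request
    return SlotAction.SEQ_MISORDERED       # assumed for both misordered cases


def next_slot_seqid(slot_seqid: int) -> int:
    """Sequence ID the replier records after executing a new request."""
    return (slot_seqid + 1) % SEQ_MODULUS
```

Replaying the cached reply presumes the original request has executed to completion; retries of requests still in progress are handled separately, as noted in the second bullet.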
skipping to change at page 52, line 22 skipping to change at page 52, line 37
in the request will not be in the reply, and the requester has in the request will not be in the reply, and the requester has
only the XID to match the reply to the request. only the XID to match the reply to the request.
Given that well formulated XIDs continue to be required, this begs Given that well formulated XIDs continue to be required, this begs
the question why SEQUENCE and CB_SEQUENCE replies have a session ID, the question why SEQUENCE and CB_SEQUENCE replies have a session ID,
slot ID and sequence ID? Having the session ID in the reply means slot ID and sequence ID? Having the session ID in the reply means
the requester does not have to use the XID to lookup the session ID, the requester does not have to use the XID to lookup the session ID,
which would be necessary if the connection were associated with which would be necessary if the connection were associated with
multiple sessions. Having the slot ID and sequence ID in the reply multiple sessions. Having the slot ID and sequence ID in the reply
means requester does not have to use the XID to lookup the slot ID means requester does not have to use the XID to lookup the slot ID
and sequence ID. Furhermore, since the XID is only 32 bits, it is and sequence ID. Furthermore, since the XID is only 32 bits, it is
too small to guarantee the re-association of a reply with its request too small to guarantee the re-association of a reply with its request
([27]); having session ID, slot ID, and sequence ID in the reply ([26]); having session ID, slot ID, and sequence ID in the reply
allows the client to validate that the reply in fact belongs to the allows the client to validate that the reply in fact belongs to the
matched request. matched request.
The SEQUENCE (and CB_SEQUENCE) operation also carries a The SEQUENCE (and CB_SEQUENCE) operation also carries a
"highest_slotid" value which carries additional requester slot usage "highest_slotid" value which carries additional requester slot usage
information. The requester must always indicate the slot ID information. The requester must always indicate the slot ID
representing the outstanding request with the highest-numbered slot representing the outstanding request with the highest-numbered slot
value. The requester should in all cases provide the most value. The requester should in all cases provide the most
conservative value possible, although it can be increased somewhat conservative value possible, although it can be increased somewhat
above the actual instantaneous usage to maintain some minimum or above the actual instantaneous usage to maintain some minimum or
skipping to change at page 54, line 51 skipping to change at page 55, line 19
cache entry for the slot whenever an error is returned from SEQUENCE cache entry for the slot whenever an error is returned from SEQUENCE
or CB_SEQUENCE. or CB_SEQUENCE.
2.10.5.1.3. Optional Reply Caching 2.10.5.1.3. Optional Reply Caching
On a per-request basis the requester can choose to direct the replier On a per-request basis the requester can choose to direct the replier
to cache the reply to all operations after the first operation to cache the reply to all operations after the first operation
(SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis
fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it fields of the arguments to SEQUENCE or CB_SEQUENCE. The reason it
would not direct the replier to cache the entire reply is that the would not direct the replier to cache the entire reply is that the
request is composed of all idempotent operations [24]. Caching the request is composed of all idempotent operations [23]. Caching the
reply may offer little benefit. If the reply is too large (see reply may offer little benefit. If the reply is too large (see
Section 2.10.5.4), it may not be cacheable anyway. Even if the reply Section 2.10.5.4), it may not be cacheable anyway. Even if the reply
to idempotent request is small enough to cache, unnecessarily caching to idempotent request is small enough to cache, unnecessarily caching
the reply slows down the server and increases RPC latency. the reply slows down the server and increases RPC latency.
Whether the requester requests the reply to be cached or not has no Whether the requester requests the reply to be cached or not has no
effect on the slot processing. If the results of SEQUENCE or effect on the slot processing. If the results of SEQUENCE or
CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be CB_SEQUENCE are NFS4_OK, then the slot's sequence ID MUST be
incremented by one. If a requester does not direct the replier to incremented by one. If a requester does not direct the replier to
cache the reply, the replier MUST do one of the following: cache the reply, the replier MUST do one of the following:
skipping to change at page 61, line 7 skipping to change at page 61, line 21
view the problem is as a single transaction consisting of each view the problem is as a single transaction consisting of each
operation in the COMPOUND followed by storing the result in operation in the COMPOUND followed by storing the result in
persistent storage, then finally a transaction commit. If there is a persistent storage, then finally a transaction commit. If there is a
failure before the transaction is committed, then the server rolls failure before the transaction is committed, then the server rolls
back the transaction. If the server itself fails, then when it restarts, back the transaction. If the server itself fails, then when it restarts,
its recovery logic could roll back the transaction before starting its recovery logic could roll back the transaction before starting
the NFSv4.1 server. the NFSv4.1 server.
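The single-transaction view can be illustrated with a small sketch that uses an in-memory SQLite database as a stand-in for persistent storage. The table layout, function names, and the fail_before_commit switch are all hypothetical. The point is only that the state change and the cached reply are committed together or not at all.

```python
import sqlite3


def open_store():
    """Create a toy store: one table for file state, one for cached replies."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (name TEXT PRIMARY KEY)")
    db.execute(
        "CREATE TABLE reply_cache "
        "(slot INTEGER PRIMARY KEY, seq INTEGER, reply TEXT)"
    )
    return db


def create_with_cached_reply(db, slot, seq, name, fail_before_commit=False):
    """Apply one change and record its reply in the same transaction."""
    try:
        with db:  # commits on normal exit, rolls back if an exception escapes
            db.execute("INSERT INTO files (name) VALUES (?)", (name,))
            reply = "NFS4_OK"
            db.execute(
                "INSERT OR REPLACE INTO reply_cache (slot, seq, reply) "
                "VALUES (?, ?, ?)",
                (slot, seq, reply),
            )
            if fail_before_commit:
                raise RuntimeError("simulated failure before commit")
        return reply
    except RuntimeError:
        # Rolled back: neither the new file nor the new cached reply exists.
        return None
```

After a simulated failure, the store still holds the previous file set and the previous cached reply for the slot, which is the property the retry logic depends on.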
While the description of the implementation for atomic execution of While the description of the implementation for atomic execution of
the request and caching of the reply is beyond the scope of this the request and caching of the reply is beyond the scope of this
document, an example implementation for NFSv2 [28] is described in document, an example implementation for NFSv2 [27] is described in
[29]. [28].
2.10.6. RDMA Considerations 2.10.6. RDMA Considerations
A complete discussion of the operation of RPC-based protocols over A complete discussion of the operation of RPC-based protocols over
RDMA transports is in [8]. A discussion of the operation of NFSv4, RDMA transports is in [8]. A discussion of the operation of NFSv4,
including NFSv4.1, over RDMA is in [9]. Where RDMA is considered, including NFSv4.1, over RDMA is in [9]. Where RDMA is considered,
this specification assumes the use of such a layering; it addresses this specification assumes the use of such a layering; it addresses
only the upper layer issues relevant to making best use of RPC/RDMA. only the upper layer issues relevant to making best use of RPC/RDMA.
2.10.6.1. RDMA Connection Resources 2.10.6.1. RDMA Connection Resources
skipping to change at page 62, line 10 skipping to change at page 62, line 25
Previous versions of NFS do not provide flow control; instead they Previous versions of NFS do not provide flow control; instead they
rely on the windowing provided by transports like TCP to throttle rely on the windowing provided by transports like TCP to throttle
requests. This does not work with RDMA, which provides no operation requests. This does not work with RDMA, which provides no operation
flow control and will terminate a connection in error when limits are flow control and will terminate a connection in error when limits are
exceeded. Limits such as maximum number of requests outstanding are exceeded. Limits such as maximum number of requests outstanding are
therefore negotiated when a session is created (see the therefore negotiated when a session is created (see the
ca_maxrequests field in Section 18.36). These limits then provide ca_maxrequests field in Section 18.36). These limits then provide
the maxima which each connection associated with the session's the maxima which each connection associated with the session's
channel(s) must remain within. RDMA connections are managed within channel(s) must remain within. RDMA connections are managed within
these limits as described in section 3.3 ("Flow Control"[[Comment.2: these limits as described in section 3.3 ("Flow Control"[[Comment.2:
RFC Editor: please verify section and title of the RPCRDMA RFC Editor: please verify section and title of the RPCRDMA document
document]]) of [8]; if there are multiple RDMA connections, then the which is currently at
maximum number of requests for a channel will be divided among the http://tools.ietf.org/html/draft-ietf-nfsv4-rpcrdma-08#section-3.3]])
RDMA connections. Put a different way, the onus is on the replier to of [8]; if there are multiple RDMA connections, then the maximum
number of requests for a channel will be divided among the RDMA
connections. Put a different way, the onus is on the replier to
ensure that the total number of RDMA credits across all connections ensure that the total number of RDMA credits across all connections
associated with the replier's channel does not exceed the channel's associated with the replier's channel does not exceed the channel's
maximum number of outstanding requests. maximum number of outstanding requests.
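One way a replier might divide a channel's request limit among its RDMA connections is sketched below. The function name and the even-split policy are assumptions; the text requires only that the credits granted across all connections stay within the channel's maximum.

```python
def split_credits(ca_maxrequests, num_connections):
    """Divide a channel's request limit among its RDMA connections.

    Returns one credit count per connection. The counts differ by at
    most one, and their sum equals ca_maxrequests exactly.
    """
    if ca_maxrequests < 0:
        raise ValueError("ca_maxrequests must be non-negative")
    if num_connections < 1:
        raise ValueError("at least one connection is required")
    base, extra = divmod(ca_maxrequests, num_connections)
    # The first `extra` connections each receive one additional credit.
    return [base + (1 if i < extra else 0) for i in range(num_connections)]
```

With a limit of 16 and three connections this yields credit counts of 6, 5, and 5.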
The limits may also be modified dynamically at the replier's choosing The limits may also be modified dynamically at the replier's choosing
by manipulating certain parameters present in each NFSv4.1 reply. In by manipulating certain parameters present in each NFSv4.1 reply. In
addition, the CB_RECALL_SLOT callback operation (see Section 20.8) addition, the CB_RECALL_SLOT callback operation (see Section 20.8)
can be sent by a server to a client to return RDMA credits to the can be sent by a server to a client to return RDMA credits to the
server, thereby lowering the maximum number of requests a client can server, thereby lowering the maximum number of requests a client can
have outstanding to the server. have outstanding to the server.
skipping to change at page 64, line 18 skipping to change at page 64, line 35
2.10.7.2. Backchannel RPC Security 2.10.7.2. Backchannel RPC Security
When the NFSv4.1 client establishes the backchannel, it informs the When the NFSv4.1 client establishes the backchannel, it informs the
server of the security flavors and principals to use when sending server of the security flavors and principals to use when sending
requests. If the security flavor is RPCSEC_GSS, the client expresses requests. If the security flavor is RPCSEC_GSS, the client expresses
the principal in the form of an established RPCSEC_GSS context. The the principal in the form of an established RPCSEC_GSS context. The
server is free to use any of the flavor/principal combinations the server is free to use any of the flavor/principal combinations the
client offers, but it MUST NOT use unoffered combinations. This way, client offers, but it MUST NOT use unoffered combinations. This way,
the client need not provide a target GSS principal for the the client need not provide a target GSS principal for the
backchannel as it did with NFSv4.0, nor does the server have to implement backchannel as it did with NFSv4.0, nor does the server have to implement
an RPCSEC_GSS initiator as it did with NFSv4.0 [21]. an RPCSEC_GSS initiator as it did with NFSv4.0 [20].
The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL
(Section 18.33) operations allow the client to specify flavor/ (Section 18.33) operations allow the client to specify flavor/
principal combinations. principal combinations.
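The rule that the server may use only offered combinations can be sketched as a simple selection. The tuple representation, the function name, and the idea of a server-side preference list are assumptions made for illustration.

```python
def pick_backchannel_security(offered, server_preference):
    """Choose a (flavor, principal) pair for backchannel requests.

    offered holds the combinations the client supplied, for example via
    CREATE_SESSION or BACKCHANNEL_CTL. server_preference is the server's
    own ordering (hypothetical). Only an offered pair is ever returned.
    """
    offered_set = set(offered)
    for combo in server_preference:
        if combo in offered_set:
            return combo
    return None  # nothing acceptable was offered; never fall back silently


offered = [("AUTH_SYS", "uid=0"), ("RPCSEC_GSS", "context-17")]
choice = pick_backchannel_security(
    offered,
    [("RPCSEC_GSS", "context-99"), ("RPCSEC_GSS", "context-17")],
)
```

Here the server's first preference was not offered, so the sketch settles on the second pair rather than using an unoffered combination.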
Also note that the SP4_SSV state protection mode (see Section 18.35 Also note that the SP4_SSV state protection mode (see Section 18.35
and Section 2.10.7.3) has the side benefit of providing SSV-derived and Section 2.10.7.3) has the side benefit of providing SSV-derived
RPCSEC_GSS contexts (Section 2.10.8). RPCSEC_GSS contexts (Section 2.10.8).
2.10.7.3. Protection from Unauthorized State Changes 2.10.7.3. Protection from Unauthorized State Changes
skipping to change at page 80, line 41 skipping to change at page 81, line 19
| | Various defined file types. | | | Various defined file types. |
| nfsstat4 | enum nfsstat4; | | nfsstat4 | enum nfsstat4; |
| | Return value for operations. | | | Return value for operations. |
| offset4 | typedef uint64_t offset4; | | offset4 | typedef uint64_t offset4; |
| | Various offset designations (READ, WRITE, LOCK, | | | Various offset designations (READ, WRITE, LOCK, |
| | COMMIT). | | | COMMIT). |
| qop4 | typedef uint32_t qop4; | | qop4 | typedef uint32_t qop4; |
| | Quality of protection designation in SECINFO. | | | Quality of protection designation in SECINFO. |
| sec_oid4 | typedef opaque sec_oid4<>; | | sec_oid4 | typedef opaque sec_oid4<>; |
| | Security Object Identifier. The sec_oid4 data | | | Security Object Identifier. The sec_oid4 data |
| | type is not really opaque. Instead it contains | | | type is not really opaque. Instead it contains an |
| | an ASN.1 OBJECT IDENTIFIER as used by GSS-API in | | | ASN.1 OBJECT IDENTIFIER as used by GSS-API in the |
| | the mech_type argument to GSS_Init_sec_context. | | | mech_type argument to GSS_Init_sec_context. See |
| | See [7] for details. | | | [7] for details. |
| sequenceid4 | typedef uint32_t sequenceid4; | | sequenceid4 | typedef uint32_t sequenceid4; |
| | Sequence number used for various session | | | Sequence number used for various session |
| | operations (EXCHANGE_ID, CREATE_SESSION, | | | operations (EXCHANGE_ID, CREATE_SESSION, |
| | SEQUENCE, CB_SEQUENCE). | | | SEQUENCE, CB_SEQUENCE). |
| seqid4 | typedef uint32_t seqid4; | | seqid4 | typedef uint32_t seqid4; |
| | Sequence identifier used for file locking. | | | Sequence identifier used for file locking. |
| sessionid4 | typedef opaque sessionid4[NFS4_SESSIONID_SIZE]; | | sessionid4 | typedef opaque sessionid4[NFS4_SESSIONID_SIZE]; |
| | Session identifier. | | | Session identifier. |
| slotid4 | typedef uint32_t slotid4; | | slotid4 | typedef uint32_t slotid4; |
| | Sequencing artifact for various session | | | Sequencing artifact for various session |
skipping to change at page 84, line 13 skipping to change at page 84, line 46
resides. resides.
3.3.9. netaddr4 3.3.9. netaddr4
struct netaddr4 { struct netaddr4 {
/* see struct rpcb in RFC 1833 */ /* see struct rpcb in RFC 1833 */
string na_r_netid<>; /* network id */ string na_r_netid<>; /* network id */
string na_r_addr<>; /* universal address */ string na_r_addr<>; /* universal address */
}; };
The netaddr4 data type is used to identify TCP/IP based endpoints. The netaddr4 data type is used to identify network transport
The r_netid and r_addr fields are specified in RFC1833 [26], but they endpoints. The r_netid and r_addr fields respectively contain a
are underspecified in RFC1833 [26] as far as what they should look netid and uaddr. The netid and uaddr concepts are defined in
like for specific protocols. The next section clarifies this. [13]. The netid and uaddr formats for TCP over IPv4 and TCP over
IPv6 are defined in [13], specifically Tables 2 and 3 and Sections
3.3.9.1. Format of netaddr4 for TCP and UDP over IPv4 3.2.3.3 and 3.2.3.4.
For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
US-ASCII string:
h1.h2.h3.h4.p1.p2
The prefix, "h1.h2.h3.h4", is the standard textual form for
representing an IPv4 address, which is always four bytes long.
Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
the first through fourth bytes each converted to ASCII-decimal. The
suffix, "p1.p2", is a textual form for representing a TCP and UDP
service port. Assuming big-endian ordering, p1 and p2 are,
respectively, the first and second bytes each converted to ASCII-
decimal. For example, if a host, in big-endian order, has an address
of 0x0A010307 and there is a service listening on, in big endian
order, port 0x020F (decimal 527), then the complete universal address
is "10.1.3.7.2.15".
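The IPv4 universal address format described above can be reproduced with a short sketch. The function name is hypothetical; the address and port come from the example in the text.

```python
import ipaddress


def ipv4_uaddr(host, port):
    """Build the h1.h2.h3.h4.p1.p2 universal address string.

    host may be a dotted-quad string or a 32-bit integer. port is a
    16-bit TCP or UDP port number.
    """
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    octets = ipaddress.IPv4Address(host).packed  # four bytes, big-endian
    prefix = ".".join(str(b) for b in octets)
    # p1 is the high-order byte of the port and p2 the low-order byte.
    return f"{prefix}.{port >> 8}.{port & 0xFF}"
```

For the address 0x0A010307 and port 0x020F (decimal 527), the sketch produces "10.1.3.7.2.15", matching the example.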
For TCP over IPv4 the value of r_netid is the string "tcp". For UDP
over IPv4 the value of r_netid is the string "udp". That this
document specifies the universal address and netid for UDP/IPv4 does
not imply that UDP/IPv4 is a legal transport for NFSv4.1 (see
Section 2.9).
3.3.9.2. Format of netaddr4 for TCP and UDP over IPv6
For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the
US-ASCII string:
x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
The suffix "p1.p2" is the service port, and is computed the same way
as with universal addresses for TCP and UDP over IPv4. The prefix,
"x1:x2:x3:x4:x5:x6:x7:x8", is the preferred textual form for
representing an IPv6 address as defined in Section 2.2 of RFC4291
[13]. Additionally, the two alternative forms specified in Section
2.2 of RFC4291 are also acceptable.
For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP
over IPv6 the value of r_netid is the string "udp6". That this
document specifies the universal address and netid for UDP/IPv6 does
not imply that UDP/IPv6 is a legal transport for NFSv4.1 (see
Section 2.9).
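A comparable sketch for the IPv6 form follows. It emits the full eight-group form without zero compression, which is one of the acceptable textual forms. The function name is hypothetical.

```python
import ipaddress


def ipv6_uaddr(host, port):
    """Build the x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 universal address string."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    packed = ipaddress.IPv6Address(host).packed  # sixteen bytes, big-endian
    groups = [int.from_bytes(packed[i:i + 2], "big") for i in range(0, 16, 2)]
    prefix = ":".join(format(g, "x") for g in groups)
    # The port suffix is computed exactly as for IPv4.
    return f"{prefix}.{port >> 8}.{port & 0xFF}"
```

The corresponding r_netid value would be "tcp6" for TCP over IPv6, as stated above.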
3.3.10. state_owner4 3.3.10. state_owner4
struct state_owner4 { struct state_owner4 {
clientid4 clientid; clientid4 clientid;
opaque owner<NFS4_OPAQUE_LIMIT>; opaque owner<NFS4_OPAQUE_LIMIT>;
}; };
typedef state_owner4 open_owner4; typedef state_owner4 open_owner4;
typedef state_owner4 lock_owner4; typedef state_owner4 lock_owner4;
skipping to change at page 86, line 44 skipping to change at page 86, line 33
The layouttype4 data type is 32 bits in length. The range The layouttype4 data type is 32 bits in length. The range
represented by the layout type is split into three parts. Type 0x0 represented by the layout type is split into three parts. Type 0x0
is reserved. Types within the range 0x00000001-0x7FFFFFFF are is reserved. Types within the range 0x00000001-0x7FFFFFFF are
globally unique and are assigned according to the description in globally unique and are assigned according to the description in
Section 22.4; they are maintained by IANA. Types within the range Section 22.4; they are maintained by IANA. Types within the range
0x80000000-0xFFFFFFFF are site specific and for private use only. 0x80000000-0xFFFFFFFF are site specific and for private use only.
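The three-way split of the layouttype4 range can be expressed as a small classifier. The function name and the returned labels are illustrative only.

```python
def classify_layout_type(value):
    """Classify a 32-bit layouttype4 value by the range it falls in."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32-bit unsigned value")
    if value == 0:
        return "reserved"
    if value <= 0x7FFFFFFF:
        return "iana-assigned"  # globally unique, maintained by IANA
    return "private-use"        # site specific
```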
The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file
layout type, as defined in Section 13, is to be used. The layout type, as defined in Section 13, is to be used. The
LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as
defined in [30], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME defined in [29], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME
enumeration specifies that the block/volume layout, as defined in enumeration specifies that the block/volume layout, as defined in
[31], is to be used. [30], is to be used.
3.3.14. deviceid4 3.3.14. deviceid4
const NFS4_DEVICEID4_SIZE = 16; const NFS4_DEVICEID4_SIZE = 16;
typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; typedef opaque deviceid4[NFS4_DEVICEID4_SIZE];
Layout information includes device IDs that specify a storage device Layout information includes device IDs that specify a storage device
through a compact handle. Addressing and type information is through a compact handle. Addressing and type information is
obtained with the GETDEVICEINFO operation. Device IDs are not obtained with the GETDEVICEINFO operation. Device IDs are not
guaranteed to be valid across metadata server restarts. A device ID guaranteed to be valid across metadata server restarts. A device ID
is unique per client ID and layout type. See Section 12.2.10 for is unique per client ID and layout type. See Section 12.2.10 for
more details. more details.
3.3.15. device_addr4 3.3.15. device_addr4
struct device_addr4 { struct device_addr4 {
skipping to change at page 90, line 50 skipping to change at page 90, line 50
for a file system object. The contents of the filehandle are opaque for a file system object. The contents of the filehandle are opaque
to the client. Therefore, the server is responsible for translating to the client. Therefore, the server is responsible for translating
the filehandle to an internal representation of the file system the filehandle to an internal representation of the file system
object. object.
4.1. Obtaining the First Filehandle 4.1. Obtaining the First Filehandle
The operations of the NFS protocol are defined in terms of one or The operations of the NFS protocol are defined in terms of one or
more filehandles. Therefore, the client needs a filehandle to more filehandles. Therefore, the client needs a filehandle to
initiate communication with the server. With the NFSv3 protocol initiate communication with the server. With the NFSv3 protocol
RFC1813 [22], there exists an ancillary protocol to obtain this first RFC1813 [21], there exists an ancillary protocol to obtain this first
filehandle. The MOUNT protocol, RPC program number 100005, provides filehandle. The MOUNT protocol, RPC program number 100005, provides
the mechanism of translating a string based file system path name to the mechanism of translating a string based file system path name to
a filehandle which can then be used by the NFS protocols. a filehandle which can then be used by the NFS protocols.
The MOUNT protocol has deficiencies in the area of security and use The MOUNT protocol has deficiencies in the area of security and use
via firewalls. This is one reason that the use of the public via firewalls. This is one reason that the use of the public
filehandle was introduced in RFC2054 [32] and RFC2055 [33]. With the filehandle was introduced in RFC2054 [31] and RFC2055 [32]. With the
use of the public filehandle in combination with the LOOKUP operation use of the public filehandle in combination with the LOOKUP operation
in the NFSv3 protocol, it has been demonstrated that the MOUNT in the NFSv3 protocol, it has been demonstrated that the MOUNT
protocol is unnecessary for viable interaction between NFS client and protocol is unnecessary for viable interaction between NFS client and
server. server.
Therefore, the NFSv4.1 protocol will not use an ancillary protocol Therefore, the NFSv4.1 protocol will not use an ancillary protocol
for translation from string based path names to a filehandle. Two for translation from string based path names to a filehandle. Two
special filehandles will be used as starting points for the NFS special filehandles will be used as starting points for the NFS
client. client.
skipping to change at page 94, line 31 skipping to change at page 94, line 31
Servers which provide volatile filehandles that may expire while open Servers which provide volatile filehandles that may expire while open
(i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if (i.e. if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set), should FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set), should
deny a RENAME or REMOVE that would affect an OPEN file of any of the deny a RENAME or REMOVE that would affect an OPEN file of any of the
components leading to the OPEN file. In addition, the server should components leading to the OPEN file. In addition, the server should
deny all RENAME or REMOVE requests during the grace period upon deny all RENAME or REMOVE requests during the grace period upon
server restart. server restart.
Servers which provide volatile filehandles that may expire while open Servers which provide volatile filehandles that may expire while open
require special care as regards handling of RENAMESs and REMOVEs. require special care as regards handling of RENAMEs and REMOVEs.
This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is This situation can arise if FH4_VOL_MIGRATION or FH4_VOL_RENAME is
set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set, set, if FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set,
or if a non-readonly file system has a transition target in a or if a non-readonly file system has a transition target in a
different _handle_ class. In these cases, the server should deny a different _handle_ class. In these cases, the server should deny a
RENAME or REMOVE that would affect an OPEN file of any of the RENAME or REMOVE that would affect an OPEN file of any of the
components leading to the OPEN file. In addition, the server should components leading to the OPEN file. In addition, the server should
deny all RENAME or REMOVE requests during the grace period, in order deny all RENAME or REMOVE requests during the grace period, in order
to make sure that reclaims of files where filehandles may have to make sure that reclaims of files where filehandles may have
expired do not do a reclaim for the wrong file. expired do not do a reclaim for the wrong file.
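The flag test described above can be sketched as a predicate over the fh_expire_type bits. The constant values are those of the protocol's XDR definitions, which do not appear in this excerpt. The function names are hypothetical, and the sketch does not model the file system transition case or the "components leading to the OPEN file" check.

```python
# fh_expire_type bit values, taken from the protocol's XDR definitions.
FH4_NOEXPIRE_WITH_OPEN = 0x00000001
FH4_VOLATILE_ANY = 0x00000002
FH4_VOL_MIGRATION = 0x00000004
FH4_VOL_RENAME = 0x00000008


def may_expire_while_open(fh_expire_type):
    """True if filehandles of this type can expire while the file is open."""
    if fh_expire_type & (FH4_VOL_MIGRATION | FH4_VOL_RENAME):
        return True
    volatile_any = bool(fh_expire_type & FH4_VOLATILE_ANY)
    protected = bool(fh_expire_type & FH4_NOEXPIRE_WITH_OPEN)
    return volatile_any and not protected


def should_deny_rename_or_remove(fh_expire_type, affects_open_file, in_grace):
    """Apply the two 'should deny' rules to a RENAME or REMOVE request."""
    if not may_expire_while_open(fh_expire_type):
        return False
    return in_grace or affects_open_file
```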
skipping to change at page 105, line 16 skipping to change at page 105, line 16
True, if two distinct filehandles are guaranteed to refer to two True, if two distinct filehandles are guaranteed to refer to two
different file system objects. different file system objects.
5.8.1.11. Attribute 10: lease_time 5.8.1.11. Attribute 10: lease_time
Duration of leases at server in seconds. Duration of leases at server in seconds.
5.8.1.12. Attribute 11: rdattr_error 5.8.1.12. Attribute 11: rdattr_error
Error returned from getattr during readdir. Error returned from an attempt to retrieve attributes during a
READDIR operation.
5.8.1.13. Attribute 19: filehandle 5.8.1.13. Attribute 19: filehandle
The filehandle of this object (primarily for readdir requests). The filehandle of this object (primarily for READDIR requests).
5.8.1.14. Attribute 75: suppattr_exclcreat 5.8.1.14. Attribute 75: suppattr_exclcreat
The bit vector which would set all REQUIRED and RECOMMENDED The bit vector which would set all REQUIRED and RECOMMENDED
attributes that are supported by the EXCLUSIVE4_1 method of file attributes that are supported by the EXCLUSIVE4_1 method of file
creation via the OPEN operation. The scope of this attribute applies creation via the OPEN operation. The scope of this attribute applies
to all objects with a matching fsid. to all objects with a matching fsid.
5.8.2. Definitions of Uncategorized RECOMMENDED Attributes 5.8.2. Definitions of Uncategorized RECOMMENDED Attributes
skipping to change at page 112, line 15 skipping to change at page 112, line 15
5.8.2.44. Attribute 54: time_modify_set 5.8.2.44. Attribute 54: time_modify_set
Set the time of last modification to the object. SETATTR use only. Set the time of last modification to the object. SETATTR use only.
5.9. Interpreting owner and owner_group 5.9. Interpreting owner and owner_group
The RECOMMENDED attributes "owner" and "owner_group" (and also users The RECOMMENDED attributes "owner" and "owner_group" (and also users
and groups within the "acl" attribute) are represented in terms of a and groups within the "acl" attribute) are represented in terms of a
UTF-8 string. To avoid a representation that is tied to a particular UTF-8 string. To avoid a representation that is tied to a particular
underlying implementation at the client or server, the use of the underlying implementation at the client or server, the use of the
UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [34] UTF-8 string has been chosen. Note that section 6.1 of RFC2624 [33]
provides additional rationale. It is expected that the client and provides additional rationale. It is expected that the client and
server will have their own local representation of owner and server will have their own local representation of owner and
owner_group that is used for local storage or presentation to the end owner_group that is used for local storage or presentation to the end
user. Therefore, it is expected that when these attributes are user. Therefore, it is expected that when these attributes are
transferred between the client and server, the local transferred between the client and server, the local
representation is translated to a syntax of the form "user@ representation is translated to a syntax of the form "user@
dns_domain". This will allow for a client and server that do not use dns_domain". This will allow for a client and server that do not use
the same local representation the ability to translate to a common the same local representation the ability to translate to a common
syntax that can be interpreted by both. syntax that can be interpreted by both.
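The translation between a local representation and the common "user@dns_domain" syntax can be sketched as follows. The function names are hypothetical, and a real implementation would also map the local name to and from a numeric identifier, which is outside this example.

```python
def to_wire_owner(local_name, dns_domain):
    """Form the UTF-8 owner string sent on the wire."""
    if not local_name or not dns_domain:
        raise ValueError("both a name and a domain are required")
    return f"{local_name}@{dns_domain}"


def from_wire_owner(owner):
    """Split a received owner string into (name, domain).

    A string with no '@' is returned with a domain of None so the
    caller can decide how to treat it.
    """
    name, sep, domain = owner.rpartition("@")
    if not sep:
        return owner, None
    return name, domain
```

Splitting at the last "@" keeps any earlier "@" characters as part of the user portion, which is a design choice of this sketch rather than a rule from the text.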
skipping to change at page 114, line 9 skipping to change at page 114, line 9
compatibility. compatibility.
The owner string "nobody" may be used to designate an anonymous user, The owner string "nobody" may be used to designate an anonymous user,
which will be associated with a file created by a security principal which will be associated with a file created by a security principal
that cannot be mapped through normal means to the owner attribute. that cannot be mapped through normal means to the owner attribute.
5.10. Character Case Attributes 5.10. Character Case Attributes
With respect to the case_insensitive and case_preserving attributes, With respect to the case_insensitive and case_preserving attributes,
each UCS-4 character (which UTF-8 encodes) has a "long descriptive each UCS-4 character (which UTF-8 encodes) has a "long descriptive
name" RFC1345 [35] which may or may not include the word "CAPITAL" or name" RFC1345 [34] which may or may not include the word "CAPITAL" or
"SMALL". The presence of SMALL or CAPITAL allows an NFS server to "SMALL". The presence of SMALL or CAPITAL allows an NFS server to
implement unambiguous and efficient table driven mappings for case implement unambiguous and efficient table driven mappings for case
insensitive comparisons, and non-case-preserving storage. For insensitive comparisons, and non-case-preserving storage. For
general character handling and internationalization issues, see general character handling and internationalization issues, see
Section 14. Section 14.
5.11. Directory Notification Attributes 5.11. Directory Notification Attributes
As described in Section 18.39, the client can request a minimum delay As described in Section 18.39, the client can request a minimum delay
for notifications of changes to attributes, but the server is free to for notifications of changes to attributes, but the server is free to
skipping to change at page 132, line 28 skipping to change at page 132, line 28
this is true even if the parent or target explicitly denies one of this is true even if the parent or target explicitly denies one of
these permissions.) these permissions.)
If the ACLs in question neither explicitly ALLOW nor DENY either of If the ACLs in question neither explicitly ALLOW nor DENY either of
the above, and if MODE4_SVTX is not set on the parent, then the the above, and if MODE4_SVTX is not set on the parent, then the
server SHOULD allow the removal if and only if ACE4_ADD_FILE is server SHOULD allow the removal if and only if ACE4_ADD_FILE is
permitted. In the case where MODE4_SVTX is set, the server may also permitted. In the case where MODE4_SVTX is set, the server may also
require the remover to own either the parent or the target, or may require the remover to own either the parent or the target, or may
require the target to be writable. require the target to be writable.
This allows servers to support something close to traditional unix- This allows servers to support something close to traditional UNIX-
like semantics, with ACE4_ADD_FILE taking the place of the write bit. like semantics, with ACE4_ADD_FILE taking the place of the write bit.
6.2.1.4. ACE flag 6.2.1.4. ACE flag
The bitmask constants used for the flag field are as follows: The bitmask constants used for the flag field are as follows:
const ACE4_FILE_INHERIT_ACE = 0x00000001; const ACE4_FILE_INHERIT_ACE = 0x00000001;
const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002;
const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004;
const ACE4_INHERIT_ONLY_ACE = 0x00000008; const ACE4_INHERIT_ONLY_ACE = 0x00000008;
skipping to change at page 139, line 37 skipping to change at page 139, line 37
behaviors specified with "SHOULD". This is intentional, to avoid behaviors specified with "SHOULD". This is intentional, to avoid
invalidating existing implementations that compute the mode according invalidating existing implementations that compute the mode according
to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by to the withdrawn POSIX ACL draft (1003.1e draft 17), rather than by
actual permissions on owner, group, and other. actual permissions on owner, group, and other.
6.4.1. Setting the mode and/or ACL Attributes 6.4.1. Setting the mode and/or ACL Attributes
In the case where a server supports the sacl or dacl attribute, in In the case where a server supports the sacl or dacl attribute, in
addition to the acl attribute, the server MUST fail a request to set addition to the acl attribute, the server MUST fail a request to set
the acl attribute simultaneously with a dacl or sacl attribute. The the acl attribute simultaneously with a dacl or sacl attribute. The
error to be given is NFS4ERR_ATTRNOTSUP. error to be given is NFS4ERR_ATTRNOTSUPP.
6.4.1.1. Setting mode and not ACL 6.4.1.1. Setting mode and not ACL
When any of the nine low-order mode bits are subject to change, When any of the nine low-order mode bits are subject to change,
either because the mode attribute was set or because the either because the mode attribute was set or because the
mode_set_masked attribute was set and the mask included one or more mode_set_masked attribute was set and the mask included one or more
bits from the nine low-order mode bits, and no ACL attribute is bits from the nine low-order mode bits, and no ACL attribute is
explicitly set, the acl and dacl attributes must be modified in explicitly set, the acl and dacl attributes must be modified in
accordance with the updated value of those bits. This must happen accordance with the updated value of those bits. This must happen
even if the value of the low-order bits is the same after the mode is even if the value of the low-order bits is the same after the mode is
skipping to change at page 143, line 49 skipping to change at page 143, line 49
and all other bits must be cleared. The ACE4_INHERITED_ACE flag may and all other bits must be cleared. The ACE4_INHERITED_ACE flag may
be set in the ACEs of the sacl or dacl (whereas it must always be be set in the ACEs of the sacl or dacl (whereas it must always be
cleared in the acl). cleared in the acl).
Together these features allow a server to support automatic Together these features allow a server to support automatic
inheritance, which we now explain in more detail. inheritance, which we now explain in more detail.
Inheritable ACEs are normally inherited by child objects only at the Inheritable ACEs are normally inherited by child objects only at the
time that the child objects are created; later modifications to time that the child objects are created; later modifications to
inheritable ACEs do not result in modifications to inherited ACEs on inheritable ACEs do not result in modifications to inherited ACEs on
descendents. descendants.
However, the dacl and sacl provide an OPTIONAL mechanism which allows However, the dacl and sacl provide an OPTIONAL mechanism which allows
a client application to propagate changes to inheritable ACEs to an a client application to propagate changes to inheritable ACEs to an
entire directory hierarchy. entire directory hierarchy.
A server that supports this performs inheritance at object creation A server that supports this performs inheritance at object creation
time in the normal way, and SHOULD set the ACE4_INHERITED_ACE flag on time in the normal way, and SHOULD set the ACE4_INHERITED_ACE flag on
any inherited ACEs as they are added to the new object. any inherited ACEs as they are added to the new object.
A client application such as an ACL editor may then propagate changes A client application such as an ACL editor may then propagate changes
skipping to change at page 149, line 43 skipping to change at page 149, line 43
clients should use strong security mechanisms to access the pseudo clients should use strong security mechanisms to access the pseudo
file system in order to prevent man-in-the-middle attacks. file system in order to prevent man-in-the-middle attacks.
8. State Management 8. State Management
Integrating locking into the NFS protocol necessarily causes it to be Integrating locking into the NFS protocol necessarily causes it to be
stateful. With the inclusion of such features as share reservations, stateful. With the inclusion of such features as share reservations,
file and directory delegations, recallable layouts, and support for file and directory delegations, recallable layouts, and support for
mandatory byte-range locking, the protocol becomes substantially more mandatory byte-range locking, the protocol becomes substantially more
dependent on proper management of state than the traditional dependent on proper management of state than the traditional
combination of NFS and NLM [36]. These features include expanded combination of NFS and NLM [35]. These features include expanded
locking facilities, which provide some measure of interclient locking facilities, which provide some measure of interclient
exclusion, but the state also offers features not readily providable exclusion, but the state also offers features not readily providable
using a stateless model. There are three components to making this using a stateless model. There are three components to making this
state manageable: state manageable:
o Clear division between client and server o Clear division between client and server
o Ability to reliably detect inconsistency in state between client o Ability to reliably detect inconsistency in state between client
and server and server
o Simple and robust recovery mechanisms o Simple and robust recovery mechanisms
skipping to change at page 166, line 22 skipping to change at page 166, line 22
requests to be processed during the grace period, it MUST determine requests to be processed during the grace period, it MUST determine
that no lock subsequently reclaimed will be rejected and that no lock that no lock subsequently reclaimed will be rejected and that no lock
subsequently reclaimed would have prevented any I/O operation subsequently reclaimed would have prevented any I/O operation
processed during the grace period. processed during the grace period.
Clients should be prepared for the return of NFS4ERR_GRACE errors for Clients should be prepared for the return of NFS4ERR_GRACE errors for
non-reclaim lock and I/O requests. In this case the client should non-reclaim lock and I/O requests. In this case the client should
employ a retry mechanism for the request. A delay (on the order of employ a retry mechanism for the request. A delay (on the order of
several seconds) between retries should be used to avoid overwhelming several seconds) between retries should be used to avoid overwhelming
the server. Further discussion of the general issue is included in the server. Further discussion of the general issue is included in
[37]. The client must account for the server that can perform I/O [36]. The client must account for the server that can perform I/O
and non-reclaim locking requests within the grace period as well as and non-reclaim locking requests within the grace period as well as
those that cannot do so. those that cannot do so.
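
The retry behavior described above can be sketched roughly as follows. This is an illustrative outline only, not text from either draft: the `send_request` callable, the five-second default delay, and the attempt limit are assumptions chosen for the example. The NFS4ERR_GRACE value is taken from the error table later in this document.

```python
import time

NFS4_OK = 0
NFS4ERR_GRACE = 10013  # value from the error table in this document


def retry_during_grace(send_request, delay_seconds=5.0, max_attempts=10,
                       sleep=time.sleep):
    """Resend a non-reclaim lock or I/O request while the server is in grace.

    send_request is a hypothetical callable that issues the request and
    returns the NFSv4 status code; it is not part of any real client API.
    """
    status = send_request()
    attempts = 1
    while status == NFS4ERR_GRACE and attempts < max_attempts:
        # Wait several seconds between retries to avoid overwhelming
        # the server, as the text above recommends.
        sleep(delay_seconds)
        status = send_request()
        attempts += 1
    return status
```

The same loop serves both kinds of server the paragraph mentions: one that processes the request during the grace period returns a non-GRACE status on the first attempt, so no delay is incurred.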
A reclaim-type locking request outside the server's grace period can A reclaim-type locking request outside the server's grace period can
only succeed if the server can guarantee that no conflicting lock or only succeed if the server can guarantee that no conflicting lock or
I/O request has been granted since restart. I/O request has been granted since restart.
A server may, upon restart, establish a new value for the lease A server may, upon restart, establish a new value for the lease
period. Therefore, clients should, once a new client ID is period. Therefore, clients should, once a new client ID is
established, refetch the lease_time attribute and use it as the basis established, refetch the lease_time attribute and use it as the basis
skipping to change at page 173, line 9 skipping to change at page 173, line 9
well as the possibility that requests will be lost and need to be well as the possibility that requests will be lost and need to be
retransmitted. retransmitted.
To take propagation delay into account, the client should subtract it To take propagation delay into account, the client should subtract it
from lease times (e.g. if the client estimates the one-way from lease times (e.g. if the client estimates the one-way
propagation delay as 200 milliseconds, then it can assume that the propagation delay as 200 milliseconds, then it can assume that the
lease is already 200 milliseconds old when it gets it). In addition, lease is already 200 milliseconds old when it gets it). In addition,
it will take another 200 milliseconds to get a response back to the it will take another 200 milliseconds to get a response back to the
server. So the client must send a lease renewal or write data back server. So the client must send a lease renewal or write data back
to the server at least 400 milliseconds before the lease would to the server at least 400 milliseconds before the lease would
expire. expire. If the propagation delay varies over the life of the lease
(e.g. the client is on a mobile host), the client will need to
continuously subtract the increase in propagation delay from the
lease times.
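
The arithmetic in the paragraph above can be written out as a small sketch. The function name and the 90-second lease in the usage note are assumptions made for illustration; only the 200-millisecond one-way delay and the resulting 400-millisecond margin come from the text.

```python
def latest_renewal_offset_ms(lease_ms, one_way_delay_ms):
    """Return how long after receiving the lease the renewal must be sent.

    The lease is already one_way_delay_ms old when the client receives it,
    and the renewal needs another one_way_delay_ms to reach the server, so
    twice the one-way delay is subtracted from the lease period.
    """
    return lease_ms - 2 * one_way_delay_ms
```

With a hypothetical 90-second lease and the 200-millisecond delay from the example, `latest_renewal_offset_ms(90_000, 200)` gives 89,600 milliseconds, which is 400 milliseconds before the nominal expiry. If the delay estimate grows during the life of the lease, calling the function again with the larger estimate yields the earlier deadline.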
The server's lease period configuration should take into account the The server's lease period configuration should take into account the
network distance of the clients that will be accessing the server's network distance of the clients that will be accessing the server's
resources. It is expected that the lease period will take into resources. It is expected that the lease period will take into
account the network propagation delays and other network delay account the network propagation delays and other network delay
factors for the client population. Since the protocol does not allow factors for the client population. Since the protocol does not allow
for an automatic method to determine an appropriate lease period, the for an automatic method to determine an appropriate lease period, the
server's administrator may have to tune the lease period. server's administrator may have to tune the lease period.
8.8. Obsolete Locking Infrastructure From NFSv4.0 8.8. Obsolete Locking Infrastructure From NFSv4.0
skipping to change at page 187, line 46 skipping to change at page 187, line 46
o For WRITE, see Section 18.32.4. o For WRITE, see Section 18.32.4.
On recall, the client holding the delegation must flush modified On recall, the client holding the delegation must flush modified
state (such as modified data) to the server and return the state (such as modified data) to the server and return the
delegation. The conflicting request will not be acted on until the delegation. The conflicting request will not be acted on until the
recall is complete. The recall is considered complete when the recall is complete. The recall is considered complete when the
client returns the delegation or the server times its wait for the client returns the delegation or the server times its wait for the
delegation to be returned and revokes the delegation as a result of delegation to be returned and revokes the delegation as a result of
the timeout. In the interim, the server will either delay responding the timeout. In the interim, the server will either delay responding
to conflicting requests or respond to them with NFSERR_DELAY. to conflicting requests or respond to them with NFS4ERR_DELAY.
Following the resolution of the recall, the server has the Following the resolution of the recall, the server has the
information necessary to grant or deny the second client's request. information necessary to grant or deny the second client's request.
At the time the client receives a delegation recall, it may have At the time the client receives a delegation recall, it may have
substantial state that needs to be flushed to the server. Therefore, substantial state that needs to be flushed to the server. Therefore,
the server should allow sufficient time for the delegation to be the server should allow sufficient time for the delegation to be
returned since it may involve numerous RPCs to the server. If the returned since it may involve numerous RPCs to the server. If the
server is able to determine that the client is diligently flushing server is able to determine that the client is diligently flushing
state to the server as a result of the recall, the server may extend state to the server as a result of the recall, the server may extend
the usual time allowed for a recall. However, the time allowed for the usual time allowed for a recall. However, the time allowed for
skipping to change at page 190, line 19 skipping to change at page 190, line 19
to the behavior for locks and share reservations. For delegations, to the behavior for locks and share reservations. For delegations,
however, the server may extend the period in which conflicting however, the server may extend the period in which conflicting
requests are held off. Eventually the occurrence of a conflicting requests are held off. Eventually the occurrence of a conflicting
request from another client will cause revocation of the delegation. request from another client will cause revocation of the delegation.
A loss of the backchannel (e.g. by later network configuration A loss of the backchannel (e.g. by later network configuration
change) will have the same effect. A recall request will fail and change) will have the same effect. A recall request will fail and
revocation of the delegation will result. revocation of the delegation will result.
A client normally finds out about revocation of a delegation when it A client normally finds out about revocation of a delegation when it
uses a stateid associated with a delegation and receives one of the uses a stateid associated with a delegation and receives one of the
errors NFS4EER_EXPIRED, NFS4ERR_ADMIN_REVOKED, or errors NFS4ERR_EXPIRED, NFS4ERR_ADMIN_REVOKED, or
NFS4ERR_DELEG_REVOKED. It also may find out about delegation NFS4ERR_DELEG_REVOKED. It also may find out about delegation
revocation after a client restart when it attempts to reclaim a revocation after a client restart when it attempts to reclaim a
delegation and receives that same error. Note that in the case of a delegation and receives that same error. Note that in the case of a
revoked write open delegation, there are issues because data may have revoked write open delegation, there are issues because data may have
been modified by the client whose delegation is revoked and been modified by the client whose delegation is revoked and
separately by other clients. See Section 10.5.1 for a discussion of separately by other clients. See Section 10.5.1 for a discussion of
such issues. Note also that when delegations are revoked, such issues. Note also that when delegations are revoked,
information about the revoked delegation will be written by the information about the revoked delegation will be written by the
server to stable storage (as described in Section 8.4.3). This is server to stable storage (as described in Section 8.4.3). This is
done to deal with the case in which a server restarts after revoking done to deal with the case in which a server restarts after revoking
skipping to change at page 233, line 33 skipping to change at page 233, line 33
11.7.5.1. File System Splitting 11.7.5.1. File System Splitting
When a file system transition is made and the fs_locations_info When a file system transition is made and the fs_locations_info
indicates that the file system in question may be split into multiple indicates that the file system in question may be split into multiple
file systems (via the FSLI4F_MULTI_FS flag), the client SHOULD do file systems (via the FSLI4F_MULTI_FS flag), the client SHOULD do
GETATTRs to determine the fsid attribute on all known objects within GETATTRs to determine the fsid attribute on all known objects within
the file system undergoing transition to determine the new file the file system undergoing transition to determine the new file
system boundaries. system boundaries.
Clients may maintain the fsids passed to existing applications by Clients may maintain the fsids passed to existing applications by
mapping all of the fsids for the descendent file systems to the mapping all of the fsids for the descendant file systems to the
common fsid used for the original file system. common fsid used for the original file system.
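
One way a client might keep the fsid seen by applications stable after a split is sketched below. The class and method names are hypothetical, and representing an fsid as a (major, minor) tuple is a simplification made for this example.

```python
class FsidMapper:
    """Maps fsids of descendant file systems back to the original fsid."""

    def __init__(self):
        self._to_original = {}

    def record_split(self, original_fsid, descendant_fsids):
        # Remember that each new fsid came from the original file system.
        for fsid in descendant_fsids:
            self._to_original[fsid] = original_fsid

    def fsid_for_application(self, server_fsid):
        # Unknown fsids (no split recorded) are passed through unchanged.
        return self._to_original.get(server_fsid, server_fsid)
```

After `record_split((1, 0), [(7, 1), (7, 2)])`, an object whose GETATTR now reports fsid (7, 2) would still be presented to existing applications with fsid (1, 0).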
Splitting a file system may be done on a transition between file Splitting a file system may be done on a transition between file
systems of the same _fileid_ class, since the fact that fileids are systems of the same _fileid_ class, since the fact that fileids are
unique within the source file system ensures they will be unique in unique within the source file system ensures they will be unique in
each of the target file systems. each of the target file systems.
11.7.6. The Change Attribute and File System Transitions 11.7.6. The Change Attribute and File System Transitions
Since the change attribute is defined as a server-specific one, Since the change attribute is defined as a server-specific one,
skipping to change at page 260, line 40 skipping to change at page 260, line 40
expected to be used in line with industry practice. expected to be used in line with industry practice.
The variable ${ietf.org:OS_TYPE} is used to denote the operating The variable ${ietf.org:OS_TYPE} is used to denote the operating
system and thus the kernel and library API's for which code might be system and thus the kernel and library API's for which code might be
compiled. This specification does not limit the acceptable values compiled. This specification does not limit the acceptable values
(except that they must be valid UTF-8 strings) but such values as (except that they must be valid UTF-8 strings) but such values as
"linux" and "freebsd" would be expected to be used in line with "linux" and "freebsd" would be expected to be used in line with
industry practice. industry practice.
The variable ${ietf.org:OS_VERSION} is used to denote the operating The variable ${ietf.org:OS_VERSION} is used to denote the operating
system version and the thus the specific details of versioned system version and thus the specific details of versioned interfaces
interfaces for which code might be compiled. This specification does for which code might be compiled. This specification does not limit
not limit the acceptable values (except that they must be valid UTF-8 the acceptable values (except that they must be valid UTF-8 strings)
strings) but combinations of numbers and letters with interspersed but combinations of numbers and letters with interspersed dots would
dots would be expected to be used in line with industry practice, be expected to be used in line with industry practice, with the
with the details of the version format depending on the specific details of the version format depending on the specific value of the
value of the value of the variable ${ietf.org:OS_TYPE} with which it value of the variable ${ietf.org:OS_TYPE} with which it is used.
is used.
Use of these variables could result in direction of different clients Use of these variables could result in direction of different clients
to different file systems on the same server, as appropriate to to different file systems on the same server, as appropriate to
particular clients. In cases in which the target file systems are particular clients. In cases in which the target file systems are
located on different servers, a single server could serve as a located on different servers, a single server could serve as a
referral point so that each valid combination of variable values referral point so that each valid combination of variable values
would designate a referral hosted on a single server, with the would designate a referral hosted on a single server, with the
targets of those referrals on a number of different servers. targets of those referrals on a number of different servers.
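
As a rough illustration of how such variables could direct different clients to different file systems, the sketch below expands `${...}` references in a location path. The function name, the example path, and the choice to leave unknown variables untouched are assumptions for this example; the drafts do not prescribe an implementation.

```python
import re

# Matches ${name} where name contains no closing brace.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand_location_path(path, values):
    """Replace ${variable} references in path using the values mapping.

    References with no entry in values are left as written (an assumption
    of this sketch, not a rule taken from the drafts).
    """
    def replace(match):
        return values.get(match.group(1), match.group(0))

    return _VARIABLE.sub(replace, path)
```

For instance, a client with `ietf.org:OS_TYPE` set to "linux" would expand the hypothetical path `/export/${ietf.org:OS_TYPE}/bin` to `/export/linux/bin`, while a client with the value "freebsd" would arrive at `/export/freebsd/bin`.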
Because namespace administration is affected by the values selected Because namespace administration is affected by the values selected
skipping to change at page 266, line 30 skipping to change at page 266, line 30
The NFSv4.1 pNFS feature has been structured to allow for a variety The NFSv4.1 pNFS feature has been structured to allow for a variety
of storage protocols to be defined and used. As noted in the diagram of storage protocols to be defined and used. As noted in the diagram
above, the storage protocol is the method used by the client to store above, the storage protocol is the method used by the client to store
and retrieve data directly from the storage devices. The NFSv4.1 and retrieve data directly from the storage devices. The NFSv4.1
protocol directly defines one storage protocol, the NFSv4.1 storage protocol directly defines one storage protocol, the NFSv4.1 storage
type, and its use. type, and its use.
Examples of other storage protocols that could be used with NFSv4.1's Examples of other storage protocols that could be used with NFSv4.1's
pNFS are: pNFS are:
o Block/volume protocols such as iSCSI ([38]), and FCP ([39]). The o Block/volume protocols such as iSCSI ([37]), and FCP ([38]). The
block/volume protocol support can be independent of the addressing block/volume protocol support can be independent of the addressing
structure of the block/volume protocol used, allowing more than structure of the block/volume protocol used, allowing more than
one protocol to access the same file data and enabling one protocol to access the same file data and enabling
extensibility to other block/volume protocols. extensibility to other block/volume protocols.
o Object protocols such as OSD over iSCSI or Fibre Channel [40]. o Object protocols such as OSD over iSCSI or Fibre Channel [39].
o Other storage protocols, including PVFS and other file systems o Other storage protocols, including PVFS and other file systems
that are in use in HPC environments. that are in use in HPC environments.
It is possible that various storage protocols are available to both It is possible that various storage protocols are available to both
client and server and it may be possible that a client and server do client and server and it may be possible that a client and server do
not have a matching storage protocol available to them. Because of not have a matching storage protocol available to them. Because of
this, the pNFS server MUST support normal NFSv4.1 access to any file this, the pNFS server MUST support normal NFSv4.1 access to any file
accessible by the pNFS feature; this will allow for continued accessible by the pNFS feature; this will allow for continued
interoperability between an NFSv4.1 client and server. interoperability between an NFSv4.1 client and server.
skipping to change at page 268, line 31 skipping to change at page 268, line 31
requirements are placed on the control protocol for maintaining requirements are placed on the control protocol for maintaining
attributes like modify time, the change attribute, and the end-of- attributes like modify time, the change attribute, and the end-of-
file (EOF) position. file (EOF) position.
12.2.7. Layout Types 12.2.7. Layout Types
A layout describes the mapping of a file's data to the storage A layout describes the mapping of a file's data to the storage
devices that hold the data. A layout is said to belong to a specific devices that hold the data. A layout is said to belong to a specific
layout type (data type layouttype4, see Section 3.3.13). The layout layout type (data type layouttype4, see Section 3.3.13). The layout
type allows for variants to handle different storage protocols, such type allows for variants to handle different storage protocols, such
as those associated with block/volume [31], object [30], and file as those associated with block/volume [30], object [29], and file
(Section 13) layout types. A metadata server, along with its control (Section 13) layout types. A metadata server, along with its control
protocol, MUST support at least one layout type. A private sub-range protocol, MUST support at least one layout type. A private sub-range
of the layout type name space is also defined. Values from the of the layout type name space is also defined. Values from the
private layout type range MAY be used for internal testing or private layout type range MAY be used for internal testing or
experimentation. experimentation.
As an example, the organization of the file layout type could be an As an example, the organization of the file layout type could be an
array of tuples (e.g., device ID, filehandle), along with a array of tuples (e.g., device ID, filehandle), along with a
definition of how the data is stored across the devices (e.g., definition of how the data is stored across the devices (e.g.,
striping). A block/volume layout might be an array of tuples that striping). A block/volume layout might be an array of tuples that
skipping to change at page 273, line 14 skipping to change at page 273, line 14
which a layout is held, does not necessarily conflict with the which a layout is held, does not necessarily conflict with the
holding of the layout that describes the file being modified. holding of the layout that describes the file being modified.
Therefore, it is the requirement of the storage protocol or layout Therefore, it is the requirement of the storage protocol or layout
type that determines the necessary behavior. For example, block/ type that determines the necessary behavior. For example, block/
volume layout types require that the layout's iomode agree with the volume layout types require that the layout's iomode agree with the
type of I/O being performed. type of I/O being performed.
Depending upon the layout type and storage protocol in use, storage Depending upon the layout type and storage protocol in use, storage
device access permissions may be granted by LAYOUTGET and may be device access permissions may be granted by LAYOUTGET and may be
encoded within the type-specific layout. For an example of storage encoded within the type-specific layout. For an example of storage
device access permissions see an object based protocol such as [40]. device access permissions see an object based protocol such as [39].
If access permissions are encoded within the layout, the metadata If access permissions are encoded within the layout, the metadata
server SHOULD recall the layout when those permissions become invalid server SHOULD recall the layout when those permissions become invalid
for any reason; for example when a file becomes unwritable or for any reason; for example when a file becomes unwritable or
inaccessible to a client. Note, clients are still required to inaccessible to a client. Note, clients are still required to
perform the appropriate access operations with open, lock and access perform the appropriate access operations with open, lock and access
as described above. The degree to which it is possible for the as described above. The degree to which it is possible for the
client to circumvent these access operations and the consequences of client to circumvent these access operations and the consequences of
doing so must be clearly specified by the individual layout type doing so must be clearly specified by the individual layout type
specifications. In addition, these specifications must be clear specifications. In addition, these specifications must be clear
about the requirements and non-requirements for the checking about the requirements and non-requirements for the checking
skipping to change at page 274, line 8 skipping to change at page 274, line 8
multiple LAYOUTGET requests; these might result in multiple multiple LAYOUTGET requests; these might result in multiple
overlapping, non-conflicting layouts (see Section 12.2.8). overlapping, non-conflicting layouts (see Section 12.2.8).
In order to get a layout, the client must first have opened the file In order to get a layout, the client must first have opened the file
via the OPEN operation. When a client has no layout on a file, it via the OPEN operation. When a client has no layout on a file, it
MUST present a stateid as returned by OPEN, a delegation stateid, or MUST present a stateid as returned by OPEN, a delegation stateid, or
a byte-range lock stateid in the loga_stateid argument. A successful a byte-range lock stateid in the loga_stateid argument. A successful
LAYOUTGET result includes a layout stateid. The first successful LAYOUTGET result includes a layout stateid. The first successful
LAYOUTGET processed by the server using a non-layout stateid as an LAYOUTGET processed by the server using a non-layout stateid as an
argument MUST have the "seqid" field of the layout stateid in the argument MUST have the "seqid" field of the layout stateid in the
response set to one. Thereafter, the client uses a layout stateid response set to one. Thereafter, the client MUST use a layout
(see Section 12.5.3) on future invocations of LAYOUTGET on the file, stateid (see Section 12.5.3) on future invocations of LAYOUTGET on
and the "seqid" MUST NOT be set to zero. Once the layout has been the file, and the "seqid" MUST NOT be set to zero. Once the layout
retrieved, it can be held across multiple OPEN and CLOSE sequences. has been retrieved, it can be held across multiple OPEN and CLOSE
Therefore, a client may hold a layout for a file that is not sequences. Therefore, a client may hold a layout for a file that is
currently open by any user on the client. This allows for the not currently open by any user on the client. This allows for the
caching of layouts beyond CLOSE. caching of layouts beyond CLOSE.
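
The stateid selection rule above can be sketched from the client's side as follows. The `Stateid` class and the function name are hypothetical; the sketch only models the "seqid" and "other" fields of a stateid and assumes the caller tracks whether it already holds a layout stateid for the file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stateid:
    seqid: int
    other: bytes


def stateid_for_layoutget(layout_stateid: Optional[Stateid],
                          fallback_stateid: Stateid) -> Stateid:
    """Pick the stateid to place in loga_stateid.

    fallback_stateid stands for an open, delegation, or byte-range lock
    stateid, used only while the client holds no layout on the file.
    """
    if layout_stateid is None:
        return fallback_stateid
    if layout_stateid.seqid == 0:
        # The text requires a non-zero seqid on later LAYOUTGET calls.
        raise ValueError("layout stateid seqid must not be zero")
    return layout_stateid
```

On the first LAYOUTGET the function returns the fallback stateid; once the server has replied with a layout stateid (whose seqid is one for that first reply), subsequent calls return the layout stateid instead.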
The storage protocol used by the client to access the data on the The storage protocol used by the client to access the data on the
storage device is determined by the layout's type. The client is storage device is determined by the layout's type. The client is
responsible for matching the layout type with an available method to responsible for matching the layout type with an available method to
interpret and use the layout. The method for this layout type interpret and use the layout. The method for this layout type
selection is outside the scope of the pNFS functionality. selection is outside the scope of the pNFS functionality.
Although the metadata server is in control of the layout for a file, Although the metadata server is in control of the layout for a file,
the pNFS client can provide hints to the server when a file is opened the pNFS client can provide hints to the server when a file is opened
skipping to change at page 295, line 18 skipping to change at page 295, line 18
threats are considered significant. threats are considered significant.
In some cases, the security countermeasures for connections to In some cases, the security countermeasures for connections to
storage devices may take the form of physical isolation or a storage devices may take the form of physical isolation or a
recommendation not to use pNFS in an environment. For example, it recommendation not to use pNFS in an environment. For example, it
may be impractical to provide confidentiality protection for some may be impractical to provide confidentiality protection for some
storage protocols to protect against eavesdropping; in environments storage protocols to protect against eavesdropping; in environments
where eavesdropping on such protocols is of sufficient concern to where eavesdropping on such protocols is of sufficient concern to
require countermeasures, physical isolation of the communication require countermeasures, physical isolation of the communication
channel (e.g., via direct connection from client(s) to storage channel (e.g., via direct connection from client(s) to storage
device(s)) and/or a decision to forego use of pNFS (e.g., and fall device(s)) and/or a decision to forgo use of pNFS (e.g., and fall
back to conventional NFSv4.1) may be appropriate courses of action. back to conventional NFSv4.1) may be appropriate courses of action.
Where communication with storage devices is subject to the same Where communication with storage devices is subject to the same
threats as client to metadata server communication, the protocols threats as client to metadata server communication, the protocols
used for that communication need to provide security mechanisms as used for that communication need to provide security mechanisms as
strong as or no weaker than those available via RPSEC_GSS for strong as or no weaker than those available via RPCSEC_GSS for
NFSv4.1. NFSv4.1. Except for the storage protocol used for the
LAYOUT4_NFSV4_1_FILES layout (see Section 13), i.e. except for
NFSv4.1, it is beyond the scope of this document to specify the
security mechanisms for storage access protocols.
pNFS implementations MUST NOT remove NFSv4.1's access controls. The pNFS implementations MUST NOT remove NFSv4.1's access controls. The
combination of clients, storage devices, and the metadata server are combination of clients, storage devices, and the metadata server are
responsible for ensuring that all client to storage device file data responsible for ensuring that all client to storage device file data
access respects NFSv4.1's ACLs and file open modes. This entails access respects NFSv4.1's ACLs and file open modes. This entails
performing both of these checks on every access in the client, the performing both of these checks on every access in the client, the
storage device, or both (as applicable; when the storage device is an storage device, or both (as applicable; when the storage device is an
NFSv4.1 server, the storage device is ultimately responsible for NFSv4.1 server, the storage device is ultimately responsible for
controlling access). If a pNFS configuration performs these checks controlling access). If a pNFS configuration performs these checks
only in the client, the risk of a misbehaving client obtaining only in the client, the risk of a misbehaving client obtaining
skipping to change at page 327, line 52 skipping to change at page 327, line 52
| NFS4ERR_BAD_STATEID | 10025 | Section 15.1.5.2 | | NFS4ERR_BAD_STATEID | 10025 | Section 15.1.5.2 |
| NFS4ERR_CB_PATH_DOWN | 10048 | Section 15.1.11.4 | | NFS4ERR_CB_PATH_DOWN | 10048 | Section 15.1.11.4 |
| NFS4ERR_CLID_INUSE | 10017 | Section 15.1.13.2 | | NFS4ERR_CLID_INUSE | 10017 | Section 15.1.13.2 |
| NFS4ERR_CLIENTID_BUSY | 10074 | Section 15.1.13.1 | | NFS4ERR_CLIENTID_BUSY | 10074 | Section 15.1.13.1 |
| NFS4ERR_COMPLETE_ALREADY | 10054 | Section 15.1.9.1 | | NFS4ERR_COMPLETE_ALREADY | 10054 | Section 15.1.9.1 |
| NFS4ERR_CONN_NOT_BOUND_TO_SESSION | 10055 | Section 15.1.11.6 | | NFS4ERR_CONN_NOT_BOUND_TO_SESSION | 10055 | Section 15.1.11.6 |
| NFS4ERR_DEADLOCK | 10045 | Section 15.1.8.2 | | NFS4ERR_DEADLOCK | 10045 | Section 15.1.8.2 |
| NFS4ERR_DEADSESSION | 10078 | Section 15.1.11.5 | | NFS4ERR_DEADSESSION | 10078 | Section 15.1.11.5 |
| NFS4ERR_DELAY | 10008 | Section 15.1.1.3 | | NFS4ERR_DELAY | 10008 | Section 15.1.1.3 |
| NFS4ERR_DELEG_ALREADY_WANTED | 10056 | Section 15.1.14.1 | | NFS4ERR_DELEG_ALREADY_WANTED | 10056 | Section 15.1.14.1 |
| NFS4ERR_DELEG_REVOKED | 10087 | Section 15.1.5.3 |
| NFS4ERR_DENIED | 10010 | Section 15.1.8.3 | | NFS4ERR_DENIED | 10010 | Section 15.1.8.3 |
| NFS4ERR_DIRDELEG_UNAVAIL | 10084 | Section 15.1.14.2 | | NFS4ERR_DIRDELEG_UNAVAIL | 10084 | Section 15.1.14.2 |
| NFS4ERR_DQUOT | 69 | Section 15.1.4.2 | | NFS4ERR_DQUOT | 69 | Section 15.1.4.2 |
| NFS4ERR_ENCR_ALG_UNSUPP | 10079 | Section 15.1.13.3 | | NFS4ERR_ENCR_ALG_UNSUPP | 10079 | Section 15.1.13.3 |
| NFS4ERR_EXIST | 17 | Section 15.1.4.3 | | NFS4ERR_EXIST | 17 | Section 15.1.4.3 |
| NFS4ERR_EXPIRED | 10011 | Section 15.1.5.4 | | NFS4ERR_EXPIRED | 10011 | Section 15.1.5.4 |
| NFS4ERR_FBIG | 27 | Section 15.1.4.4 | | NFS4ERR_FBIG | 27 | Section 15.1.4.4 |
| NFS4ERR_FHEXPIRED | 10014 | Section 15.1.2.2 | | NFS4ERR_FHEXPIRED | 10014 | Section 15.1.2.2 |
| NFS4ERR_FILE_OPEN | 10046 | Section 15.1.4.5 | | NFS4ERR_FILE_OPEN | 10046 | Section 15.1.4.5 |
| NFS4ERR_GRACE | 10013 | Section 15.1.9.2 | | NFS4ERR_GRACE | 10013 | Section 15.1.9.2 |
skipping to change at page 336, line 33 skipping to change at page 336, line 33
A stateid designates locking state of any type that has been revoked A stateid designates locking state of any type that has been revoked
due to administrative interaction, possibly while the lease is valid. due to administrative interaction, possibly while the lease is valid.
15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10025) 15.1.5.2. NFS4ERR_BAD_STATEID (Error Code 10025)
A stateid does not properly designate any valid state. See A stateid does not properly designate any valid state. See
Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are Section 8.2.4 and Section 8.2.3 for a discussion of how stateids are
validated. validated.
15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10056) 15.1.5.3. NFS4ERR_DELEG_REVOKED (Error Code 10087)
A stateid designates recallable locking state of any type that has A stateid designates recallable locking state of any type that has
been revoked due to the failure of the client to return the lock, been revoked due to the failure of the client to return the lock,
when it was recalled. when it was recalled.
15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011) 15.1.5.4. NFS4ERR_EXPIRED (Error Code 10011)
A stateid designates locking state of any type that has been revoked A stateid designates locking state of any type that has been revoked
due to expiration of the client's lease, either immediately upon due to expiration of the client's lease, either immediately upon
lease expiration, or following a later request for a conflicting lease expiration, or following a later request for a conflicting
skipping to change at page 396, line 5 skipping to change at page 396, line 5
o When a client executes a regular file, it has to read the file o When a client executes a regular file, it has to read the file
from the server. Strictly speaking, the server should not allow from the server. Strictly speaking, the server should not allow
the client to read a file being executed unless the user has read the client to read a file being executed unless the user has read
permissions on the file. Requiring users and administrators to set permissions on the file. Requiring users and administrators to set
read permissions on executable files in order to access them over read permissions on executable files in order to access them over
NFS is not going to be acceptable to some people. Historically, NFS is not going to be acceptable to some people. Historically,
NFS servers have allowed a user to READ a file if the user has NFS servers have allowed a user to READ a file if the user has
execute access to the file. execute access to the file.
As a practical example, the UNIX specification [41] states that an As a practical example, the UNIX specification [40] states that an
implementation claiming conformance to UNIX may indicate in the implementation claiming conformance to UNIX may indicate in the
access() programming interface's result that a privileged user has access() programming interface's result that a privileged user has
execute rights, even if no execute permission bits are set on the execute rights, even if no execute permission bits are set on the
regular file's attributes. It is possible to claim conformance to regular file's attributes. It is possible to claim conformance to
the UNIX specification and instead not indicate execute rights in the UNIX specification and instead not indicate execute rights in
that situation, which is true for some operating environments. that situation, which is true for some operating environments.
Suppose the operating environments of the client and server are Suppose the operating environments of the client and server are
implementing the access() semantics for privileged users differently, implementing the access() semantics for privileged users differently,
and the ACCESS operation implementations of the client and server and the ACCESS operation implementations of the client and server
follow their respective access() semantics. This can cause undesired follow their respective access() semantics. This can cause undesired
skipping to change at page 432, line 30 skipping to change at page 432, line 30
attrset to determine which attributes were used to store the attrset to determine which attributes were used to store the
verifier. verifier.
With the addition of persistent sessions and pNFS, under some With the addition of persistent sessions and pNFS, under some
conditions EXCLUSIVE4 MUST NOT be used by the client or supported by conditions EXCLUSIVE4 MUST NOT be used by the client or supported by
the server. The following table summarizes the appropriate and the server. The following table summarizes the appropriate and
mandated exclusive create methods for implementations of NFSv4.1: mandated exclusive create methods for implementations of NFSv4.1:
Required methods for exclusive create Required methods for exclusive create
+--------------+--------+-----------------+-------------------------+ +----------------+-----------+---------------+----------------------+
| Persistent | pNFS | Server REQUIRED | Client Allowed | | Persistent | Server | Server | Client Allowed |
| Reply Cache | server | | | | Reply Cache | Supports | REQUIRED | |
+--------------+--------+-----------------+-------------------------+ | Enabled | pNFS | | |
| no | no | EXCLUSIVE4_1 | EXCLUSIVE4_1 (SHOULD) | +----------------+-----------+---------------+----------------------+
| | | and EXCLUSIVE4 | or EXCLUSIVE4 (SHOULD | | no | no | EXCLUSIVE4_1 | EXCLUSIVE4_1 |
| | | and | (SHOULD) or |
| | | EXCLUSIVE4 | EXCLUSIVE4 (SHOULD |
| | | | NOT) | | | | | NOT) |
| no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 |
| yes | no | GUARDED4 | GUARDED4 | | yes | no | GUARDED4 | GUARDED4 |
| yes | yes | GUARDED4 | GUARDED4 | | yes | yes | GUARDED4 | GUARDED4 |
+--------------+--------+-----------------+-------------------------+ +----------------+-----------+---------------+----------------------+
Table 10 Table 10
If CREATE_SESSION4_FLAG_PERSIST is set in the results of If CREATE_SESSION4_FLAG_PERSIST is set in the results of
CREATE_SESSION, the reply cache is persistent (see Section 18.36). If CREATE_SESSION, the reply cache is persistent (see Section 18.36). If
the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from
EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the
client attempts to use EXCLUSIVE4 on a persistent session, or a client attempts to use EXCLUSIVE4 on a persistent session, or a
session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the
server MUST return NFS4ERR_INVAL. server MUST return NFS4ERR_INVAL.
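The rules in Table 10 and the paragraph above can be sketched as a small lookup. This is only an illustration: the function names allowed_client_modes and server_status_for_exclusive4 are hypothetical, the booleans stand for the CREATE_SESSION4_FLAG_PERSIST and EXCHGID4_FLAG_USE_PNFS_MDS results, and a successful status is shown as NFS4_OK purely as a placeholder for "not rejected by this rule".

```python
from typing import List


def allowed_client_modes(persistent_reply_cache: bool, pnfs_server: bool) -> List[str]:
    """Return the exclusive create methods a client may use, preferred first.

    Hypothetical helper that mirrors the rows of Table 10.
    """
    if persistent_reply_cache:
        # Rows "yes/no" and "yes/yes": only GUARDED4 is listed.
        return ["GUARDED4"]
    if pnfs_server:
        # Row "no/yes": only EXCLUSIVE4_1 is listed.
        return ["EXCLUSIVE4_1"]
    # Row "no/no": EXCLUSIVE4_1 (SHOULD) is preferred over EXCLUSIVE4 (SHOULD NOT).
    return ["EXCLUSIVE4_1", "EXCLUSIVE4"]


def server_status_for_exclusive4(persistent_session: bool, pnfs_mds_client_id: bool) -> str:
    """Status a server returns for an EXCLUSIVE4 create under this rule.

    NFS4ERR_INVAL is mandated by the text; NFS4_OK here only means that
    this particular rule does not reject the request.
    """
    if persistent_session or pnfs_mds_client_id:
        return "NFS4ERR_INVAL"
    return "NFS4_OK"
```

A client following this sketch would take the first entry of the returned list and fall back to later entries only if the server does not support the preferred method.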
skipping to change at page 434, line 33 skipping to change at page 434, line 33
| CLAIM_DELEG_CUR_FH | OPEN as granted by the server. Generally | | CLAIM_DELEG_CUR_FH | OPEN as granted by the server. Generally |
| | this is done as part of recalling a | | | this is done as part of recalling a |
| | delegation. With CLAIM_DELEGATE_CUR, the | | | delegation. With CLAIM_DELEGATE_CUR, the |
| | file is identified by the current | | | file is identified by the current |
| | filehandle and the specified component | | | filehandle and the specified component |
| | name. With CLAIM_DELEG_CUR_FH (new to | | | name. With CLAIM_DELEG_CUR_FH (new to |
| | NFSv4.1), the file is identified by just | | | NFSv4.1), the file is identified by just |
| | the current filehandle. | | | the current filehandle. |
| CLAIM_DELEGATE_PREV, | The client is claiming a delegation | | CLAIM_DELEGATE_PREV, | The client is claiming a delegation |
| CLAIM_DELEG_PREV_FH | granted to a previous client instance; | | CLAIM_DELEG_PREV_FH | granted to a previous client instance; |
| | used after the client restarts. The | | | used after the client restarts. The server |
| | server MAY support CLAIM_DELEGATE_PREV or | | | MAY support CLAIM_DELEGATE_PREV or |
| | CLAIM_DELEG_PREV_FH (new to NFSv4.1). If | | | CLAIM_DELEG_PREV_FH (new to NFSv4.1). If |
| | it does support either open type, | | | it does support either open type, |
| | CREATE_SESSION MUST NOT remove the | | | CREATE_SESSION MUST NOT remove the |
| | client's delegation state, and the server | | | client's delegation state, and the server |
| | MUST support the DELEGPURGE operation. | | | MUST support the DELEGPURGE operation. |
+----------------------+--------------------------------------------+ +----------------------+--------------------------------------------+
For OPEN requests that reach the server during the grace period, the For OPEN requests that reach the server during the grace period, the
server returns an error of NFS4ERR_GRACE. The following claim types server returns an error of NFS4ERR_GRACE. The following claim types
are exceptions: are exceptions:
skipping to change at page 447, line 25 skipping to change at page 447, line 25
18.20.3. DESCRIPTION 18.20.3. DESCRIPTION
Replaces the current filehandle with the filehandle that represents Replaces the current filehandle with the filehandle that represents
the public filehandle of the server's name space. This filehandle the public filehandle of the server's name space. This filehandle
may be different from the "root" filehandle which may be associated may be different from the "root" filehandle which may be associated
with some other directory on the server. with some other directory on the server.
PUTPUBFH also clears the current stateid. PUTPUBFH also clears the current stateid.
The public filehandle represents the concepts embodied in RFC2054 The public filehandle represents the concepts embodied in RFC2054
[32], RFC2055 [33], RFC2224 [42]. The intent for NFSv4.1 is that the [31], RFC2055 [32], RFC2224 [41]. The intent for NFSv4.1 is that the
public filehandle (represented by the PUTPUBFH operation) be used as public filehandle (represented by the PUTPUBFH operation) be used as
a method of providing WebNFS server compatibility with NFSv3. a method of providing WebNFS server compatibility with NFSv3.
The public filehandle and the root filehandle (represented by the The public filehandle and the root filehandle (represented by the
PUTROOTFH operation) SHOULD be equivalent. If the public and root PUTROOTFH operation) SHOULD be equivalent. If the public and root
filehandles are not equivalent, then the public filehandle MUST be a filehandles are not equivalent, then the public filehandle MUST be a
descendant of the root filehandle. descendant of the root filehandle.
See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.1 for more details on the current filehandle.
skipping to change at page 447, line 47 skipping to change at page 447, line 47
18.20.4. IMPLEMENTATION 18.20.4. IMPLEMENTATION
Used as the second operator (after SEQUENCE) in an NFS request to set Used as the second operator (after SEQUENCE) in an NFS request to set
the context for file accessing operations that follow in the same the context for file accessing operations that follow in the same
COMPOUND request. COMPOUND request.
With the NFSv3 public filehandle, the client is able to specify With the NFSv3 public filehandle, the client is able to specify
whether the path name provided in the LOOKUP should be evaluated as whether the path name provided in the LOOKUP should be evaluated as
either an absolute path relative to the server's root or relative to either an absolute path relative to the server's root or relative to
the public filehandle. RFC2224 [42] contains further discussion of the public filehandle. RFC2224 [41] contains further discussion of
the functionality. With NFSv4.1, that type of specification is not the functionality. With NFSv4.1, that type of specification is not
directly available in the LOOKUP operation. The reason for this is directly available in the LOOKUP operation. The reason for this is
because the component separators needed to specify absolute vs. because the component separators needed to specify absolute vs.
relative are not allowed in NFSv4. Therefore, the client is relative are not allowed in NFSv4. Therefore, the client is
responsible for constructing its request such that either PUTROOTFH responsible for constructing its request such that either PUTROOTFH
or PUTPUBFH is used to signify absolute or relative evaluation of an or PUTPUBFH is used to signify absolute or relative evaluation of an
NFS URL, respectively. NFS URL, respectively.
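As a rough sketch of the paragraph above, a client might assemble the operation list of a COMPOUND as follows. The function name build_lookup_compound and the tuple representation of operations are assumptions made for illustration; real requests are XDR-encoded and SEQUENCE carries session arguments that are omitted here.

```python
from typing import List, Sequence, Tuple


def build_lookup_compound(components: Sequence[str], absolute: bool) -> List[Tuple[str, ...]]:
    """Build a simplified operation list for resolving a path.

    PUTROOTFH signifies absolute evaluation, PUTPUBFH signifies evaluation
    relative to the public filehandle. Each LOOKUP carries one component,
    because component separators are not allowed in a LOOKUP name.
    """
    for name in components:
        if not name or "/" in name:
            raise ValueError("LOOKUP takes a single, non-empty component: %r" % (name,))
    ops: List[Tuple[str, ...]] = [("SEQUENCE",)]
    ops.append(("PUTROOTFH",) if absolute else ("PUTPUBFH",))
    ops.extend(("LOOKUP", name) for name in components)
    return ops
```

For example, build_lookup_compound(["export", "data"], absolute=False) starts at the public filehandle and then walks two components.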
Note that there are warnings mentioned in RFC2224 [42] with respect Note that there are warnings mentioned in RFC2224 [41] with respect
to the use of absolute evaluation and the restrictions the server may to the use of absolute evaluation and the restrictions the server may
place on that evaluation with respect to how much of its namespace place on that evaluation with respect to how much of its namespace
has been made available. These same warnings apply to NFSv4.1. It has been made available. These same warnings apply to NFSv4.1. It
is likely, therefore, that because of server implementation details, is likely, therefore, that because of server implementation details,
an NFSv3 absolute public filehandle lookup may behave differently an NFSv3 absolute public filehandle lookup may behave differently
than an NFSv4.1 absolute resolution. than an NFSv4.1 absolute resolution.
There is a form of security negotiation as described in RFC2755 [43] There is a form of security negotiation as described in RFC2755 [42]
that uses the public filehandle and an overloading of the pathname. that uses the public filehandle and an overloading of the pathname.
This method is not available with NFSv4.1 as filehandles are not This method is not available with NFSv4.1 as filehandles are not
overloaded with special meaning and therefore do not provide the same overloaded with special meaning and therefore do not provide the same
framework as NFSv3. Clients should therefore use the security framework as NFSv3. Clients should therefore use the security
negotiation mechanisms described in Section 2.6. negotiation mechanisms described in Section 2.6.
18.21. Operation 24: PUTROOTFH - Set Root Filehandle 18.21. Operation 24: PUTROOTFH - Set Root Filehandle
18.21.1. ARGUMENTS 18.21.1. ARGUMENTS
skipping to change at page 469, line 35 skipping to change at page 469, line 35
is lenient in this one case of matching owner values, the client is lenient in this one case of matching owner values, the client
implementation may be simplified in cases of creation of an object implementation may be simplified in cases of creation of an object
(e.g. an exclusive create via OPEN) followed by a SETATTR. (e.g. an exclusive create via OPEN) followed by a SETATTR.
The file size attribute is used to request changes to the size of a The file size attribute is used to request changes to the size of a
file. A value of zero causes the file to be truncated, a value less file. A value of zero causes the file to be truncated, a value less
than the current size of the file causes data from the new size to the than the current size of the file causes data from the new size to the
end of the file to be discarded, and a size greater than the current end of the file to be discarded, and a size greater than the current
size of the file causes logically zeroed data bytes to be added to size of the file causes logically zeroed data bytes to be added to
the end of the file. Servers are free to implement this using the end of the file. Servers are free to implement this using
unallocate bytes (holes) or allocated data bytes set to zero. unallocated bytes (holes) or allocated data bytes set to zero.
Clients should not make any assumptions regarding a server's Clients should not make any assumptions regarding a server's
implementation of this feature, beyond that the bytes in the affected implementation of this feature, beyond that the bytes in the affected
region returned by READ will be zeroed. Servers MUST support region returned by READ will be zeroed. Servers MUST support
extending the file size via SETATTR. extending the file size via SETATTR.
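The effect of a size change, as a client would observe it through READ, can be modelled on an in-memory byte string. This is a minimal sketch and not a server implementation: apply_size_change is a hypothetical name, and a real server may use holes rather than stored zero bytes.

```python
def apply_size_change(data: bytes, new_size: int) -> bytes:
    """Return the file contents visible to READ after setting the size.

    Shrinking discards data from new_size to the end of the file;
    growing appends logically zeroed bytes.
    """
    if new_size < 0:
        raise ValueError("size must be non-negative")
    if new_size <= len(data):
        # Covers truncation to zero and any smaller size.
        return data[:new_size]
    # Extension: the added region reads back as zeros.
    return data + b"\x00" * (new_size - len(data))
```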
SETATTR is not guaranteed to be atomic. A failed SETATTR may SETATTR is not guaranteed to be atomic. A failed SETATTR may
partially change a file's attributes, hence the reason why the reply partially change a file's attributes, hence the reason why the reply
always includes the status and the list of attributes that were set. always includes the status and the list of attributes that were set.
If the object whose attributes are being changed has a file If the object whose attributes are being changed has a file
skipping to change at page 479, line 23 skipping to change at page 479, line 23
used with the integrity or privacy services, using the principal that used with the integrity or privacy services, using the principal that
created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV created the client ID. If SP4_SSV is used, RPCSEC_GSS with the SSV
GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used. GSS mechanism (Section 2.10.8) and integrity or privacy MUST be used.
If, when the client ID was created, the client opted for SP4_NONE If, when the client ID was created, the client opted for SP4_NONE
state protection, the client is not required to use state protection, the client is not required to use
BIND_CONN_TO_SESSION to associate the connection with the session, BIND_CONN_TO_SESSION to associate the connection with the session,
unless the client wishes to associate the connection with the unless the client wishes to associate the connection with the
backchannel. When SP4_NONE protection is used, simply sending a backchannel. When SP4_NONE protection is used, simply sending a
COMPOUND request with a SEQUENCE operation is sufficient to associate COMPOUND request with a SEQUENCE operation is sufficient to associate
the connnection with the session specified in SEQUENCE. the connection with the session specified in SEQUENCE.
The field bctsa_dir indicates whether the client wants to associate The field bctsa_dir indicates whether the client wants to associate
the connection with the fore channel or the backchannel or both the connection with the fore channel or the backchannel or both
channels. The value CDFC4_FORE_OR_BOTH indicates the client wants to channels. The value CDFC4_FORE_OR_BOTH indicates the client wants to
associate the connection with both the fore channel and backchannel, associate the connection with both the fore channel and backchannel,
but will accept the connection being associated to just the fore but will accept the connection being associated to just the fore
channel. The value CDFC4_BACK_OR_BOTH indicates the client wants to channel. The value CDFC4_BACK_OR_BOTH indicates the client wants to
associate with both the fore and backchannel, but will accept the associate with both the fore and backchannel, but will accept the
connection being associated with just the backchannel. The server connection being associated with just the backchannel. The server
replies in bctsr_dir which channel(s) the connection is associated replies in bctsr_dir which channel(s) the connection is associated
skipping to change at page 506, line 9 skipping to change at page 506, line 9
2. Sequence ID processing. If csa_sequenceid is equal to the 2. Sequence ID processing. If csa_sequenceid is equal to the
sequence ID in the client ID's slot, then this is a replay of the sequence ID in the client ID's slot, then this is a replay of the
previous CREATE_SESSION request, and the server returns the previous CREATE_SESSION request, and the server returns the
cached result. If csa_sequenceid is not equal to the sequence ID cached result. If csa_sequenceid is not equal to the sequence ID
in the slot, and is more than one greater (accounting for in the slot, and is more than one greater (accounting for
wraparound), then the server returns the error wraparound), then the server returns the error
NFS4ERR_SEQ_MISORDERED, and does not change the slot. If NFS4ERR_SEQ_MISORDERED, and does not change the slot. If
csa_sequenceid is equal to the slot's sequence ID + 1 (accounting csa_sequenceid is equal to the slot's sequence ID + 1 (accounting
for wraparound), then the slot's sequence ID is set to for wraparound), then the slot's sequence ID is set to
csa_sequenceid, and the CREATE_SESSION processing goes to the csa_sequenceid, and the CREATE_SESSION processing goes to the
next phase. A subsequent new CREATE_SESSION call MUST use a next phase. A subsequent new CREATE_SESSION call over the same
csa_sequence that is one greater than last successfully used. client ID MUST use a csa_sequenceid that is one greater than the
sequence ID in the slot.
3. Client ID confirmation. If this would be the first session for 3. Client ID confirmation. If this would be the first session for
the client ID, the CREATE_SESSION operation serves to confirm the the client ID, the CREATE_SESSION operation serves to confirm the
client ID. Otherwise the client ID confirmation phase is skipped client ID. Otherwise the client ID confirmation phase is skipped
and only the session creation phase occurs. Any case in which and only the session creation phase occurs. Any case in which
there is more than one record with identical values for client ID there is more than one record with identical values for client ID
represents a server implementation error. Operation in the represents a server implementation error. Operation in the
potential valid cases is summarized as follows. potential valid cases is summarized as follows.
* Successful Confirmation * Successful Confirmation
skipping to change at page 531, line 14 skipping to change at page 531, line 14
CB_NOTIFY_DEVICEID can race with LAYOUTGET. One race scenario is CB_NOTIFY_DEVICEID can race with LAYOUTGET. One race scenario is
that LAYOUTGET returns a device ID the client does not have device that LAYOUTGET returns a device ID the client does not have device
address mappings for, and the metadata server sends a address mappings for, and the metadata server sends a
CB_NOTIFY_DEVICEID to add the device ID to the client's awareness and CB_NOTIFY_DEVICEID to add the device ID to the client's awareness and
meanwhile the client sends GETDEVICEINFO on the device ID. This meanwhile the client sends GETDEVICEINFO on the device ID. This
scenario is discussed in Section 18.40.4. Another scenario is that scenario is discussed in Section 18.40.4. Another scenario is that
the CB_NOTIFY_DEVICEID is processed by the client before it processes the CB_NOTIFY_DEVICEID is processed by the client before it processes
the results from LAYOUTGET. The client will send a GETDEVICEINFO on the results from LAYOUTGET. The client will send a GETDEVICEINFO on
the device ID. If the results from GETDEVICEINFO are received before the device ID. If the results from GETDEVICEINFO are received before
the client gets results from LAYTOUTGET, then there is no longer a the client gets results from LAYOUTGET, then there is no longer a
race. If the results from LAYOUTGET are received before the results race. If the results from LAYOUTGET are received before the results
from GETDEVICEINFO, the client can either wait for results of from GETDEVICEINFO, the client can either wait for results of
GETDEVICEINFO, or send another one to get possibly more up to date GETDEVICEINFO, or send another one to get possibly more up to date
device address mappings for the device ID. device address mappings for the device ID.
18.44. Operation 51: LAYOUTRETURN - Release Layout Information 18.44. Operation 51: LAYOUTRETURN - Release Layout Information
18.44.1. ARGUMENT 18.44.1. ARGUMENT
/* Constants used for LAYOUTRETURN and CB_LAYOUTRECALL */ /* Constants used for LAYOUTRETURN and CB_LAYOUTRECALL */
skipping to change at page 540, line 33 skipping to change at page 540, line 33
When set indicates that one or more locks have been revoked When set indicates that one or more locks have been revoked
without expiration of the lease period, due to administrative without expiration of the lease period, due to administrative
action. This status bit remains set on all SEQUENCE replies until action. This status bit remains set on all SEQUENCE replies until
the loss of all such locks has been acknowledged by use of the loss of all such locks has been acknowledged by use of
FREE_STATEID. FREE_STATEID.
SEQ4_STATUS_RECALLABLE_STATE_REVOKED SEQ4_STATUS_RECALLABLE_STATE_REVOKED
When set indicates that one or more recallable objects have been When set indicates that one or more recallable objects have been
revoked without expiration of the lease period, due to the revoked without expiration of the lease period, due to the
client's failure to return them when recalled, which may be a client's failure to return them when recalled, which may be a
consequence of there being no working backchanel and the client consequence of there being no working backchannel and the client
failing to reestablish a backchannel per the failing to reestablish a backchannel per the
SEQ4_STATUS_CB_PATH_DOWN, SEQ4_STATUS_CB_PATH_DOWN_SESSION, or SEQ4_STATUS_CB_PATH_DOWN, SEQ4_STATUS_CB_PATH_DOWN_SESSION, or
SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED status flags. This status bit SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED status flags. This status bit
remains set on all SEQUENCE replies until the loss of all such remains set on all SEQUENCE replies until the loss of all such
locks has been acknowledged by use of FREE_STATEID. locks has been acknowledged by use of FREE_STATEID.
SEQ4_STATUS_LEASE_MOVED SEQ4_STATUS_LEASE_MOVED
When set indicates that responsibility for lease renewal has been When set indicates that responsibility for lease renewal has been
transferred to one or more new servers. This condition will transferred to one or more new servers. This condition will
continue until the client receives an NFS4ERR_MOVED error and the continue until the client receives an NFS4ERR_MOVED error and the
skipping to change at page 549, line 28 skipping to change at page 549, line 28
WANT_DELEGATION operation to cancel a previously requested want for a WANT_DELEGATION operation to cancel a previously requested want for a
delegation. Note that if the server is in the process of sending the delegation. Note that if the server is in the process of sending the
delegation (via CB_PUSH_DELEG) at the time the client sends a delegation (via CB_PUSH_DELEG) at the time the client sends a
cancellation of the want, the delegation might still be pushed to the cancellation of the want, the delegation might still be pushed to the
client. client.
If WANT_DELEGATION fails to return a delegation, and the server If WANT_DELEGATION fails to return a delegation, and the server
returns NFS4_OK, the server MUST set the delegation type to returns NFS4_OK, the server MUST set the delegation type to
OPEN4_DELEGATE_NONE_EXT, and set od_whynone, as described in OPEN4_DELEGATE_NONE_EXT, and set od_whynone, as described in
Section 18.16. Write delegations are not available for file types Section 18.16. Write delegations are not available for file types
that are not writeable. This includes file objects of types: NF4BLK, that are not writable. This includes file objects of types: NF4BLK,
NF4CHR, NF4LNK, NF4SOCK, and NF4FIFO. If the client requests NF4CHR, NF4LNK, NF4SOCK, and NF4FIFO. If the client requests
OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG without OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG without
OPEN4_SHARE_ACCESS_WANT_READ_DELEG on an object with one of the OPEN4_SHARE_ACCESS_WANT_READ_DELEG on an object with one of the
aforementioned file types, the server must set aforementioned file types, the server must set
WND4_WRITE_DELEG_NOT_SUPP_FTYPE. WND4_WRITE_DELEG_NOT_SUPP_FTYPE.
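The file type rule above can be sketched as a single check. The function name why_no_write_delegation and the boolean parameters are hypothetical; they stand for the OPEN4_SHARE_ACCESS_WANT_WRITE_DELEG and OPEN4_SHARE_ACCESS_WANT_READ_DELEG bits of the request. A return value of None only means that this rule does not apply; other reasons for not granting a delegation are outside the sketch.

```python
from typing import Optional

# File types for which write delegations are not available, per the text above.
NON_WRITABLE_TYPES = frozenset({"NF4BLK", "NF4CHR", "NF4LNK", "NF4SOCK", "NF4FIFO"})


def why_no_write_delegation(file_type: str, want_write: bool, want_read: bool) -> Optional[str]:
    """Return the od_whynone value mandated by the file type rule, if any."""
    if want_write and not want_read and file_type in NON_WRITABLE_TYPES:
        return "WND4_WRITE_DELEG_NOT_SUPP_FTYPE"
    return None
```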
18.49.4. IMPLEMENTATION 18.49.4. IMPLEMENTATION
A request for a conflicting delegation is not normally intended to A request for a conflicting delegation is not normally intended to
trigger the recall of the existing delegation. Servers may choose to trigger the recall of the existing delegation. Servers may choose to
skipping to change at page 568, line 47 skipping to change at page 568, line 47
returning NFS4ERR_DELAY or permanently rejecting the offer of the returning NFS4ERR_DELAY or permanently rejecting the offer of the
delegation by returning NFS4ERR_REJECT_DELEG. When a delegation is delegation by returning NFS4ERR_REJECT_DELEG. When a delegation is
rejected in this fashion, the want previously established is rejected in this fashion, the want previously established is
permanently deleted and the delegation is subject to acquisition by permanently deleted and the delegation is subject to acquisition by
another client. another client.
20.5.4. IMPLEMENTATION 20.5.4. IMPLEMENTATION
If the client does return NFS4ERR_DELAY and there is a conflicting If the client does return NFS4ERR_DELAY and there is a conflicting
delegation request, the server MAY process it at the expense of the delegation request, the server MAY process it at the expense of the
client that returned NFS4ERR_DELAY. The client's want will typically client that returned NFS4ERR_DELAY. The client's want will not be
not be cancelled, but MAY be processed behind other delegation requests cancelled, but MAY be processed behind other delegation requests or
or registered wants. registered wants.
When a client returns a status other than NFS4_OK, NFSERR_DELAY, or When a client returns a status other than NFS4_OK, NFS4ERR_DELAY, or
NFS4ERR_REJECT_DELEG, the want remains pending, although servers may NFS4ERR_REJECT_DELEG, the want remains pending, although servers may
decide to cancel the want by sending a CB_WANTS_CANCELLED. decide to cancel the want by sending a CB_WANTS_CANCELLED.
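A server's bookkeeping for a want after the client replies to a pushed delegation might look like the sketch below. The function name want_state_after_reply and the state labels are hypothetical, and treating NFS4_OK as acceptance of the delegation is an assumption, since the text shown here only describes the error cases.

```python
def want_state_after_reply(status: str) -> str:
    """Map the client's reply status to the resulting state of its want."""
    if status == "NFS4_OK":
        # Assumption: the client accepted the pushed delegation.
        return "delegation_accepted"
    if status == "NFS4ERR_REJECT_DELEG":
        # Permanent rejection: the want is deleted and the delegation
        # may be acquired by another client.
        return "want_deleted"
    # NFS4ERR_DELAY and any other status leave the want pending. The server
    # may serve conflicting requests first, or cancel the want later with
    # CB_WANTS_CANCELLED.
    return "want_pending"
```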
20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable objects 20.6. Operation 8: CB_RECALL_ANY - Keep any N recallable objects
Notify client to return all but N recallable objects. Notify client to return all but N recallable objects.
20.6.1. ARGUMENT 20.6.1. ARGUMENT
const RCA4_TYPE_MASK_RDATA_DLG = 0; const RCA4_TYPE_MASK_RDATA_DLG = 0;
skipping to change at page 571, line 7 skipping to change at page 571, line 7
RCA4_TYPE_MASK_DIR_DLG RCA4_TYPE_MASK_DIR_DLG
The client is to return directory delegations. The client is to return directory delegations.
RCA4_TYPE_MASK_FILE_LAYOUT RCA4_TYPE_MASK_FILE_LAYOUT
The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. The client is to return layouts of type LAYOUT4_NFSV4_1_FILES.
RCA4_TYPE_MASK_BLK_LAYOUT RCA4_TYPE_MASK_BLK_LAYOUT
See [31] for a description. See [30] for a description.
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX
See [30] for a description. See [29] for a description.
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX
This range is reserved for telling the client to recall layouts of This range is reserved for telling the client to recall layouts of
experimental or site specific layout types (see Section 3.3.13). experimental or site specific layout types (see Section 3.3.13).
When a bit is set in the type mask that corresponds to an undefined When a bit is set in the type mask that corresponds to an undefined
type of recallable object, NFS4ERR_INVAL MUST be returned. When a type of recallable object, NFS4ERR_INVAL MUST be returned. When a
bit is set that corresponds to a defined type of object, but the bit is set that corresponds to a defined type of object, but the
client does not support an object of the type, NFS4ERR_INVAL MUST NOT client does not support an object of the type, NFS4ERR_INVAL MUST NOT
skipping to change at page 583, line 7 skipping to change at page 583, line 7
GETATTR for the fs_locations or fs_locations_info attributes, the GETATTR for the fs_locations or fs_locations_info attributes, the
attacker modifies the results to cause the client migrate its attacker modifies the results to cause the client migrate its
traffic to a server controlled by the attacker. traffic to a server controlled by the attacker.
Relative to previous NFS versions, NFSv4.1 has additional security Relative to previous NFS versions, NFSv4.1 has additional security
considerations for pNFS (see Section 12.9 and Section 13.12), locking considerations for pNFS (see Section 12.9 and Section 13.12), locking
and session state (see Section 2.10.7.3). and session state (see Section 2.10.7.3).
22. IANA Considerations 22. IANA Considerations
This section uses terms that are defined in [43].
22.1. Named Attribute Definitions 22.1. Named Attribute Definitions
IANA will create a registry called the "NFSv4 Named Attribute
Definitions Registry".
The NFSv4.1 protocol supports the association of a file with zero or The NFSv4.1 protocol supports the association of a file with zero or
more named attributes. The name space identifiers for these more named attributes. The name space identifiers for these
attributes are defined as string names. The protocol does not define attributes are defined as string names. The protocol does not define
the specific assignment of the name space for these file attributes. the specific assignment of the name space for these file attributes.
Even though the name space is not specifically controlled to prevent An IANA registry will promote interoperability where common interests
collisions, an IANA registry has been created for the registration of exist. While application developers are allowed to define and use
NFSv4.1 named attributes. Registration will be achieved through the attributes as needed, they are encouraged to register the attributes
publication of an Informational RFC and will require not only the with IANA.
name of the attribute but the syntax and semantics of the named
attribute contents; the intent is to promote interoperability where
common interests exist. While application developers are allowed to
define and use attributes as needed, they are encouraged to register
the attributes with IANA.
Such registered named attributes are presumed to apply to all minor Such registered named attributes are presumed to apply to all minor
versions of NFSv4, including those defined subsequently to the versions of NFSv4, including those defined subsequently to the
registration. Where the named attribute is intended to be limited registration. Where the named attribute is intended to be limited
with regard to the minor versions for which they are not to be used, the with regard to the minor versions for which they are not to be used, the
Informational RFC must clearly state the applicable limits. assignment in the registry will clearly state the applicable limits.
22.2. ONC RPC Network Identifiers (netids) All assignments to the registry are made on a First Come First Served
basis, per section 4.1 of [43]. The policy for each assignment is
Specification Required, per section 4.1 of [43].
Section 3.3.9) discussed the r_netid field and the corresponding Under the NFSv4.1 specification, the name of a named attribute can in
r_addr field within a netaddr4 structure. The NFSv4 protocol depends theory be up to 2^32 - 1 bytes in length, but in practice NFSv4.1
on the syntax and semantics of these fields to effectively clients and servers will be unable to handle a string that long.
communicate callback and other information between client and server. IANA should reject any assignment request with a named attribute that
Therefore, an IANA registry has been created to include the values exceeds 128 UTF-8 characters. To give IESG the flexibility to set up
defined in this document and to allow for future expansion based on bases of assignment of Experimental Use and Standards Action, the
transport usage/availability. Additions to this ONC RPC Network prefixes of "EXPE" and "STDS" are Reserved. The zero length named
Identifier registry must be done with the publication of an RFC. attribute name is Reserved.
The initial values for this registry are as follows (some of this The prefix "PRIV" is allocated for Private Use. A site that wants to
text is replicated from Section 3.3.9 for clarity): make use of unregistered named attributes without risk of conflicting
with an assignment in IANA's registry should use the prefix "PRIV" in
all of its named attributes.
The Network Identifier (or r_netid for short) is used to specify a Because some NFSv4.1 clients and servers have case insensitive
transport protocol and associated universal address (or r_addr for semantics, the fifteen additional lower case and mixed case
short). The syntax of the Network Identifier is a US-ASCII string. permutations of each of "EXPE", "PRIV", and "STDS", are Reserved
The initial definitions for r_netid are: (e.g. "expe", "expE", "exPe", etc. are Reserved). Similarly, IANA
must not allow two assignments that would conflict if both named
attributes were converted to a common case.
"tcp" - TCP over IP version 4 The registry of named attributes is a list of assignments, each
containing three fields for each assignment.
"udp" - UDP over IP version 4 1. A US-ASCII string name that is the actual name of the attribute.
This name must be unique. This string name can be 1 to 128 UTF-8
characters long.
"tcp6" - TCP over IP version 6 2. A reference to the specification of the named attribute. The
reference can consume up to 256 bytes (or more if IANA permits).
"udp6" - UDP over IP version 6 3. The point of contact of the registrant. The point of contact can
consume up to 256 bytes (or more if IANA permits).
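The naming rules above for the new registry can be sketched as a small check. This is only an illustration, not part of the draft: the function names and return labels are hypothetical, "UTF-8 characters" is assumed to mean Unicode code points, and "converted to a common case" is approximated with Python's str.casefold().

```python
# Hypothetical helpers illustrating the named attribute naming rules.
# Assumptions: length is counted in code points; case comparison uses
# Unicode case folding. Neither detail is stated by the draft itself.

RESERVED_PREFIXES = ("EXPE", "STDS")  # Reserved in every case permutation
PRIVATE_PREFIX = "PRIV"               # Private Use in every case permutation
MAX_NAME_CHARS = 128


def classify_named_attribute(name):
    """Return 'reserved', 'private', 'rejected', or 'assignable'."""
    if len(name) == 0:
        return "reserved"             # the zero length name is Reserved
    if len(name) > MAX_NAME_CHARS:
        return "rejected"             # IANA should reject names over 128 characters
    prefix = name[:4].upper()         # covers all lower and mixed case permutations
    if prefix in RESERVED_PREFIXES:
        return "reserved"
    if prefix == PRIVATE_PREFIX:
        return "private"
    return "assignable"


def conflicts(name, registered):
    """True if name equals an existing assignment after case conversion."""
    folded = name.casefold()
    return any(folded == existing.casefold() for existing in registered)
```

For example, classify_named_attribute("expE.foo") falls in the Reserved space, while a site-local name such as "priv.site.tag" falls in the Private Use space.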
Note: the '"' marks are used for delimiting the strings for this 22.1.1. Initial Registry
document and are not part of the Network Identifier string.
For the "tcp" and "udp" Network Identifiers the Universal Address or There is no initial registry.
r_addr (for IPv4) is a US-ASCII string and is of the form described
in Section 3.3.9.1.
For the "tcp" and "udp" Network Identifiers the Universal Address or 22.1.2. Updating Registrations
r_addr (for IPv6) is a US-ASCII string and is of the form described
in Section 3.3.9.2.
As mentioned, the registration of new Network Identifiers will The registrant is always permitted to update the point of contact
require the publication of an RFC with similar detail as listed above field. To make any other change will require Expert Review or IESG
for the Network Identifier itself and corresponding Universal Approval.
Address.
22.3. Defining New Notifications 22.2. Device ID Notifications
New notification types may be added to the CB_NOTIFY_DEVICEID IANA will create a registry called the "NFSv4.1 Device ID
operation Section 20.12. This can be done via changes to the Notifications Registry".
operations that register notifications, or by adding new operations
to NFSv4. This requires a new minor version of NFSv4, and requires a
standards track document from IETF. Another way to add a
notification is to specify a new layout type. Notifications for new
layout types would be requested via GETDEVICELIST (Section 18.41) and
GETDEVICEINFO (Section 18.40). See Section 22.4).
22.4. Defining New Layout Types The potential exists for new notification types to be added to the
CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done via
changes to the operations that register notifications, or by adding
new operations to NFSv4. This requires a new minor version of NFSv4,
and requires a standards track document from IETF. Another way to
add a notification is to specify a new layout type (see
Section 22.4).
New layout type numbers will be requested from IANA. IANA will only Hence all assignments to the registry are made on a Standards Action
provide layout type numbers for Standards Track RFCs approved by the basis per section 4.1 of [43], with Expert Review required.
IESG, in accordance with Standards Action policy defined in [20].
All layout types assigned by IANA MUST be in the range 0x00000001 to The registry is a list of assignments, each containing five fields
0x7FFFFFFF. per assignment.
1. The name of the notification type. This name must have the
prefix: "NOTIFY_DEVICEID4_". This name must be unique.
2. The value of the notification. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an
actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The
value assigned must be unique. A Designated Expert must be used
to ensure that when the name of the notification type and its
value are added to the NFSv4.1 notify_deviceid_type4 enumerated
data type in the NFSv4.1 XDR description ([12]), the result
continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the notification. If
the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the notification. This is indicated by a
single US-ASCII value. If the value is N, it means a minor
revision to the NFSv4 protocol. If the value is L, it means a
new pNFS layout type. Other values can be used with IESG
Approval.
5. The minor versions of NFSv4 that are allowed to use the
notification. While these are numeric values, IANA will not
allocate and assign them; the author of the relevant RFCs with
IESG Approval assigns these numbers. Each time there is a new
minor version of NFSv4 approved, a Designated Expert should
review the registry to make recommended updates as needed.
22.2.1. Initial Registry
The initial registry is in Table 15. Note that the next available value
is zero.
+-------------------------+-------+----------+-----+----------------+
| Notification Name | Value | RFC | How | Minor Versions |
+-------------------------+-------+----------+-----+----------------+
| NOTIFY_DEVICEID4_CHANGE | 1 | RFCTBD10 | N | 1 |
| NOTIFY_DEVICEID4_DELETE | 2 | RFCTBD10 | N | 1 |
+-------------------------+-------+----------+-----+----------------+
Table 15: Initial Device ID Notification Assignments
22.2.2. Updating Registrations
The update of a registration will require IESG Approval on the advice
of a Designated Expert.
22.3. Object Recall Types
IANA will create a registry called the "NFSv4.1 Recallable Object
Types Registry".
The potential exists for new object types to be added to the
CB_RECALL_ANY operation (see Section 20.6). This can be done via
changes to the operations that add recallable types, or by adding new
operations to NFSv4. This requires a new minor version of NFSv4, and
requires a standards track document from IETF. Another way to add a
new recallable object is to specify a new layout type (see
Section 22.4).
All assignments to the registry are made on a Standards Action basis
per section 4.1 of [43], with Expert Review required.
Recallable object types are 32 bit unsigned numbers. There are no
Reserved values. Values in the range 12 through 15, inclusive, are
for Private Use.
The registry is a list of assignments, each containing five fields
per assignment.
1. The name of the recallable object type. This name must have the
prefix: "RCA4_TYPE_MASK_". The name must be unique.
2. The value of the recallable object type. IANA will assign this
number, and the request from the registrant will use TBD1 instead
of an actual value. IANA MUST use a whole number which can be no
higher than 2^32-1, and should be the next available value. The
value must be unique. A Designated Expert must be used to ensure
that when the name of the recallable type and its value are added
to the NFSv4 XDR description [12], the result continues to be a
valid XDR description.
3. The Standards Track RFC(s) that describe the recallable object
type. If the RFC(s) have not yet been published, the registrant
will use RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
4. How the RFC introduces the recallable object type. This is
indicated by a single US-ASCII value. If the value is N, it
means a minor revision to the NFSv4 protocol. If the value is L,
it means a new pNFS layout type. Other values can be used with
IESG Approval.
5. The minor versions of NFSv4 that are allowed to use the
recallable object type. While these are numeric values, IANA
will not allocate and assign them; the author of the relevant
RFCs with IESG Approval assigns these numbers. Each time there
is a new minor version of NFSv4 approved, a Designated Expert
should review the registry to make recommended updates as needed.
22.3.1. Initial Registry
The initial registry is in Table 16. Note that the next available value
is five.
+-------------------------------+-------+----------+-----+----------+
| Recallable Object Type Name | Value | RFC | How | Minor |
| | | | | Versions |
+-------------------------------+-------+----------+-----+----------+
| RCA4_TYPE_MASK_RDATA_DLG | 0 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_WDATA_DLG | 1 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_DIR_DLG | 2 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFCTBD10 | N | 1 |
| RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFCTBD20 | L | 1 |
| RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFCTBD30 | L | 1 |
| RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFCTBD30 | L | 1 |
+-------------------------------+-------+----------+-----+----------+
Table 16: Initial Recallable Object Type Assignments
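As an illustration of how the values in Table 16 might be used, the sketch below treats each value as a bit position in the CB_RECALL_ANY type mask. It assumes the mask is carried as a list of 32-bit words with bit n stored in word n // 32; that layout, and the helper names, are assumptions made for this example rather than text from this section.

```python
# Values copied from Table 16; each is assumed to be a bit position.
RCA4_TYPES = {
    "RCA4_TYPE_MASK_RDATA_DLG": 0,
    "RCA4_TYPE_MASK_WDATA_DLG": 1,
    "RCA4_TYPE_MASK_DIR_DLG": 2,
    "RCA4_TYPE_MASK_FILE_LAYOUT": 3,
    "RCA4_TYPE_MASK_BLK_LAYOUT": 4,
    "RCA4_TYPE_MASK_OBJ_LAYOUT_MIN": 8,
    "RCA4_TYPE_MASK_OBJ_LAYOUT_MAX": 9,
}


def build_type_mask(names):
    """Build a list of 32-bit words with one bit set per named type."""
    words = []
    for name in names:
        word, offset = divmod(RCA4_TYPES[name], 32)
        while len(words) <= word:     # grow the list to reach the needed word
            words.append(0)
        words[word] |= 1 << offset
    return words


def is_private_use(value):
    """Values 12 through 15, inclusive, are for Private Use."""
    return 12 <= value <= 15
```

Under these assumptions, requesting the return of read delegations and file layouts sets bits 0 and 3, so the first word of the mask is 0x9.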
22.3.2. Updating Registrations
The update of a registration will require IESG Approval on the advice
of a Designated Expert.
22.4. Layout Types
IANA will create a registry called the "pNFS Layout Types Registry".
All assignments to the registry are made on a Standards Action basis,
with Expert Review required.
Layout types are 32 bit numbers. The value zero is Reserved. Values
in the range 0x80000000 to 0xFFFFFFFF inclusive are for Private Use.
IANA will assign numbers from the range 0x00000001 to 0x7FFFFFFF
inclusive.
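The three ranges described above can be expressed as a short classification routine. This is a minimal sketch; the function name and the returned labels are hypothetical and exist only for illustration.

```python
def classify_layout_type(value):
    """Map a 32 bit layout type number to the range it falls in."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layout types are 32 bit numbers")
    if value == 0:
        return "reserved"             # the value zero is Reserved
    if value <= 0x7FFFFFFF:
        return "iana-assigned"        # 0x00000001 to 0x7FFFFFFF inclusive
    return "private-use"              # 0x80000000 to 0xFFFFFFFF inclusive
```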
The registry is a list of assignments, each containing five fields.
1. The name of the layout type. This name must have the prefix:
"LAYOUT4_". The name must be unique.
2. The value of the layout type. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an
actual value. The value assigned must be unique. A Designated
Expert must be used to ensure that when the name of the layout
type and its value are added to the NFSv4.1 layouttype4
enumerated data type in the NFSv4.1 XDR description ([12]), the
result continues to be a valid XDR description.
3. The Standards Track RFC(s) that describe the layout type. If
the RFC(s) have not yet been published, the registrant will use
RFCTBD2, RFCTBD3, etc. instead of an actual RFC number.
Collectively, the RFC(s) must adhere to the guidelines listed in
Section 22.4.3.
4. How the RFC introduces the layout type. This is indicated by a
single US-ASCII value. If the value is N, it means a minor
revision to the NFSv4 protocol. If the value is L, it means a
new pNFS layout type. Other values can be used with IESG
Approval.
5. The minor versions of NFSv4 that are allowed to use the
layout type. While these are numeric values, IANA will not
allocate and assign them; the author of the relevant RFCs with
IESG Approval assigns these numbers. Each time there is a new
minor version of NFSv4 approved, a Designated Expert should
review the registry to make recommended updates as needed.
22.4.1. Initial Registry
The initial registry is in Table 17.
+-----------------------+-------+----------+-----+----------------+
| Layout Type Name | Value | RFC | How | Minor Versions |
+-----------------------+-------+----------+-----+----------------+
| LAYOUT4_NFSV4_1_FILES | 0x1 | RFCTBD10 | N | 1 |
| LAYOUT4_OSD2_OBJECTS | 0x2 | RFCTBD30 | L | 1 |
| LAYOUT4_BLOCK_VOLUME | 0x3 | RFCTBD20 | L | 1 |
+-----------------------+-------+----------+-----+----------------+
Table 17: Initial Layout Type Assignments
22.4.2. Updating Registrations
The update of a registration will require IESG Approval on the advice
of a Designated Expert.
22.4.3. Guidelines for Writing Layout Type Specifications
The author of a new pNFS layout specification must follow these steps The author of a new pNFS layout specification must follow these steps
to obtain acceptance of the layout type as a standard: to obtain acceptance of the layout type as a Standards Track RFC:
1. The author devises the new layout specification. 1. The author devises the new layout specification.
2. The new layout type specification MUST, at a minimum: 2. The new layout type specification MUST, at a minimum:
* Define the contents of the layout-type-specific fields of the * Define the contents of the layout-type-specific fields of the
following data types: following data types:
+ the da_addr_body field of the device_addr4 data type; + the da_addr_body field of the device_addr4 data type;
+ the loh_body field of the layouthint4 data type; + the loh_body field of the layouthint4 data type;
+ the loc_body field of layout_content4 data type (which in + the loc_body field of layout_content4 data type (which in
turn is the lo_content field of the layout4 data type); turn is the lo_content field of the layout4 data type);
+ the lou_body field of the layoutupdate4 data type; + the lou_body field of the layoutupdate4 data type;
* Describe or define the storage access protocol used to access * Describe or define the storage access protocol used to access
the data servers the storage devices.
* Describe whether revocation of layouts is supported. * Describe whether revocation of layouts is supported.
* At a minimum, describe the methods of recovery from: * At a minimum, describe the methods of recovery from:
1. Failure and restart for client, server, storage device. 1. Failure and restart for client, server, storage device.
2. Lease expiration from perspective of the active client, 2. Lease expiration from perspective of the active client,
server, storage device. server, storage device.
3. Loss of layout state resulting in fencing of client access 3. Loss of layout state resulting in fencing of client access
to storage devices (for an example, see Section 12.7.3). to storage devices (for an example, see Section 12.7.3).
* A list of any new notification values for CB_NOTIFY_DEVICEID. * Include an IANA considerations section, which will in turn include:
* A list of any new recallable object types for CB_RECALL_ANY. + A request to IANA for a new layout type per Section 22.4.
* Include an IANA considerations section. + A list of requests to IANA for any new recallable object
types for CB_RECALL_ANY; each entry is to be presented in the
form described in Section 22.3.
+ A list of requests to IANA for any new notification values
for CB_NOTIFY_DEVICEID; each entry is to be presented in the
form described in Section 22.2.
* Include a security considerations section. * Include a security considerations section.
3. The author documents the new layout specification as an Internet 3. The author documents the new layout specification as an Internet
Draft. Draft.
4. The author submits the Internet Draft for review through the IETF 4. The author submits the Internet Draft for review through the IETF
standards process as defined in "Internet Official Protocol standards process as defined in "Internet Official Protocol
Standards" (STD 1). The new layout specification will be Standards" (STD 1). The new layout specification will be
submitted for eventual publication as a standards track RFC. submitted for eventual publication as a standards track RFC.
skipping to change at page 586, line 12 skipping to change at page 590, line 18
process; the new option will be reviewed by the NFSv4 Working process; the new option will be reviewed by the NFSv4 Working
Group (if that group still exists), or as an Internet Draft not Group (if that group still exists), or as an Internet Draft not
submitted by an IETF working group. submitted by an IETF working group.
22.5. Path Variable Definitions 22.5. Path Variable Definitions
This section deals with the IANA considerations associated with the This section deals with the IANA considerations associated with the
variable substitution feature for location names as described in variable substitution feature for location names as described in
Section 11.10.3. As described there, variables subject to Section 11.10.3. As described there, variables subject to
substitution consist of a domain name and a specific name within that substitution consist of a domain name and a specific name within that
domain, with two separated by a colon. domain, with two separated by a colon. There are two sets of IANA
considerations here:
22.5.1. Path Variable Values 1. The list of variable names.
For names with the domain "ietf.org" only three specific names are 2. For each variable name, the list of possible values.
currently defined and additional names will only be created via
standards-track RFC's.
For the variable names ${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}, Thus, there will be one registry for the list of variable names, and
IANA will have to create a registry of values to be used for that possibly one registry for listing the values of each variable name.
variable. Applications for such values must contain the variable
name, the proposed value of that variable, and a brief (one or two
paragraphs) explanation of what is indicated by that specific value.
Such requests should be reviewed by nfsv4@ietf.org and a Designated
Expert.
For the name ${ietf.org:OS_VERSION}, no such registry need be created 22.5.1. Path Variables Registry
as the specifics of the values will vary with the value of
${ietf.org:OS_TYPE}.
22.5.2. Path Variable Names IANA will create a registry called the "NFSv4 Path Variables
Registry".
IANA needs to set up a registry to help make generally available 22.5.1.1. Path Variable Values
information about variables of the form ${domain:var}, where domain
is something other than "ietf.org".
Applications for the addition of variables to this registry should Variable names are of the form "${", followed by a domain name,
contain the name of the variable and a brief (one or a few followed by a colon (":"), followed by a domain-specific portion of
paragraphs) explanation of the purpose of the variable. No review of the variable name, followed by "}". When the domain name is
these applications by IANA is necessary. "ietf.org" all variable names must be registered with IANA on a
Standards Action basis, with Expert Review required. Path variables
with registered domain names neither part of nor equal to ietf.org
are assigned on a Hierarchical Allocation basis (delegating to the
domain owner) and thus of no concern to IANA, unless the domain owner
chooses to register a variable name from his domain. If the domain
owner chooses to do so, IANA will do so on a First Come First Served
basis. To accommodate registrants who do not have their own domain,
IANA will accept requests to register variables with the prefix
"${FCFS.ietf.org:" on a First Come First Served basis. Assignments
on a First Come First Served basis do not require Expert Review, unless the
registrant also wants IANA to establish a registry for the values of
the registered variable.
The registry is a list of assignments, each containing three fields.
1. The name of the variable. The name of this variable must start
with a "${" followed by a registered domain name, followed by
":", or it must start with "${FCFS.ietf.org". The name must be
no more than 64 UTF-8 characters long. The name must be unique.
2. For assignments made on Standards Action basis, the Standards
Track RFC(s) that describe the variable. If the RFC(s) have not
yet been published, the registrant will use RFCTBD1, RFCTBD2,
etc. instead of an actual RFC number. Note that the RFCs do not
have to be a part of a NFS minor version. For assignments made
on a First Come First Served basis, an explanation (consuming no
more than 1024 bytes, or more if IANA permits) of the purpose of
the variable. A reference to the explanation can be substituted.
3. The point of contact, including an email address. The point of
contact can consume up to 256 bytes (or more if IANA permits).
For assignments made on a Standards Action basis, the point of
contact is always IESG.
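The variable name syntax and the assignment policies described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the domain portion is not checked against DNS syntax, the 64 character limit is counted in code points, and comparing domains case-insensitively is an assumption. The text does not say how other subdomains of ietf.org are handled, so the sketch reports those as "unspecified".

```python
import re

# "${" domain ":" domain-specific-name "}"
_PATH_VARIABLE = re.compile(r"\$\{([^:{}]+):([^{}]+)\}")
MAX_VARIABLE_CHARS = 64


def parse_path_variable(text):
    """Split a variable such as ${ietf.org:CPU_ARCH} into (domain, name)."""
    if len(text) > MAX_VARIABLE_CHARS:
        raise ValueError("variable name exceeds 64 characters")
    match = _PATH_VARIABLE.fullmatch(text)
    if match is None:
        raise ValueError("not of the form ${domain:name}")
    return match.group(1), match.group(2)


def assignment_policy(domain):
    """Return the assignment basis the text describes for a domain."""
    lowered = domain.lower()          # assumption: domains compare case-insensitively
    if lowered == "ietf.org":
        return "Standards Action"
    if lowered == "fcfs.ietf.org":
        return "First Come First Served"
    if lowered.endswith(".ietf.org"):
        return "unspecified"          # not addressed by the text above
    return "Hierarchical Allocation"
```

For instance, parse_path_variable("${ietf.org:OS_TYPE}") yields the pair ("ietf.org", "OS_TYPE"), and assignment_policy("ietf.org") reports Standards Action.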
22.5.1.1.1. Initial Registry
The initial registry is in Table 18.
+------------------------+----------+------------------+
| Variable Name | RFC | Point of Contact |
+------------------------+----------+------------------+
| ${ietf.org:CPU_ARCH} | RFCTBD10 | IESG |
| ${ietf.org:OS_TYPE} | RFCTBD10 | IESG |
| ${ietf.org:OS_VERSION} | RFCTBD10 | IESG |
+------------------------+----------+------------------+
Table 18: Initial List of Path Variables
IANA will need to create registries for the values of the variable
names ${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}. See
Section 22.5.2 and Section 22.5.3.
For the values of the variable ${ietf.org:OS_VERSION}, no registry is
needed as the specifics of the values of the variable will vary with
the value of ${ietf.org:OS_TYPE}. Thus values for ${ietf.org:
OS_VERSION} are on a Hierarchical Allocation basis and are of no
concern to IANA.
22.5.1.1.2. Updating Registrations
The update of an assignment made on a Standards Action basis will
require IESG Approval on the advice of a Designated Expert.
The registrant can always update the point of contact of an
assignment made on a First Come First Served basis. Any other update
will require Expert Review.
22.5.2. Values for the ${ietf.org:CPU_ARCH} Variable
IANA will create a registry called the "NFSv4 ${ietf.org:CPU_ARCH}
Value Registry".
Assignments to the registry are made on a First Come First Served
basis. The zero length value of ${ietf.org:CPU_ARCH} is Reserved.
Values with a prefix of "PRIV" are Reserved for Private Use.
The registry is a list of assignments, each containing three fields.
1. A value of the ${ietf.org:CPU_ARCH} variable. The value must be
1 to 32 UTF-8 characters long. The value must be unique.
2. An explanation (consuming no more than 1024 bytes, or more if
IANA permits) of what CPU architecture the value denotes. A
reference to the explanation can be substituted.
3. The point of contact, including an email address. The point of
contact can consume up to 256 bytes (or more if IANA permits).
22.5.2.1. Initial Registry
There is no initial registry.
22.5.2.2. Updating Registrations
The registrant is free to update the assignment, i.e. change the
explanation and/or point of contact fields.
22.5.3. Values for the ${ietf.org:OS_TYPE} Variable
IANA will create a registry called the "NFSv4 ${ietf.org:OS_TYPE}
Value Registry".
Assignments to the registry are made on a First Come First Served
basis. The zero length value of ${ietf.org:OS_TYPE} is Reserved.
Values with a prefix of "PRIV" are Reserved for Private Use.
The registry is a list of assignments, each containing three fields.
1. A value of the ${ietf.org:OS_TYPE} variable. The value must be 1
to 32 UTF-8 characters long. The value must be unique.
2. An explanation (consuming no more than 1024 bytes, or more if
IANA permits) of what operating system type the value denotes. A
reference to the explanation can be substituted.
3. The point of contact, including an email address. The point of
contact can consume up to 256 bytes (or more if IANA permits).
22.5.3.1. Initial Registry
There is no initial registry.
22.5.3.2. Updating Registrations
The registrant is free to update the assignment, i.e. change the
explanation and/or point of contact fields.
23. References 23. References
23.1. Normative References 23.1. Normative References
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", March 1997. Levels", March 1997.
[2] Eisler, M., "XDR: External Data Representation Standard", [2] Eisler, M., "XDR: External Data Representation Standard",
STD 67, RFC 4506, May 2006. STD 67, RFC 4506, May 2006.
skipping to change at page 587, line 38 skipping to change at page 594, line 21
April 2008. April 2008.
[10] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia, [10] Recio, P., Metzler, B., Culley, P., Hilland, J., and D. Garcia,
"A Remote Direct Memory Access Protocol Specification", "A Remote Direct Memory Access Protocol Specification",
RFC 5040, October 2007. RFC 5040, October 2007.
[11] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing [11] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing
for Message Authentication", RFC 2104, February 1997. for Message Authentication", RFC 2104, February 1997.
[12] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1 [12] Shepler, S., Eisler, M., and D. Noveck, "NFSv4 Minor Version 1
XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-07 (work XDR Description", draft-ietf-nfsv4-minorversion1-dot-x-08 (work
in progress), May 2008. in progress), Aug 2008.
[13] Hinden, R. and S. Deering, "IP Version 6 Addressing [13] Eisler, M., "IANA Considerations for RPC Net Identifiers and
Architecture", RFC 4291, February 2006. Universal Address Formats", draft-ietf-nfsv4-rpc-netid-03 (work
in progress), Aug 2008.
[14] International Organization for Standardization, "Information [14] International Organization for Standardization, "Information
Technology - Universal Multiple-octet coded Character Set (UCS) Technology - Universal Multiple-octet coded Character Set (UCS)
- Part 1: Architecture and Basic Multilingual Plane", - Part 1: Architecture and Basic Multilingual Plane",
ISO Standard 10646-1, May 1993. ISO Standard 10646-1, May 1993.
[15] Alvestrand, H., "IETF Policy on Character Sets and Languages", [15] Alvestrand, H., "IETF Policy on Character Sets and Languages",
BCP 18, RFC 2277, January 1998. BCP 18, RFC 2277, January 1998.
[16] Hoffman, P. and M. Blanchet, "Preparation of Internationalized [16] Hoffman, P. and M. Blanchet, "Preparation of Internationalized
skipping to change at page 588, line 20 skipping to change at page 595, line 5
March 2003. March 2003.
[18] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms [18] Schaad, J., Kaliski, B., and R. Housley, "Additional Algorithms
and Identifiers for RSA Cryptography for use in the Internet and Identifiers for RSA Cryptography for use in the Internet
X.509 Public Key Infrastructure Certificate and Certificate X.509 Public Key Infrastructure Certificate and Certificate
Revocation List (CRL) Profile", RFC 4055, June 2005. Revocation List (CRL) Profile", RFC 4055, June 2005.
[19] National Institute of Standards and Technology, "Cryptographic [19] National Institute of Standards and Technology, "Cryptographic
Algorithm Object Registration", December 2005. Algorithm Object Registration", December 2005.
[20] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
23.2. Informative References 23.2. Informative References
[21] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, [20] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) C., Eisler, M., and D. Noveck, "Network File System (NFS)
version 4 Protocol", RFC 3530, April 2003. version 4 Protocol", RFC 3530, April 2003.
[22] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 [21] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3
Protocol Specification", RFC 1813, June 1995. Protocol Specification", RFC 1813, June 1995.
[23] Eisler, M., "NFS Version 2 and Version 3 Security Issues and [22] Eisler, M., "NFS Version 2 and Version 3 Security Issues and
the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5",
RFC 2623, June 1999. RFC 2623, June 1999.
[24] Juszczak, C., "Improving the Performance and Correctness of an [23] Juszczak, C., "Improving the Performance and Correctness of an
NFS Server", USENIX Conference Proceedings, June 1990. NFS Server", USENIX Conference Proceedings, June 1990.
[25] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On- [24] Reynolds, J., "Assigned Numbers: RFC 1700 is Replaced by an On-
line Database", RFC 3232, January 2002. line Database", RFC 3232, January 2002.
[26] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", [25] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
RFC 1833, August 1995. RFC 1833, August 1995.
[27] Werme, R., "RPC XID Issues", USENIX Conference Proceedings, [26] Werme, R., "RPC XID Issues", USENIX Conference Proceedings,
February 1996. February 1996.
[28] Nowicki, B., "NFS: Network File System Protocol specification", [27] Nowicki, B., "NFS: Network File System Protocol specification",
RFC 1094, March 1989. RFC 1094, March 1989.
[29] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available [28] Bhide, A., Elnozahy, E., and S. Morgan, "A Highly Available
Network Server", USENIX Conference Proceedings, January 1991. Network Server", USENIX Conference Proceedings, January 1991.
[30] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS [29] Halevy, B., Welch, B., and J. Zelenka, "Object-based pNFS
Operations", draft-ietf-nfsv4-pnfs-obj-09 (work in progress), Operations", draft-ietf-nfsv4-pnfs-obj-09 (work in progress),
June 2008. June 2008.
[31] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume [30] Black, D., Fridella, S., and J. Glasgow, "pNFS Block/Volume
Layout", draft-ietf-nfsv4-pnfs-block-09 (work in progress), Layout", draft-ietf-nfsv4-pnfs-block-09 (work in progress),
June 2008. June 2008.
[32] Callaghan, B., "WebNFS Client Specification", RFC 2054, [31] Callaghan, B., "WebNFS Client Specification", RFC 2054,
October 1996. October 1996.
[33] Callaghan, B., "WebNFS Server Specification", RFC 2055, [32] Callaghan, B., "WebNFS Server Specification", RFC 2055,
October 1996. October 1996.
[34] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624, [33] Shepler, S., "NFS Version 4 Design Considerations", RFC 2624,
June 1999. June 1999.
[35] Simonsen, K., "Character Mnemonics and Character Sets", [34] Simonsen, K., "Character Mnemonics and Character Sets",
RFC 1345, June 1992. RFC 1345, June 1992.
[36] The Open Group, "Protocols for Interworking: XNFS, Version 3W, [35] The Open Group, "Protocols for Interworking: XNFS, Version 3W,
ISBN 1-85912-184-5", February 1998. ISBN 1-85912-184-5", February 1998.
[37] Floyd, S. and V. Jacobson, "The Synchronization of Periodic [36] Floyd, S. and V. Jacobson, "The Synchronization of Periodic
Routing Messages", IEEE/ACM Transactions on Networking 2(2), Routing Messages", IEEE/ACM Transactions on Networking 2(2),
pp. 122-136, April 1994. pp. 122-136, April 1994.
[38] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E. [37] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E.
Zeidner, "Internet Small Computer Systems Interface (iSCSI)", Zeidner, "Internet Small Computer Systems Interface (iSCSI)",
RFC 3720, April 2004. RFC 3720, April 2004.
[39] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version [38] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version
(FCP-2)", ANSI/INCITS 350-2003, Oct 2003. (FCP-2)", ANSI/INCITS 350-2003, Oct 2003.
[40] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/ [39] Weber, R., "Object-Based Storage Device Commands (OSD)", ANSI/
INCITS 400-2004, July 2004, INCITS 400-2004, July 2004,
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>.
[41] The Open Group, "The Open Group Base Specifications Issue 6, [40] The Open Group, "The Open Group Base Specifications Issue 6,
IEEE Std 1003.1, 2004 Edition", 2004. IEEE Std 1003.1, 2004 Edition", 2004.
[42] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997. [41] Callaghan, B., "NFS URL Scheme", RFC 2224, October 1997.
[43] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation [42] Chiu, A., Eisler, M., and B. Callaghan, "Security Negotiation
for WebNFS", RFC 2755, January 2000. for WebNFS", RFC 2755, January 2000.
[43] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
Appendix A. Acknowledgments Appendix A. Acknowledgments
The initial drafts for the SECINFO extensions were edited by Mike The initial drafts for the SECINFO extensions were edited by Mike
Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl
Burnett. Burnett.
The initial drafts for the SESSIONS extensions were edited by Tom The initial drafts for the SESSIONS extensions were edited by Tom
Talpey, Spencer Shepler, and Jon Bauman, with contributions from Charles Talpey, Spencer Shepler, and Jon Bauman, with contributions from Charles
Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak, Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak,
Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk and Mark Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk and Mark
skipping to change at page 591, line 39 skipping to change at page 598, line 24
o NFSv4.1 locking and directory delegations, with the following o NFSv4.1 locking and directory delegations, with the following
inspectors: Mike Eisler, Pranoop Erasani, Robert Gordon, Saadia inspectors: Mike Eisler, Pranoop Erasani, Robert Gordon, Saadia
Khan, Eric Kustarz, Dave Noveck, Spencer Shepler, and Amy Weaver. Khan, Eric Kustarz, Dave Noveck, Spencer Shepler, and Amy Weaver.
o EXCHANGE_ID and DESTROY_CLIENTID, with the following inspectors: o EXCHANGE_ID and DESTROY_CLIENTID, with the following inspectors:
Mike Eisler, Pranoop Erasani, Robert Gordon, Benny Halevy, Fred Mike Eisler, Pranoop Erasani, Robert Gordon, Benny Halevy, Fred
Isaman, Saadia Khan, Rick Macklem, Spencer Shepler, and Brent Isaman, Saadia Khan, Rick Macklem, Spencer Shepler, and Brent
Welch. Welch.
o Final pNFS inspection, with the following inspectors: Andy o Final pNFS inspection, with the following inspectors: Andy
Adamson, Mike Eisler, Sam Falkner, Mark Eshel, Jason Glasgow, Adamson, Mike Eisler, Mark Eshel, Sam Falkner, Jason Glasgow,
Garth Goodson, Robert Gordon, Benny Halevy, Dean Hildebrand, Rahul Garth Goodson, Robert Gordon, Benny Halevy, Dean Hildebrand, Rahul
Iyer, Suchit Kaura, Trond Myklebust, Anatoly Pinchuk, Spencer Iyer, Suchit Kaura, Trond Myklebust, Anatoly Pinchuk, Spencer
Shepler, Renu Tewari, Lisa Week, and Brent Welch. Shepler, Renu Tewari, Lisa Week, and Brent Welch.
A review team worked together to generate the tables of assignments A review team worked together to generate the tables of assignments
of error sets to operations and make sure that each such assignment of error sets to operations and make sure that each such assignment
had two or more people validating it. Participating in the process had two or more people validating it. Participating in the process
were: Andy Adamson, Mike Eisler, Sam Falkner, Garth Goodson, Robert were: Andy Adamson, Mike Eisler, Sam Falkner, Garth Goodson, Robert
Gordon, Trond Myklebust, Dave Noveck, Spencer Shepler, Tom Talpey, Amy Gordon, Trond Myklebust, Dave Noveck, Spencer Shepler, Tom Talpey, Amy
Weaver, and Lisa Week. Weaver, and Lisa Week.
Lars Eggert provided valuable review and guidance.
Others who provided comments include: Jason Goldschmidt and Mahesh Others who provided comments include: Jason Goldschmidt and Mahesh
Siddheshwar. Siddheshwar.
Appendix B. RFC Editor Notes
[RFC Editor: please remove this section prior to publishing this
document as an RFC]
[RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
RFC number of this document]
[RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD20 with RFCyyyy where yyyy is the
RFC number of the document referenced in [30]]
[RFC Editor: prior to publishing this document as an RFC, please
replace all occurrences of RFCTBD30 with RFCzzzz where zzzz is the
RFC number of the document referenced in [29]]
Authors' Addresses Authors' Addresses
Spencer Shepler Spencer Shepler
Storspeed, Inc. Storspeed, Inc.
7808 Moonflower Drive 7808 Moonflower Drive
Austin, TX 78750 Austin, TX 78750
USA USA
Phone: +1-512-402-5811 ext 8530 Phone: +1-512-402-5811 ext 8530
Email: shepler@storspeed.com Email: shepler@storspeed.com
 End of changes. 154 change blocks. 
322 lines changed or deleted 659 lines changed or added
