| draft-ietf-nfsv4-minorversion1-13.txt | | draft-ietf-nfsv4-minorversion1-14.txt | |
| | | | |
| NFSv4 S. Shepler | | NFSv4 S. Shepler | |
| Internet-Draft M. Eisler | | Internet-Draft M. Eisler | |
| Intended status: Standards Track D. Noveck | | Intended status: Standards Track D. Noveck | |
|
| Expires: January 2, 2008 Editors | | Expires: March 28, 2008 Editors | |
| | | September 25, 2007 | |
| | | | |
| NFSv4 Minor Version 1 | | NFSv4 Minor Version 1 | |
|
| draft-ietf-nfsv4-minorversion1-13.txt | | draft-ietf-nfsv4-minorversion1-14.txt | |
| | | | |
| Status of this Memo | | Status of this Memo | |
| | | | |
| By submitting this Internet-Draft, each author represents that any | | By submitting this Internet-Draft, each author represents that any | |
| applicable patent or other IPR claims of which he or she is aware | | applicable patent or other IPR claims of which he or she is aware | |
| have been or will be disclosed, and any of which he or she becomes | | have been or will be disclosed, and any of which he or she becomes | |
| aware will be disclosed, in accordance with Section 6 of BCP 79. | | aware will be disclosed, in accordance with Section 6 of BCP 79. | |
| | | | |
| Internet-Drafts are working documents of the Internet Engineering | | Internet-Drafts are working documents of the Internet Engineering | |
| Task Force (IETF), its areas, and its working groups. Note that | | Task Force (IETF), its areas, and its working groups. Note that | |
| | | | |
| skipping to change at page 1, line 33 | | skipping to change at page 1, line 35 | |
| and may be updated, replaced, or obsoleted by other documents at any | | and may be updated, replaced, or obsoleted by other documents at any | |
| time. It is inappropriate to use Internet-Drafts as reference | | time. It is inappropriate to use Internet-Drafts as reference | |
| material or to cite them other than as "work in progress." | | material or to cite them other than as "work in progress." | |
| | | | |
| The list of current Internet-Drafts can be accessed at | | The list of current Internet-Drafts can be accessed at | |
| http://www.ietf.org/ietf/1id-abstracts.txt. | | http://www.ietf.org/ietf/1id-abstracts.txt. | |
| | | | |
| The list of Internet-Draft Shadow Directories can be accessed at | | The list of Internet-Draft Shadow Directories can be accessed at | |
| http://www.ietf.org/shadow.html. | | http://www.ietf.org/shadow.html. | |
| | | | |
|
| This Internet-Draft will expire on January 2, 2008. | | This Internet-Draft will expire on March 28, 2008. | |
| | | | |
| Copyright Notice | | Copyright Notice | |
| | | | |
| Copyright (C) The IETF Trust (2007). | | Copyright (C) The IETF Trust (2007). | |
| | | | |
| Abstract | | Abstract | |
| | | | |
| This Internet-Draft describes NFSv4 minor version one, including | | This Internet-Draft describes NFSv4 minor version one, including | |
| features retained from the base protocol and protocol extensions made | | features retained from the base protocol and protocol extensions made | |
| subsequently. The current draft includes a description of the major | | subsequently. The current draft includes a description of the major | |
| | | | |
| skipping to change at page 2, line 27 | | skipping to change at page 2, line 27 | |
| 1.2. NFS Version 4 Goals . . . . . . . . . . . . . . . . . . 10 | | 1.2. NFS Version 4 Goals . . . . . . . . . . . . . . . . . . 10 | |
| 1.3. Minor Version 1 Goals . . . . . . . . . . . . . . . . . 11 | | 1.3. Minor Version 1 Goals . . . . . . . . . . . . . . . . . 11 | |
| 1.4. Overview of NFS version 4.1 Features . . . . . . . . . . 11 | | 1.4. Overview of NFS version 4.1 Features . . . . . . . . . . 11 | |
| 1.4.1. RPC and Security . . . . . . . . . . . . . . . . . . 12 | | 1.4.1. RPC and Security . . . . . . . . . . . . . . . . . . 12 | |
| 1.4.2. Protocol Structure . . . . . . . . . . . . . . . . . 12 | | 1.4.2. Protocol Structure . . . . . . . . . . . . . . . . . 12 | |
| 1.4.3. File System Model . . . . . . . . . . . . . . . . . 13 | | 1.4.3. File System Model . . . . . . . . . . . . . . . . . 13 | |
| 1.4.4. Locking Facilities . . . . . . . . . . . . . . . . . 14 | | 1.4.4. Locking Facilities . . . . . . . . . . . . . . . . . 14 | |
| 1.5. General Definitions . . . . . . . . . . . . . . . . . . 15 | | 1.5. General Definitions . . . . . . . . . . . . . . . . . . 15 | |
| 1.6. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 17 | | 1.6. Differences from NFSv4.0 . . . . . . . . . . . . . . . . 17 | |
| 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 17 | | 2. Core Infrastructure . . . . . . . . . . . . . . . . . . . . . 17 | |
|
| 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 18 | | 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 17 | |
| 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 18 | | 2.2. RPC and XDR . . . . . . . . . . . . . . . . . . . . . . 18 | |
| 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 18 | | 2.2.1. RPC-based Security . . . . . . . . . . . . . . . . . 18 | |
| 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 21 | | 2.3. COMPOUND and CB_COMPOUND . . . . . . . . . . . . . . . . 21 | |
| 2.4. Client Identifiers and Client Owners . . . . . . . . . . 22 | | 2.4. Client Identifiers and Client Owners . . . . . . . . . . 22 | |
|
| 2.4.1. Server Release of Client ID . . . . . . . . . . . . 26 | | 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 . . . . . . . . . . 25 | |
| 2.4.2. Resolving Client Owner Conflicts . . . . . . . . . . 26 | | 2.4.2. Server Release of Client ID . . . . . . . . . . . . 26 | |
| | | 2.4.3. Resolving Client Owner Conflicts . . . . . . . . . . 26 | |
| 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 27 | | 2.5. Server Owners . . . . . . . . . . . . . . . . . . . . . 27 | |
| 2.6. Security Service Negotiation . . . . . . . . . . . . . . 28 | | 2.6. Security Service Negotiation . . . . . . . . . . . . . . 28 | |
| 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 28 | | 2.6.1. NFSv4.1 Security Tuples . . . . . . . . . . . . . . 28 | |
| 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 28 | | 2.6.2. SECINFO and SECINFO_NO_NAME . . . . . . . . . . . . 28 | |
| 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 29 | | 2.6.3. Security Error . . . . . . . . . . . . . . . . . . . 29 | |
| 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 32 | | 2.7. Minor Versioning . . . . . . . . . . . . . . . . . . . . 32 | |
| 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 34 | | 2.8. Non-RPC-based Security Services . . . . . . . . . . . . 34 | |
| 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 34 | | 2.8.1. Authorization . . . . . . . . . . . . . . . . . . . 34 | |
| 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 35 | | 2.8.2. Auditing . . . . . . . . . . . . . . . . . . . . . . 35 | |
| 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 35 | | 2.8.3. Intrusion Detection . . . . . . . . . . . . . . . . 35 | |
| | | | |
| skipping to change at page 3, line 9 | | skipping to change at page 3, line 10 | |
| 2.9.2. Client and Server Transport Behavior . . . . . . . . 36 | | 2.9.2. Client and Server Transport Behavior . . . . . . . . 36 | |
| 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 37 | | 2.9.3. Ports . . . . . . . . . . . . . . . . . . . . . . . 37 | |
| 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 37 | | 2.10. Session . . . . . . . . . . . . . . . . . . . . . . . . 37 | |
| 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 37 | | 2.10.1. Motivation and Overview . . . . . . . . . . . . . . 37 | |
| 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 38 | | 2.10.2. NFSv4 Integration . . . . . . . . . . . . . . . . . 38 | |
| 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 40 | | 2.10.3. Channels . . . . . . . . . . . . . . . . . . . . . . 40 | |
| 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 41 | | 2.10.4. Trunking . . . . . . . . . . . . . . . . . . . . . . 41 | |
| 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 44 | | 2.10.5. Exactly Once Semantics . . . . . . . . . . . . . . . 44 | |
| 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 56 | | 2.10.6. RDMA Considerations . . . . . . . . . . . . . . . . 56 | |
| 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 59 | | 2.10.7. Sessions Security . . . . . . . . . . . . . . . . . 59 | |
|
| 2.10.8. Session Mechanics - Steady State . . . . . . . . . . 67 | | 2.10.8. The SSV GSS Mechanism . . . . . . . . . . . . . . . 64 | |
| 2.10.9. Session Mechanics - Recovery . . . . . . . . . . . . 69 | | 2.10.9. Session Mechanics - Steady State . . . . . . . . . . 68 | |
| 2.10.10. Parallel NFS and Sessions . . . . . . . . . . . . . 72 | | 2.10.10. Session Mechanics - Recovery . . . . . . . . . . . . 69 | |
| 3. Protocol Data Types . . . . . . . . . . . . . . . . . . . . . 72 | | 2.10.11. Parallel NFS and Sessions . . . . . . . . . . . . . 73 | |
| 3.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 72 | | 3. Protocol Data Types . . . . . . . . . . . . . . . . . . . . . 73 | |
| 3.2. Structured Data Types . . . . . . . . . . . . . . . . . 74 | | 3.1. Basic Data Types . . . . . . . . . . . . . . . . . . . . 73 | |
| 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 84 | | 3.2. Structured Data Types . . . . . . . . . . . . . . . . . 75 | |
| 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 84 | | 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . . 85 | |
| 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 84 | | 4.1. Obtaining the First Filehandle . . . . . . . . . . . . . 85 | |
| 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 85 | | 4.1.1. Root Filehandle . . . . . . . . . . . . . . . . . . 85 | |
| 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 85 | | 4.1.2. Public Filehandle . . . . . . . . . . . . . . . . . 86 | |
| 4.2.1. General Properties of a Filehandle . . . . . . . . . 85 | | 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . . . 86 | |
| 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 86 | | 4.2.1. General Properties of a Filehandle . . . . . . . . . 86 | |
| 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 86 | | 4.2.2. Persistent Filehandle . . . . . . . . . . . . . . . 87 | |
| 4.3. One Method of Constructing a Volatile Filehandle . . . . 88 | | 4.2.3. Volatile Filehandle . . . . . . . . . . . . . . . . 87 | |
| 4.4. Client Recovery from Filehandle Expiration . . . . . . . 88 | | 4.3. One Method of Constructing a Volatile Filehandle . . . . 89 | |
| 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 89 | | 4.4. Client Recovery from Filehandle Expiration . . . . . . . 89 | |
| 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 90 | | 5. File Attributes . . . . . . . . . . . . . . . . . . . . . . . 90 | |
| 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 91 | | 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . . . 91 | |
| 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 91 | | 5.2. Recommended Attributes . . . . . . . . . . . . . . . . . 92 | |
| 5.4. Classification of Attributes . . . . . . . . . . . . . . 92 | | 5.3. Named Attributes . . . . . . . . . . . . . . . . . . . . 92 | |
| 5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 93 | | 5.4. Classification of Attributes . . . . . . . . . . . . . . 93 | |
| 5.6. Recommended Attributes - Definitions . . . . . . . . . . 94 | | 5.5. Mandatory Attributes - Definitions . . . . . . . . . . . 94 | |
| 5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . 104 | | 5.6. Recommended Attributes - Definitions . . . . . . . . . . 95 | |
| 5.8. Interpreting owner and owner_group . . . . . . . . . . . 105 | | 5.7. Time Access . . . . . . . . . . . . . . . . . . . . . . 106 | |
| 5.9. Character Case Attributes . . . . . . . . . . . . . . . 107 | | 5.8. Interpreting owner and owner_group . . . . . . . . . . . 107 | |
| 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . . 107 | | 5.9. Character Case Attributes . . . . . . . . . . . . . . . 109 | |
| 5.11. mounted_on_fileid . . . . . . . . . . . . . . . . . . . 108 | | 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . . . 109 | |
| 5.12. Directory Notification Attributes . . . . . . . . . . . 109 | | 5.11. mounted_on_fileid . . . . . . . . . . . . . . . . . . . 110 | |
| 5.12.1. dir_notif_delay . . . . . . . . . . . . . . . . . . 109 | | 5.12. Directory Notification Attributes . . . . . . . . . . . 111 | |
| 5.12.2. dirent_notif_delay . . . . . . . . . . . . . . . . . 109 | | 5.12.1. dir_notif_delay . . . . . . . . . . . . . . . . . . 111 | |
| 5.13. PNFS Attributes . . . . . . . . . . . . . . . . . . . . 109 | | 5.12.2. dirent_notif_delay . . . . . . . . . . . . . . . . . 111 | |
| 5.13.1. fs_layout_type . . . . . . . . . . . . . . . . . . . 109 | | 5.13. PNFS Attributes . . . . . . . . . . . . . . . . . . . . 111 | |
| 5.13.2. layout_alignment . . . . . . . . . . . . . . . . . . 109 | | 5.13.1. fs_layout_type . . . . . . . . . . . . . . . . . . . 111 | |
| 5.13.3. layout_blksize . . . . . . . . . . . . . . . . . . . 110 | | 5.13.2. layout_alignment . . . . . . . . . . . . . . . . . . 111 | |
| 5.13.4. layout_hint . . . . . . . . . . . . . . . . . . . . 110 | | 5.13.3. layout_blksize . . . . . . . . . . . . . . . . . . . 112 | |
| 5.13.5. layout_type . . . . . . . . . . . . . . . . . . . . 110 | | 5.13.4. layout_hint . . . . . . . . . . . . . . . . . . . . 112 | |
| 5.13.6. mdsthreshold . . . . . . . . . . . . . . . . . . . . 110 | | 5.13.5. layout_type . . . . . . . . . . . . . . . . . . . . 112 | |
| 5.14. Retention Attributes . . . . . . . . . . . . . . . . . . 111 | | 5.13.6. mdsthreshold . . . . . . . . . . . . . . . . . . . . 112 | |
| 6. Security Related Attributes . . . . . . . . . . . . . . . . . 113 | | 5.14. Retention Attributes . . . . . . . . . . . . . . . . . . 113 | |
| 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 113 | | 6. Security Related Attributes . . . . . . . . . . . . . . . . . 115 | |
| 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 114 | | 6.1. Goals . . . . . . . . . . . . . . . . . . . . . . . . . 115 | |
| 6.2.1. ACL Attributes . . . . . . . . . . . . . . . . . . . 114 | | 6.2. File Attributes Discussion . . . . . . . . . . . . . . . 116 | |
| 6.2.2. dacl and sacl Attributes . . . . . . . . . . . . . . 127 | | 6.2.1. ACL Attributes . . . . . . . . . . . . . . . . . . . 116 | |
| 6.2.3. mode Attribute . . . . . . . . . . . . . . . . . . . 127 | | 6.2.2. dacl and sacl Attributes . . . . . . . . . . . . . . 129 | |
| 6.2.4. mode_set_masked Attribute . . . . . . . . . . . . . 128 | | 6.2.3. mode Attribute . . . . . . . . . . . . . . . . . . . 129 | |
| 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 129 | | 6.2.4. mode_set_masked Attribute . . . . . . . . . . . . . 130 | |
| 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 129 | | 6.3. Common Methods . . . . . . . . . . . . . . . . . . . . . 131 | |
| 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 130 | | 6.3.1. Interpreting an ACL . . . . . . . . . . . . . . . . 131 | |
| 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 131 | | 6.3.2. Computing a Mode Attribute from an ACL . . . . . . . 132 | |
| 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 132 | | 6.4. Requirements . . . . . . . . . . . . . . . . . . . . . . 133 | |
| 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 133 | | 6.4.1. Setting the mode and/or ACL Attributes . . . . . . . 134 | |
| 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 134 | | 6.4.2. Retrieving the mode and/or ACL Attributes . . . . . 135 | |
| 7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 138 | | 6.4.3. Creating New Objects . . . . . . . . . . . . . . . . 136 | |
| 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 138 | | 7. Single-server Namespace . . . . . . . . . . . . . . . . . . . 140 | |
| 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 138 | | 7.1. Server Exports . . . . . . . . . . . . . . . . . . . . . 140 | |
| 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 139 | | 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . . . 140 | |
| 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 139 | | 7.3. Server Pseudo File System . . . . . . . . . . . . . . . 141 | |
| 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 140 | | 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . . . 141 | |
| 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 140 | | 7.5. Filehandle Volatility . . . . . . . . . . . . . . . . . 142 | |
| 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 140 | | 7.6. Exported Root . . . . . . . . . . . . . . . . . . . . . 142 | |
| 7.8. Security Policy and Namespace Presentation . . . . . . . 141 | | 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . . . 142 | |
| 8. State Management . . . . . . . . . . . . . . . . . . . . . . 142 | | 7.8. Security Policy and Namespace Presentation . . . . . . . 143 | |
| 8.1. Client and Session ID . . . . . . . . . . . . . . . . . 142 | | 8. State Management . . . . . . . . . . . . . . . . . . . . . . 144 | |
| 8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 143 | | 8.1. Client and Session ID . . . . . . . . . . . . . . . . . 145 | |
| 8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 143 | | 8.2. Stateid Definition . . . . . . . . . . . . . . . . . . . 145 | |
| 8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 144 | | 8.2.1. Stateid Types . . . . . . . . . . . . . . . . . . . 145 | |
| 8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 145 | | 8.2.2. Stateid Structure . . . . . . . . . . . . . . . . . 146 | |
| 8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 146 | | 8.2.3. Special Stateids . . . . . . . . . . . . . . . . . . 147 | |
| 8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 148 | | 8.2.4. Stateid Lifetime and Validation . . . . . . . . . . 148 | |
| 8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 149 | | 8.3. Lease Renewal . . . . . . . . . . . . . . . . . . . . . 150 | |
| 8.4.1. Client Failure and Recovery . . . . . . . . . . . . 149 | | 8.4. Crash Recovery . . . . . . . . . . . . . . . . . . . . . 151 | |
| 8.4.2. Server Failure and Recovery . . . . . . . . . . . . 150 | | 8.4.1. Client Failure and Recovery . . . . . . . . . . . . 151 | |
| 8.4.3. Network Partitions and Recovery . . . . . . . . . . 154 | | 8.4.2. Server Failure and Recovery . . . . . . . . . . . . 152 | |
| 8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 158 | | 8.4.3. Network Partitions and Recovery . . . . . . . . . . 156 | |
| 8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 159 | | 8.5. Server Revocation of Locks . . . . . . . . . . . . . . . 160 | |
| | | 8.6. Short and Long Leases . . . . . . . . . . . . . . . . . 161 | |
| 8.7. Clocks, Propagation Delay, and Calculating Lease | | 8.7. Clocks, Propagation Delay, and Calculating Lease | |
|
| Expiration . . . . . . . . . . . . . . . . . . . . . . . 159 | | Expiration . . . . . . . . . . . . . . . . . . . . . . . 162 | |
| 8.8. Vestigial Locking Infrastructure From V4.0 . . . . . . . 160 | | 8.8. Vestigial Locking Infrastructure From V4.0 . . . . . . . 162 | |
| 9. File Locking and Share Reservations . . . . . . . . . . . . . 161 | | 9. File Locking and Share Reservations . . . . . . . . . . . . . 163 | |
| 9.1. Opens and Byte-range Locks . . . . . . . . . . . . . . . 161 | | 9.1. Opens and Byte-range Locks . . . . . . . . . . . . . . . 163 | |
| 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 161 | | 9.1.1. State-owner Definition . . . . . . . . . . . . . . . 164 | |
| 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 162 | | 9.1.2. Use of the Stateid and Locking . . . . . . . . . . . 164 | |
| 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 165 | | 9.2. Lock Ranges . . . . . . . . . . . . . . . . . . . . . . 167 | |
| 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 165 | | 9.3. Upgrading and Downgrading Locks . . . . . . . . . . . . 167 | |
| 9.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 166 | | 9.4. Blocking Locks . . . . . . . . . . . . . . . . . . . . . 168 | |
| 9.5. Share Reservations . . . . . . . . . . . . . . . . . . . 167 | | 9.5. Share Reservations . . . . . . . . . . . . . . . . . . . 169 | |
| 9.6. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 167 | | 9.6. OPEN/CLOSE Operations . . . . . . . . . . . . . . . . . 169 | |
| 9.7. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 168 | | 9.7. Open Upgrade and Downgrade . . . . . . . . . . . . . . . 170 | |
| 9.8. Reclaim of Open and Byte-range Locks . . . . . . . . . . 169 | | 9.8. Reclaim of Open and Byte-range Locks . . . . . . . . . . 171 | |
| 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 169 | | 10. Client-Side Caching . . . . . . . . . . . . . . . . . . . . . 171 | |
| 10.1. Performance Challenges for Client-Side Caching . . . . . 170 | | 10.1. Performance Challenges for Client-Side Caching . . . . . 172 | |
| 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 171 | | 10.2. Delegation and Callbacks . . . . . . . . . . . . . . . . 173 | |
| 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 172 | | 10.2.1. Delegation Recovery . . . . . . . . . . . . . . . . 174 | |
| 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 174 | | 10.3. Data Caching . . . . . . . . . . . . . . . . . . . . . . 177 | |
| 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 175 | | 10.3.1. Data Caching and OPENs . . . . . . . . . . . . . . . 177 | |
| 10.3.2. Data Caching and File Locking . . . . . . . . . . . 176 | | 10.3.2. Data Caching and File Locking . . . . . . . . . . . 178 | |
| 10.3.3. Data Caching and Mandatory File Locking . . . . . . 177 | | 10.3.3. Data Caching and Mandatory File Locking . . . . . . 180 | |
| 10.3.4. Data Caching and File Identity . . . . . . . . . . . 178 | | 10.3.4. Data Caching and File Identity . . . . . . . . . . . 180 | |
| 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 179 | | 10.4. Open Delegation . . . . . . . . . . . . . . . . . . . . 181 | |
| 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 181 | | 10.4.1. Open Delegation and Data Caching . . . . . . . . . . 183 | |
| 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 182 | | 10.4.2. Open Delegation and File Locks . . . . . . . . . . . 185 | |
| 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 183 | | 10.4.3. Handling of CB_GETATTR . . . . . . . . . . . . . . . 185 | |
| 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 186 | | 10.4.4. Recall of Open Delegation . . . . . . . . . . . . . 188 | |
| 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 188 | | 10.4.5. Clients that Fail to Honor Delegation Recalls . . . 190 | |
| 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 189 | | 10.4.6. Delegation Revocation . . . . . . . . . . . . . . . 191 | |
| 10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 189 | | 10.4.7. Delegations via WANT_DELEGATION . . . . . . . . . . 191 | |
| 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 189 | | 10.5. Data Caching and Revocation . . . . . . . . . . . . . . 192 | |
| 10.5.1. Revocation Recovery for Write Open Delegation . . . 190 | | 10.5.1. Revocation Recovery for Write Open Delegation . . . 192 | |
| 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 191 | | 10.6. Attribute Caching . . . . . . . . . . . . . . . . . . . 193 | |
| 10.7. Data and Metadata Caching and Memory Mapped Files . . . 193 | | 10.7. Data and Metadata Caching and Memory Mapped Files . . . 195 | |
| 10.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 195 | | 10.8. Name Caching . . . . . . . . . . . . . . . . . . . . . . 197 | |
| 10.9. Directory Caching . . . . . . . . . . . . . . . . . . . 196 | | 10.9. Directory Caching . . . . . . . . . . . . . . . . . . . 198 | |
| 11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 197 | | 11. Multi-Server Namespace . . . . . . . . . . . . . . . . . . . 199 | |
| 11.1. Location attributes . . . . . . . . . . . . . . . . . . 197 | | 11.1. Location Attributes . . . . . . . . . . . . . . . . . . 199 | |
| 11.2. File System Presence or Absence . . . . . . . . . . . . 197 | | 11.2. File System Presence or Absence . . . . . . . . . . . . 200 | |
| 11.3. Getting Attributes for an Absent File System . . . . . . 199 | | 11.3. Getting Attributes for an Absent File System . . . . . . 201 | |
| 11.3.1. GETATTR Within an Absent File System . . . . . . . . 199 | | 11.3.1. GETATTR Within an Absent File System . . . . . . . . 201 | |
| 11.3.2. READDIR and Absent File Systems . . . . . . . . . . 200 | | 11.3.2. READDIR and Absent File Systems . . . . . . . . . . 202 | |
| 11.4. Uses of Location Information . . . . . . . . . . . . . . 201 | | 11.4. Uses of Location Information . . . . . . . . . . . . . . 203 | |
| 11.4.1. File System Replication . . . . . . . . . . . . . . 201 | | 11.4.1. File System Replication . . . . . . . . . . . . . . 204 | |
| 11.4.2. File System Migration . . . . . . . . . . . . . . . 203 | | 11.4.2. File System Migration . . . . . . . . . . . . . . . 205 | |
| 11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 204 | | 11.4.3. Referrals . . . . . . . . . . . . . . . . . . . . . 206 | |
| 11.5. Additional Client-side Considerations . . . . . . . . . 205 | | 11.5. Additional Client-side Considerations . . . . . . . . . 207 | |
| 11.6. Effecting File System Transitions . . . . . . . . . . . 206 | | 11.6. Effecting File System Transitions . . . . . . . . . . . 208 | |
| 11.6.1. File System Transitions and Simultaneous Access . . 207 | | 11.6.1. File System Transitions and Simultaneous Access . . 209 | |
| 11.6.2. Simultaneous Use and Transparent Transitions . . . . 208 | | 11.6.2. Simultaneous Use and Transparent Transitions . . . . 210 | |
| 11.6.3. Filehandles and File System Transitions . . . . . . 210 | | 11.6.3. Filehandles and File System Transitions . . . . . . 212 | |
| 11.6.4. Fileid's and File System Transitions . . . . . . . . 210 | | 11.6.4. Fileids and File System Transitions . . . . . . . . 213 | |
| 11.6.5. Fsids and File System Transitions . . . . . . . . . 211 | | 11.6.5. Fsids and File System Transitions . . . . . . . . . 214 | |
| 11.6.6. The Change Attribute and File System Transitions . . 211 | | 11.6.6. The Change Attribute and File System Transitions . . 215 | |
| 11.6.7. Lock State and File System Transitions . . . . . . . 212 | | 11.6.7. Lock State and File System Transitions . . . . . . . 215 | |
| 11.6.8. Write Verifiers and File System Transitions . . . . 216 | | 11.6.8. Write Verifiers and File System Transitions . . . . 219 | |
| 11.7. Effecting File System Referrals . . . . . . . . . . . . 216 | | 11.6.9. Readdir Cookies and Verifiers and File System | |
| 11.7.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 216 | | Transitions . . . . . . . . . . . . . . . . . . . . 219 | |
| 11.7.2. Referral Example (READDIR) . . . . . . . . . . . . . 220 | | 11.6.10. File System Data and File System Transitions . . . . 220 | |
| 11.8. The Attribute fs_absent . . . . . . . . . . . . . . . . 223 | | 11.7. Effecting File System Referrals . . . . . . . . . . . . 221 | |
| 11.9. The Attribute fs_locations . . . . . . . . . . . . . . . 223 | | 11.7.1. Referral Example (LOOKUP) . . . . . . . . . . . . . 222 | |
| 11.10. The Attribute fs_locations_info . . . . . . . . . . . . 225 | | 11.7.2. Referral Example (READDIR) . . . . . . . . . . . . . 225 | |
| 11.10.1. The fs_locations_server4 Structure . . . . . . . . . 228 | | 11.8. The Attribute fs_locations . . . . . . . . . . . . . . . 228 | |
| 11.10.2. The fs_locations_info4 Structure . . . . . . . . . . 233 | | 11.9. The Attribute fs_locations_info . . . . . . . . . . . . 230 | |
| 11.10.3. The fs_locations_item4 Structure . . . . . . . . . . 234 | | 11.9.1. The fs_locations_server4 Structure . . . . . . . . . 233 | |
| 11.11. The Attribute fs_status . . . . . . . . . . . . . . . . 235 | | 11.9.2. The fs_locations_info4 Structure . . . . . . . . . . 239 | |
| 12. Directory Delegations . . . . . . . . . . . . . . . . . . . . 239 | | 11.9.3. The fs_locations_item4 Structure . . . . . . . . . . 240 | |
| 12.1. Introduction to Directory Delegations . . . . . . . . . 239 | | 11.10. The Attribute fs_status . . . . . . . . . . . . . . . . 242 | |
| 12.2. Directory Delegation Design . . . . . . . . . . . . . . 240 | | 12. Directory Delegations . . . . . . . . . . . . . . . . . . . . 245 | |
| 12.3. Attributes in Support of Directory Notifications . . . . 241 | | 12.1. Introduction to Directory Delegations . . . . . . . . . 245 | |
| 12.4. Delegation Recall . . . . . . . . . . . . . . . . . . . 241 | | 12.2. Directory Delegation Design . . . . . . . . . . . . . . 246 | |
| 12.5. Directory Delegation Recovery . . . . . . . . . . . . . 241 | | 12.3. Attributes in Support of Directory Notifications . . . . 247 | |
| 13. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 241 | | 12.4. Delegation Recall . . . . . . . . . . . . . . . . . . . 247 | |
| 13.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 241 | | 12.5. Directory Delegation Recovery . . . . . . . . . . . . . 248 | |
| 13.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 243 | | 13. Parallel NFS (pNFS) . . . . . . . . . . . . . . . . . . . . . 248 | |
| 13.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 243 | | 13.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 248 | |
| 13.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 243 | | 13.2. pNFS Definitions . . . . . . . . . . . . . . . . . . . . 250 | |
| 13.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 244 | | 13.2.1. Metadata . . . . . . . . . . . . . . . . . . . . . . 250 | |
| 13.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 244 | | 13.2.2. Metadata Server . . . . . . . . . . . . . . . . . . 250 | |
| 13.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 244 | | 13.2.3. pNFS Client . . . . . . . . . . . . . . . . . . . . 251 | |
| 13.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 244 | | 13.2.4. Storage Device . . . . . . . . . . . . . . . . . . . 251 | |
| 13.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 244 | | 13.2.5. Storage Protocol . . . . . . . . . . . . . . . . . . 251 | |
| 13.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 245 | | 13.2.6. Control Protocol . . . . . . . . . . . . . . . . . . 251 | |
| 13.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 245 | | 13.2.7. Layout Types . . . . . . . . . . . . . . . . . . . . 251 | |
| 13.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 246 | | 13.2.8. Layout . . . . . . . . . . . . . . . . . . . . . . . 252 | |
| 13.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 246 | | 13.2.9. Layout Iomode . . . . . . . . . . . . . . . . . . . 252 | |
| 13.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 247 | | 13.2.10. Device IDs . . . . . . . . . . . . . . . . . . . . . 253 | |
| 13.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 247 | | 13.3. pNFS Operations . . . . . . . . . . . . . . . . . . . . 253 | |
| 13.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 248 | | 13.4. pNFS Attributes . . . . . . . . . . . . . . . . . . . . 254 | |
| 13.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 249 | | 13.5. Layout Semantics . . . . . . . . . . . . . . . . . . . . 254 | |
| 13.5.3. Committing a Layout . . . . . . . . . . . . . . . . 250 | | 13.5.1. Guarantees Provided by Layouts . . . . . . . . . . . 254 | |
| 13.5.4. Recalling a Layout . . . . . . . . . . . . . . . . . 253 | | 13.5.2. Getting a Layout . . . . . . . . . . . . . . . . . . 256 | |
| 13.5.5. Metadata Server Write Propagation . . . . . . . . . 259 | | 13.5.3. Committing a Layout . . . . . . . . . . . . . . . . 257 | |
| 13.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 259 | | 13.5.4. Recalling a Layout . . . . . . . . . . . . . . . . . 259 | |
| 13.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 260 | | 13.5.5. Metadata Server Write Propagation . . . . . . . . . 265 | |
| 13.7.1. Client Recovery . . . . . . . . . . . . . . . . . . 261 | | 13.6. pNFS Mechanics . . . . . . . . . . . . . . . . . . . . . 266 | |
| 13.7.2. Dealing with Lease Expiration on the Client . . . . 261 | | 13.7. Recovery . . . . . . . . . . . . . . . . . . . . . . . . 267 | |
| | | 13.7.1. Client Recovery . . . . . . . . . . . . . . . . . . 267 | |
| | | 13.7.2. Dealing with Lease Expiration on the Client . . . . 268 | |
| 13.7.3. Dealing with Loss of Layout State on the Metadata | | 13.7.3. Dealing with Loss of Layout State on the Metadata | |
|
| Server . . . . . . . . . . . . . . . . . . . . . . . 263 | | Server . . . . . . . . . . . . . . . . . . . . . . . 269 | |
| 13.7.4. Recovery from Metadata Server Restart . . . . . . . 263 | | 13.7.4. Recovery from Metadata Server Restart . . . . . . . 270 | |
| 13.7.5. Operations During Metadata Server Grace Period . . . 265 | | 13.7.5. Operations During Metadata Server Grace Period . . . 272 | |
| 13.7.6. Storage Device Recovery . . . . . . . . . . . . . . 266 | | 13.7.6. Storage Device Recovery . . . . . . . . . . . . . . 272 | |
| 13.8. Metadata and Storage Device Roles . . . . . . . . . . . 266 | | 13.8. Metadata and Storage Device Roles . . . . . . . . . . . 273 | |
| 13.9. Security Considerations . . . . . . . . . . . . . . . . 268 | | 13.9. Security Considerations . . . . . . . . . . . . . . . . 274 | |
| 14. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 269 | | 14. PNFS: NFSv4.1 File Layout Type . . . . . . . . . . . . . . . 275 | |
| 14.1. Client ID and Session Considerations . . . . . . . . . . 269 | | 14.1. Client ID and Session Considerations . . . . . . . . . . 275 | |
| 14.2. File Layout Definitions . . . . . . . . . . . . . . . . 270 | | 14.2. File Layout Definitions . . . . . . . . . . . . . . . . 277 | |
| 14.3. File Layout Data Types . . . . . . . . . . . . . . . . . 271 | | 14.3. File Layout Data Types . . . . . . . . . . . . . . . . . 278 | |
| 14.4. Interpreting the File Layout . . . . . . . . . . . . . . 274 | | 14.4. Interpreting the File Layout . . . . . . . . . . . . . . 280 | |
| 14.5. Sparse and Dense Stripe Unit Packing . . . . . . . . . . 276 | | 14.5. Sparse and Dense Stripe Unit Packing . . . . . . . . . . 283 | |
| 14.6. Data Server Multipathing . . . . . . . . . . . . . . . . 277 | | 14.6. Data Server Multipathing . . . . . . . . . . . . . . . . 284 | |
| 14.7. Operations Issued to NFSv4.1 Data Servers . . . . . . . 278 | | 14.7. Operations Issued to NFSv4.1 Data Servers . . . . . . . 285 | |
| 14.8. COMMIT Through Metadata Server . . . . . . . . . . . . . 279 | | 14.8. COMMIT Through Metadata Server . . . . . . . . . . . . . 285 | |
| 14.9. The Layout Iomode . . . . . . . . . . . . . . . . . . . 280 | | 14.9. The Layout Iomode . . . . . . . . . . . . . . . . . . . 286 | |
| 14.10. Metadata and Data Server State Coordination . . . . . . 280 | | 14.10. Metadata and Data Server State Coordination . . . . . . 287 | |
| 14.10.1. Global Stateid Requirements . . . . . . . . . . . . 280 | | 14.10.1. Global Stateid Requirements . . . . . . . . . . . . 287 | |
| 14.10.2. Data Server State Propagation . . . . . . . . . . . 280 | | 14.10.2. Data Server State Propagation . . . . . . . . . . . 287 | |
| 14.11. Data Server Component File Size . . . . . . . . . . . . 283 | | 14.11. Data Server Component File Size . . . . . . . . . . . . 289 | |
| 14.12. Recovery from Loss of Layout . . . . . . . . . . . . . . 283 | | 14.12. Recovery from Loss of Layout . . . . . . . . . . . . . . 290 | |
| 14.13. Security Considerations for the File Layout Type . . . . 284 | | 14.13. Security Considerations for the File Layout Type . . . . 291 | |
| 15. Internationalization . . . . . . . . . . . . . . . . . . . . 284 | | 15. Internationalization . . . . . . . . . . . . . . . . . . . . 291 | |
| 15.1. Stringprep profile for the utf8str_cs type . . . . . . . 286 | | 15.1. Stringprep profile for the utf8str_cs type . . . . . . . 292 | |
| 15.2. Stringprep profile for the utf8str_cis type . . . . . . 287 | | 15.2. Stringprep profile for the utf8str_cis type . . . . . . 294 | |
| 15.3. Stringprep profile for the utf8str_mixed type . . . . . 289 | | 15.3. Stringprep profile for the utf8str_mixed type . . . . . 295 | |
| 15.4. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 290 | | 15.4. UTF-8 Related Errors . . . . . . . . . . . . . . . . . . 297 | |
| 16. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 290 | | 16. Error Values . . . . . . . . . . . . . . . . . . . . . . . . 297 | |
| 16.1. Error Definitions . . . . . . . . . . . . . . . . . . . 291 | | 16.1. Error Definitions . . . . . . . . . . . . . . . . . . . 297 | |
| 16.2. Operations and their valid errors . . . . . . . . . . . 305 | | 16.2. Operations and their valid errors . . . . . . . . . . . 312 | |
| 16.3. Callback operations and their valid errors . . . . . . . 319 | | 16.3. Callback operations and their valid errors . . . . . . . 326 | |
| 16.4. Errors and the operations that use them . . . . . . . . 320 | | 16.4. Errors and the operations that use them . . . . . . . . 327 | |
| 17. NFS version 4.1 Procedures . . . . . . . . . . . . . . . . . 327 | | 17. NFS version 4.1 Procedures . . . . . . . . . . . . . . . . . 334 | |
| 17.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 327 | | 17.1. Procedure 0: NULL - No Operation . . . . . . . . . . . . 335 | |
| 17.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 328 | | 17.2. Procedure 1: COMPOUND - Compound Operations . . . . . . 335 | |
| 18. NFS version 4.1 Operations . . . . . . . . . . . . . . . . . 333 | | 18. NFS version 4.1 Operations . . . . . . . . . . . . . . . . . 340 | |
| 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 333 | | 18.1. Operation 3: ACCESS - Check Access Rights . . . . . . . 340 | |
| 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 335 | | 18.2. Operation 4: CLOSE - Close File . . . . . . . . . . . . 342 | |
| 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 337 | | 18.3. Operation 5: COMMIT - Commit Cached Data . . . . . . . . 344 | |
| 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 339 | | 18.4. Operation 6: CREATE - Create a Non-Regular File Object . 346 | |
| 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | | 18.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting | |
|
| Recovery . . . . . . . . . . . . . . . . . . . . . . . . 342 | | Recovery . . . . . . . . . . . . . . . . . . . . . . . . 349 | |
| 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 343 | | 18.6. Operation 8: DELEGRETURN - Return Delegation . . . . . . 350 | |
| 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 343 | | 18.7. Operation 9: GETATTR - Get Attributes . . . . . . . . . 350 | |
| 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 345 | | 18.8. Operation 10: GETFH - Get Current Filehandle . . . . . . 352 | |
| 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 346 | | 18.9. Operation 11: LINK - Create Link to a File . . . . . . . 353 | |
| 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 347 | | 18.10. Operation 12: LOCK - Create Lock . . . . . . . . . . . . 354 | |
| 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 351 | | 18.11. Operation 13: LOCKT - Test For Lock . . . . . . . . . . 358 | |
| 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 352 | | 18.12. Operation 14: LOCKU - Unlock File . . . . . . . . . . . 359 | |
| 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 354 | | 18.13. Operation 15: LOOKUP - Lookup Filename . . . . . . . . . 361 | |
| 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 356 | | 18.14. Operation 16: LOOKUPP - Lookup Parent Directory . . . . 363 | |
| 18.15. Operation 17: NVERIFY - Verify Difference in | | 18.15. Operation 17: NVERIFY - Verify Difference in | |
|
| Attributes . . . . . . . . . . . . . . . . . . . . . . . 357 | | Attributes . . . . . . . . . . . . . . . . . . . . . . . 364 | |
| 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 358 | | 18.16. Operation 18: OPEN - Open a Regular File . . . . . . . . 365 | |
| 18.17. Operation 19: OPENATTR - Open Named Attribute | | 18.17. Operation 19: OPENATTR - Open Named Attribute | |
|
| Directory . . . . . . . . . . . . . . . . . . . . . . . 373 | | Directory . . . . . . . . . . . . . . . . . . . . . . . 380 | |
| 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 374 | | 18.18. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access . 381 | |
| 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 375 | | 18.19. Operation 22: PUTFH - Set Current Filehandle . . . . . . 383 | |
| 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 376 | | 18.20. Operation 23: PUTPUBFH - Set Public Filehandle . . . . . 383 | |
| 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 378 | | 18.21. Operation 24: PUTROOTFH - Set Root Filehandle . . . . . 385 | |
| 18.22. Operation 25: READ - Read from File . . . . . . . . . . 379 | | 18.22. Operation 25: READ - Read from File . . . . . . . . . . 386 | |
| 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 381 | | 18.23. Operation 26: READDIR - Read Directory . . . . . . . . . 388 | |
| 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 385 | | 18.24. Operation 27: READLINK - Read Symbolic Link . . . . . . 392 | |
| 18.25. Operation 28: REMOVE - Remove File System Object . . . . 386 | | 18.25. Operation 28: REMOVE - Remove File System Object . . . . 393 | |
| 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 388 | | 18.26. Operation 29: RENAME - Rename Directory Entry . . . . . 395 | |
| 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 390 | | 18.27. Operation 31: RESTOREFH - Restore Saved Filehandle . . . 397 | |
| 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 391 | | 18.28. Operation 32: SAVEFH - Save Current Filehandle . . . . . 398 | |
| 18.29. Operation 33: SECINFO - Obtain Available Security . . . 391 | | 18.29. Operation 33: SECINFO - Obtain Available Security . . . 398 | |
| 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 395 | | 18.30. Operation 34: SETATTR - Set Attributes . . . . . . . . . 402 | |
| 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 397 | | 18.31. Operation 37: VERIFY - Verify Same Attributes . . . . . 404 | |
| 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 398 | | 18.32. Operation 38: WRITE - Write to File . . . . . . . . . . 405 | |
| 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 403 | | 18.33. Operation 40: BACKCHANNEL_CTL - Backchannel control . . 410 | |
| 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 404 | | 18.34. Operation 41: BIND_CONN_TO_SESSION . . . . . . . . . . . 411 | |
| 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 406 | | 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID . . . 413 | |
| 18.36. Operation 43: CREATE_SESSION - Create New Session and | | 18.36. Operation 43: CREATE_SESSION - Create New Session and | |
|
| Confirm Client ID . . . . . . . . . . . . . . . . . . . 423 | | Confirm Client ID . . . . . . . . . . . . . . . . . . . 430 | |
| 18.37. Operation 44: DESTROY_SESSION - Destroy existing | | 18.37. Operation 44: DESTROY_SESSION - Destroy existing | |
|
| session . . . . . . . . . . . . . . . . . . . . . . . . 433 | | session . . . . . . . . . . . . . . . . . . . . . . . . 440 | |
| 18.38. Operation 45: FREE_STATEID - Free stateid with no | | 18.38. Operation 45: FREE_STATEID - Free stateid with no | |
|
| locks . . . . . . . . . . . . . . . . . . . . . . . . . 435 | | locks . . . . . . . . . . . . . . . . . . . . . . . . . 442 | |
| 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | | 18.39. Operation 46: GET_DIR_DELEGATION - Get a directory | |
|
| delegation . . . . . . . . . . . . . . . . . . . . . . . 436 | | delegation . . . . . . . . . . . . . . . . . . . . . . . 443 | |
| 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 440 | | 18.40. Operation 47: GETDEVICEINFO - Get Device Information . . 447 | |
| 18.41. Operation 48: GETDEVICELIST . . . . . . . . . . . . . . 441 | | 18.41. Operation 48: GETDEVICELIST . . . . . . . . . . . . . . 448 | |
| 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | | 18.42. Operation 49: LAYOUTCOMMIT - Commit writes made using | |
|
| a layout . . . . . . . . . . . . . . . . . . . . . . . . 442 | | a layout . . . . . . . . . . . . . . . . . . . . . . . . 449 | |
| 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 445 | | 18.43. Operation 50: LAYOUTGET - Get Layout Information . . . . 452 | |
| 18.44. Operation 51: LAYOUTRETURN - Release Layout | | 18.44. Operation 51: LAYOUTRETURN - Release Layout | |
|
| Information . . . . . . . . . . . . . . . . . . . . . . 448 | | Information . . . . . . . . . . . . . . . . . . . . . . 455 | |
| 18.45. Operation 52: SECINFO_NO_NAME - Get Security on | | 18.45. Operation 52: SECINFO_NO_NAME - Get Security on | |
|
| Unnamed Object . . . . . . . . . . . . . . . . . . . . . 451 | | Unnamed Object . . . . . . . . . . . . . . . . . . . . . 458 | |
| 18.46. Operation 53: SEQUENCE - Supply per-procedure | | 18.46. Operation 53: SEQUENCE - Supply per-procedure | |
|
| sequencing and control . . . . . . . . . . . . . . . . . 452 | | sequencing and control . . . . . . . . . . . . . . . . . 459 | |
| 18.47. Operation 54: SET_SSV . . . . . . . . . . . . . . . . . 459 | | 18.47. Operation 54: SET_SSV . . . . . . . . . . . . . . . . . 466 | |
| 18.48. Operation 55: TEST_STATEID - Test stateids for | | 18.48. Operation 55: TEST_STATEID - Test stateids for | |
|
| validity . . . . . . . . . . . . . . . . . . . . . . . . 461 | | validity . . . . . . . . . . . . . . . . . . . . . . . . 468 | |
| 18.49. Operation 56: WANT_DELEGATION . . . . . . . . . . . . . 462 | | 18.49. Operation 56: WANT_DELEGATION . . . . . . . . . . . . . 470 | |
| 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | | 18.50. Operation 57: DESTROY_CLIENTID - Destroy existing | |
|
| client ID . . . . . . . . . . . . . . . . . . . . . . . 465 | | client ID . . . . . . . . . . . . . . . . . . . . . . . 472 | |
| 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | | 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims | |
|
| Finished . . . . . . . . . . . . . . . . . . . . . . . . 466 | | Finished . . . . . . . . . . . . . . . . . . . . . . . . 473 | |
| 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 468 | | 18.52. Operation 10044: ILLEGAL - Illegal operation . . . . . . 475 | |
| 19. NFS version 4.1 Callback Procedures . . . . . . . . . . . . . 468 | | 19. NFS version 4.1 Callback Procedures . . . . . . . . . . . . . 476 | |
| 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 469 | | 19.1. Procedure 0: CB_NULL - No Operation . . . . . . . . . . 476 | |
| 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 469 | | 19.2. Procedure 1: CB_COMPOUND - Compound Operations . . . . . 476 | |
| 20. NFS version 4.1 Callback Operations . . . . . . . . . . . . . 471 | | 20. NFS version 4.1 Callback Operations . . . . . . . . . . . . . 478 | |
| 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 471 | | 20.1. Operation 3: CB_GETATTR - Get Attributes . . . . . . . . 478 | |
| 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 473 | | 20.2. Operation 4: CB_RECALL - Recall an Open Delegation . . . 480 | |
| 20.3. Operation 5: CB_LAYOUTRECALL . . . . . . . . . . . . . . 474 | | 20.3. Operation 5: CB_LAYOUTRECALL . . . . . . . . . . . . . . 481 | |
| 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 477 | | 20.4. Operation 6: CB_NOTIFY - Notify directory changes . . . 484 | |
| 20.5. Operation 7: CB_PUSH_DELEG . . . . . . . . . . . . . . . 480 | | 20.5. Operation 7: CB_PUSH_DELEG . . . . . . . . . . . . . . . 487 | |
| 20.6. Operation 8: CB_RECALL_ANY - Keep any N delegations . . 481 | | 20.6. Operation 8: CB_RECALL_ANY - Keep any N delegations . . 488 | |
| 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL . . . . . . . . . . 484 | | 20.7. Operation 9: CB_RECALLABLE_OBJ_AVAIL . . . . . . . . . . 491 | |
| 20.8. Operation 10: CB_RECALL_SLOT - change flow control | | 20.8. Operation 10: CB_RECALL_SLOT - change flow control | |
|
| limits . . . . . . . . . . . . . . . . . . . . . . . . . 485 | | limits . . . . . . . . . . . . . . . . . . . . . . . . . 492 | |
| 20.9. Operation 11: CB_SEQUENCE - Supply backchannel | | 20.9. Operation 11: CB_SEQUENCE - Supply backchannel | |
|
| sequencing and control . . . . . . . . . . . . . . . . . 486 | | sequencing and control . . . . . . . . . . . . . . . . . 493 | |
| 20.10. Operation 12: CB_WANTS_CANCELLED . . . . . . . . . . . . 489 | | 20.10. Operation 12: CB_WANTS_CANCELLED . . . . . . . . . . . . 496 | |
| 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | | 20.11. Operation 13: CB_NOTIFY_LOCK - Notify of possible | |
|
| lock availability . . . . . . . . . . . . . . . . . . . 490 | | lock availability . . . . . . . . . . . . . . . . . . . 497 | |
| 20.12. Operation 10044: CB_ILLEGAL - Illegal Callback | | 20.12. Operation 10044: CB_ILLEGAL - Illegal Callback | |
|
| Operation . . . . . . . . . . . . . . . . . . . . . . . 491 | | Operation . . . . . . . . . . . . . . . . . . . . . . . 498 | |
| 21. Security Considerations . . . . . . . . . . . . . . . . . . . 492 | | 21. Security Considerations . . . . . . . . . . . . . . . . . . . 499 | |
| 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 492 | | 22. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 499 | |
| 22.1. Defining new layout types . . . . . . . . . . . . . . . 492 | | 22.1. Defining new layout types . . . . . . . . . . . . . . . 499 | |
| 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 493 | | 23. References . . . . . . . . . . . . . . . . . . . . . . . . . 500 | |
| 23.1. Normative References . . . . . . . . . . . . . . . . . . 493 | | 23.1. Normative References . . . . . . . . . . . . . . . . . . 500 | |
| 23.2. Informative References . . . . . . . . . . . . . . . . . 494 | | 23.2. Informative References . . . . . . . . . . . . . . . . . 502 | |
| Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 496 | | Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 503 | |
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 497 | | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 504 | |
| Intellectual Property and Copyright Statements . . . . . . . . . 498 | | Intellectual Property and Copyright Statements . . . . . . . . . 506 | |
| | | | |
| 1. Introduction | | 1. Introduction | |
| | | | |
| 1.1. The NFSv4.1 Protocol | | 1.1. The NFSv4.1 Protocol | |
| | | | |
| The NFSv4.1 protocol is a minor version of the NFSv4 protocol | | The NFSv4.1 protocol is a minor version of the NFSv4 protocol | |
| described in [2]. It generally follows the guidelines for the minor | | described in [2]. It generally follows the guidelines for the minor | |
| versioning model laid out in Section 10 of RFC 3530. However, it | | versioning model laid out in Section 10 of RFC 3530. However, it | |
| diverges from guidelines 11 ("a client and server that supports minor | | diverges from guidelines 11 ("a client and server that supports minor | |
| version X must support minor versions 0 through X-1"), and 12 ("no | | version X must support minor versions 0 through X-1"), and 12 ("no | |
| | | | |
| skipping to change at page 15, line 22 | | skipping to change at page 15, line 22 | |
| the holder that inconsistent directory modifications cannot occur | | the holder that inconsistent directory modifications cannot occur | |
| so long as the delegation is held. | | so long as the delegation is held. | |
| | | | |
| o Layouts, which are recallable objects that assure the holder that | | o Layouts, which are recallable objects that assure the holder that | |
| access to the file data may be performed directly by the | | access to the file data may be performed directly by the | |
| client and that no change to the data's location inconsistent with | | client and that no change to the data's location inconsistent with | |
| that access may be made so long as the layout is held. | | that access may be made so long as the layout is held. | |
| | | | |
| All locks for a given client are tied together under a single client- | | All locks for a given client are tied together under a single client- | |
| wide lease. All requests made on sessions associated with the client | | wide lease. All requests made on sessions associated with the client | |
|
| renew that lease. When leases are not promptly renewed lock are | | renew that lease. When leases are not promptly renewed locks are | |
| subject to revocation. In the event of server reinitialization, | | subject to revocation. In the event of server re-initialization, | |
| clients have the opportunity to safely reclaim their locks within a | | clients have the opportunity to safely reclaim their locks within a | |
| special grace period. | | special grace period. | |
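   As a purely illustrative sketch of the lease model above (the C
   structure and helper names are invented for this example and are not
   part of the protocol), a server might track lease renewal and expiry
   roughly as follows:

      #include <stdbool.h>
      #include <time.h>

      /* Hypothetical per-client lease record. */
      struct client_lease {
          time_t   last_renewal;   /* set by any request on the client's sessions */
          unsigned lease_seconds;  /* server-wide fixed lease interval */
      };

      /* Every request on a session associated with the client renews
       * the single client-wide lease. */
      static void renew_lease(struct client_lease *cl)
      {
          cl->last_renewal = time(NULL);
      }

      /* Once the lease has expired, the client's locks become subject
       * to revocation by the server. */
      static bool lease_expired(const struct client_lease *cl)
      {
          return time(NULL) > cl->last_renewal + (time_t)cl->lease_seconds;
      }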
| | | | |
| 1.5. General Definitions | | 1.5. General Definitions | |
| | | | |
| The following definitions provide an appropriate context for the | | The following definitions provide an appropriate context for the | |
| reader. | | reader. | |
| | | | |
| Client The "client" is the entity that accesses the NFS server's | | Client The "client" is the entity that accesses the NFS server's | |
| resources. The client may be an application which contains the | | resources. The client may be an application which contains the | |
| logic to access the NFS server directly. The client may also be | | logic to access the NFS server directly. The client may also be | |
| the traditional operating system client remote file system | | the traditional operating system client remote file system | |
| services for a set of applications. | | services for a set of applications. | |
| | | | |
| A client is uniquely identified by a Client Owner. | | A client is uniquely identified by a Client Owner. | |
| | | | |
|
| In the case of file locking the client is the entity that | | With reference to file locking, the client is also the entity that | |
| maintains a set of locks on behalf of one or more applications. | | maintains a set of locks on behalf of one or more applications. | |
| This client is responsible for crash or failure recovery for those | | This client is responsible for crash or failure recovery for those | |
| locks it manages. | | locks it manages. | |
| | | | |
| Note that multiple clients may share the same transport and | | Note that multiple clients may share the same transport and | |
| connection and multiple clients may exist on the same network | | connection and multiple clients may exist on the same network | |
| node. | | node. | |
| | | | |
| Client ID A 64-bit quantity used as a unique, short-hand reference | | Client ID A 64-bit quantity used as a unique, short-hand reference | |
| to a client-supplied Verifier and client owner. The server is | | to a client-supplied Verifier and client owner. The server is | |
| responsible for supplying the client ID. | | responsible for supplying the client ID. | |
| | | | |
| Client Owner The client owner is a unique string, opaque to the | | Client Owner The client owner is a unique string, opaque to the | |
| server, which identifies a client. Multiple network connections | | server, which identifies a client. Multiple network connections | |
|
| and source network addresses originating those connections may | | and source network addresses originating from those connections | |
| share a client owner. The server is expected to treat requests | | may share a client owner. The server is expected to treat | |
| from connections with the same client owner has coming from the | | requests from connections with the same client owner as coming | |
| same client. | | from the same client. | |
| | | | |
| Lease An interval of time defined by the server for which the client | | Lease An interval of time defined by the server for which the client | |
| is irrevocably granted a lock. At the end of a lease period the | | is irrevocably granted a lock. At the end of a lease period the | |
| lock may be revoked if the lease has not been extended. The lock | | lock may be revoked if the lease has not been extended. The lock | |
| must be revoked if a conflicting lock has been granted after the | | must be revoked if a conflicting lock has been granted after the | |
| lease interval. | | lease interval. | |
| | | | |
| All leases granted by a server have the same fixed interval. Note | | All leases granted by a server have the same fixed interval. Note | |
| that the fixed interval was chosen to alleviate the expense a | | that the fixed interval was chosen to alleviate the expense a | |
| server would have in maintaining state about variable length | | server would have in maintaining state about variable length | |
| leases across server failures. | | leases across server failures. | |
| | | | |
|
| Lock The term "lock" is used to refer to any of record (octet-range) | | Lock The term "lock" is used to refer to record (octet-range) locks, | |
| locks, share reservations, delegations or layouts unless | | share reservations, delegations or layouts unless specifically | |
| specifically stated otherwise. | | stated otherwise. | |
| | | | |
| Server The "Server" is the entity responsible for coordinating | | Server The "Server" is the entity responsible for coordinating | |
|
| client access to a set of file systems. A server can span | | client access to a set of file systems and is identified by a | |
| multiple network addresses. In NFSv4.1, a server is a two tiered | | Server Owner. A server can span multiple network addresses. | |
| entity allows for servers consisting of multiple components the | | | |
| flexibility to tightly or loosely couple their components without | | | |
| requiring tight synchronization among the components. Every | | | |
| server has a "Server Owner" which reflects the two tiers of a | | | |
| server entity. | | | |
| | | | |
| Server Owner The "Server Owner" identifies the server to the client. | | Server Owner The "Server Owner" identifies the server to the client. | |
| The server owner consists of a major and minor identifier. When | | The server owner consists of a major and minor identifier. When | |
| the client has two connections each to a peer with the same major | | the client has two connections each to a peer with the same major | |
|
| and minor identifier, the client assumes both peers are the same | | identifier, the client assumes both peers are the same server (the | |
| server (the server namespace is the same via each connection), and | | server namespace is the same via each connection), and assumes | |
| further assumes session and lock state is sharable across both | | that lock state is sharable across both connections. When each | |
| connections. When each peer has the same major identifier but | | peer has both the same major and minor identifiers, the client | |
| different minor identifier, the client assumes both peers can | | assumes each connection might be associable with the same session. | |
| serve the same namespace, but session and lock state is not | | | |
| sharable across both connections. | | | |
| | | | |
| Stable Storage NFS version 4 servers must be able to recover without | | Stable Storage NFS version 4 servers must be able to recover without | |
| data loss from multiple power failures (including cascading power | | data loss from multiple power failures (including cascading power | |
| failures, that is, several power failures in quick succession), | | failures, that is, several power failures in quick succession), | |
| operating system failures, and hardware failure of components | | operating system failures, and hardware failure of components | |
| other than the storage medium itself (for example, disk, | | other than the storage medium itself (for example, disk, | |
| nonvolatile RAM). | | nonvolatile RAM). | |
| | | | |
| Some examples of stable storage that are allowable for an NFS | | Some examples of stable storage that are allowable for an NFS | |
| server include: | | server include: | |
| | | | |
| skipping to change at page 18, line 28 | | skipping to change at page 18, line 22 | |
| | | | |
| Previous NFS versions have been thought of as having a host-based | | Previous NFS versions have been thought of as having a host-based | |
| authentication model, where the NFS server authenticates the NFS | | authentication model, where the NFS server authenticates the NFS | |
| client, and trusts the client to authenticate all users. Actually, | | client, and trusts the client to authenticate all users. Actually, | |
| NFS has always depended on RPC for authentication. The first form of | | NFS has always depended on RPC for authentication. The first form of | |
| RPC authentication required a host-based authentication | | RPC authentication required a host-based authentication | |
| approach. NFSv4.1 also depends on RPC for basic security services, | | approach. NFSv4.1 also depends on RPC for basic security services, | |
| and mandates RPC support for a user-based authentication model. The | | and mandates RPC support for a user-based authentication model. The | |
| user-based authentication model has user principals authenticated by | | user-based authentication model has user principals authenticated by | |
| a server, and in turn the server authenticated by user principals. | | a server, and in turn the server authenticated by user principals. | |
|
| RPC provides some basic security services which are used by NFSv4. | | RPC provides some basic security services which are used by NFSv4.1. | |
| | | | |
| 2.2.1.1. RPC Security Flavors | | 2.2.1.1. RPC Security Flavors | |
| | | | |
| As described in section 7.2 "Authentication" of [4], RPC security is | | As described in section 7.2 "Authentication" of [4], RPC security is | |
| encapsulated in the RPC header, via a security or authentication | | encapsulated in the RPC header, via a security or authentication | |
| flavor, and information specific to the specification of the security | | flavor, and information specific to the specification of the security | |
| flavor. Every RPC header conveys information used to identify and | | flavor. Every RPC header conveys information used to identify and | |
| authenticate a client and server. As discussed in Section 2.2.1.1.1, | | authenticate a client and server. As discussed in Section 2.2.1.1.1, | |
| some security flavors provide additional security services. | | some security flavors provide additional security services. | |
| | | | |
| | | | |
| skipping to change at page 22, line 20 | | skipping to change at page 22, line 20 | |
| | | | |
| Except for a small number of operations needed for session creation, | | Except for a small number of operations needed for session creation, | |
| server requests and callback requests are performed within the | | server requests and callback requests are performed within the | |
| context of a session. Sessions provide a client context for every | | context of a session. Sessions provide a client context for every | |
| request and support robust reply protection for non-idempotent | | request and support robust reply protection for non-idempotent | |
| requests. | | requests. | |
| | | | |
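| | | As a non-normative illustration of the request structure described | |
| | | above, the following sketch (in Python, using hypothetical stand-in | |
| | | types rather than any real NFS library) shows a COMPOUND whose first | |
| | | operation is SEQUENCE, carrying the session context used for reply | |
| | | protection. | |
| | | | |
| | | from dataclasses import dataclass, field | |
| | | from typing import List | |
| | | | |
| | | @dataclass | |
| | | class Sequence4Args:              # stand-in for the XDR SEQUENCE4args | |
| | |     sessionid: bytes              # identifies the session | |
| | |     sequenceid: int               # per-slot sequence number | |
| | |     slotid: int                   # reply cache slot to use | |
| | |     highest_slotid: int | |
| | |     cachethis: bool               # ask the server to cache the reply | |
| | | | |
| | | @dataclass | |
| | | class Compound4Args:              # stand-in for the XDR COMPOUND4args | |
| | |     tag: str | |
| | |     minorversion: int | |
| | |     operations: List[object] = field(default_factory=list) | |
| | | | |
| | | def make_request(sessionid, slotid, sequenceid, other_ops): | |
| | |     # Every steady-state request begins with SEQUENCE; the remaining | |
| | |     # operations (e.g. PUTFH, READ) follow in the same COMPOUND. | |
| | |     seq = Sequence4Args(sessionid, sequenceid, slotid, 0, True) | |
| | |     return Compound4Args("example", 1, [seq] + list(other_ops)) | |
| | | | |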
| 2.4. Client Identifiers and Client Owners | | 2.4. Client Identifiers and Client Owners | |
| | | | |
| For each operation that obtains or depends on locking state, the | | For each operation that obtains or depends on locking state, the | |
|
| specific client must be determinable by the server. In NFSv4, each | | specific client must be identifiable by the server. | |
| distinct client instance is represented by a client ID, which is a | | | |
| 64-bit identifier that identifies a specific client at a given time | | | |
| and which is changed whenever the client re-initializes, and may | | | |
| change when the server re-initializes. Client IDs are used to | | | |
| support lock identification and crash recovery. | | | |
| | | | |
|
| In NFSv4.1, during steady state operation, the client ID associated | | Each distinct client instance is represented by a client ID. A | |
| with each operation is derived from the session (see Section 2.10) on | | client ID is a 64-bit identifier represents a specific client at a | |
| which the operation is issued. Each session is associated with a | | given time. The client ID is changed whenever the client re- | |
| specific client ID at session creation and that client ID then | | initializes, and may change when the server re-initializes. Client | |
| becomes the client ID associated with all requests issued using it. | | IDs are used to support lock identification and crash recovery. | |
| Therefore, unlike NFSv4.0, the only NFSv4.1 operations possible | | | |
| before a client ID is established are those needed to establish the | | During steady state operation, the client ID associated with each | |
| client ID. | | operation is derived from the session (see Section 2.10) on which the | |
| | | operation is issued. A session is associated with a client ID when | |
| | | the session is created. | |
| | | | |
| | | Unlike NFSv4.0, the only NFSv4.1 operations possible before a client | |
| | | ID is established are those needed to establish the client ID. | |
| | | | |
| A sequence of an EXCHANGE_ID operation followed by a CREATE_SESSION | | A sequence of an EXCHANGE_ID operation followed by a CREATE_SESSION | |
| operation using that client ID (eir_clientid as returned from | | operation using that client ID (eir_clientid as returned from | |
|
| EXCHANGE_ID) is required to establish the identification on the | | EXCHANGE_ID) is required to establish and confirm the client ID on | |
| server. Establishment of identification by a new incarnation of the | | the server. Establishment of identification by a new incarnation of | |
| client also has the effect of immediately releasing any locking state | | the client also has the effect of immediately releasing any locking | |
| that a previous incarnation of that same client might have had on the | | state that a previous incarnation of that same client might have had | |
| server. Such released state would include all lock, share | | on the server. Such released state would include all lock, share | |
| reservation, layout state, and where the server is not supporting the | | reservation, layout state, and where the server is not supporting the | |
| CLAIM_DELEGATE_PREV claim type, all delegation state associated with | | CLAIM_DELEGATE_PREV claim type, all delegation state associated with | |
|
| same client with the same identity. For discussion of delegation | | the same client with the same identity. For discussion of delegation | |
| state recovery, see Section 10.2.1. For discussion of layout state | | state recovery, see Section 10.2.1. For discussion of layout state | |
| recovery see Section 13.7.1. | | recovery see Section 13.7.1. | |
| | | | |
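| | | The establishment sequence described above can be sketched as | |
| | | follows. This is a non-normative sketch: exchange_id() and | |
| | | create_session() are hypothetical wrappers around the corresponding | |
| | | NFSv4.1 operations, not functions from any real library. | |
| | | | |
| | | def establish_client_id(conn, co_verifier, co_ownerid): | |
| | |     # EXCHANGE_ID returns the server-assigned client ID | |
| | |     # (eir_clientid) together with server owner and scope values. | |
| | |     eir = exchange_id(conn, client_owner=(co_verifier, co_ownerid)) | |
| | |     # CREATE_SESSION using that client ID confirms it; only then | |
| | |     # can ordinary SEQUENCE-bearing requests be issued. | |
| | |     session = create_session(conn, clientid=eir.clientid, | |
| | |                              sequenceid=eir.sequenceid) | |
| | |     return eir.clientid, session | |
| | | | |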
| Releasing such state requires that the server be able to determine | | Releasing such state requires that the server be able to determine | |
| that one client instance is the successor of another. Where this | | that one client instance is the successor of another. Where this | |
| cannot be done, for any of a number of reasons, the locking state | | cannot be done, for any of a number of reasons, the locking state | |
| will remain for a time subject to lease expiration (see Section 8.3) | | will remain for a time subject to lease expiration (see Section 8.3) | |
| and the new client will need to wait for such state to be removed, if | | and the new client will need to wait for such state to be removed, if | |
| it makes conflicting lock requests. | | it makes conflicting lock requests. | |
| | | | |
| Client identification is encapsulated in the following Client Owner | | Client identification is encapsulated in the following Client Owner | |
| structure: | | structure: | |
| | | | |
| struct client_owner4 { | | struct client_owner4 { | |
| verifier4 co_verifier; | | verifier4 co_verifier; | |
| opaque co_ownerid<NFS4_OPAQUE_LIMIT>; | | opaque co_ownerid<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
|
| The first field, co_verifier, is a client incarnation verifier that | | The first field, co_verifier, is a client incarnation verifier. The | |
| is used to detect client reboots. Only if the co_verifier is | | server will start the process of canceling the client's leased state | |
| different from that the server had previously recorded for the client | | if co_verifier is different from what the server has previously | |
| (as identified by the second field of the structure, co_ownerid) does | | recorded for the identified client (as specified in the co_ownerid | |
| the server start the process of canceling the client's leased state. | | field). | |
| | | | |
| The second field, co_ownerid, is a variable length string that | | The second field, co_ownerid, is a variable length string that | |
| uniquely defines the client so that subsequent instances of the same | | uniquely defines the client so that subsequent instances of the same | |
| client bear the same co_ownerid with a different verifier. | | client bear the same co_ownerid with a different verifier. | |
| | | | |
| There are several considerations for how the client generates the | | There are several considerations for how the client generates the | |
| co_ownerid string: | | co_ownerid string: | |
| | | | |
| o The string should be unique so that multiple clients do not | | o The string should be unique so that multiple clients do not | |
| present the same string. The consequences of two clients | | present the same string. The consequences of two clients | |
| presenting the same string range from one client getting an error | | presenting the same string range from one client getting an error | |
| to one client having its leased state abruptly and unexpectedly | | to one client having its leased state abruptly and unexpectedly | |
| canceled. | | canceled. | |
| | | | |
| o The string should be selected so that subsequent incarnations (e.g. | | o The string should be selected so that subsequent incarnations (e.g. | |
|
| reboots) of the same client cause the client to present the same | | restarts) of the same client cause the client to present the same | |
| string. The implementor is cautioned against an approach that | | string. The implementor is cautioned against an approach that | |
| requires the string to be recorded in a local file because this | | requires the string to be recorded in a local file because this | |
| precludes the use of the implementation in an environment where | | precludes the use of the implementation in an environment where | |
| there is no local disk and all file access is from an NFS version | | there is no local disk and all file access is from an NFS version | |
| 4 server. | | 4 server. | |
| | | | |
| o The string should be the same for each server network address that | | o The string should be the same for each server network address that | |
| the client accesses (note: the precise opposite was advised in | | the client accesses (note: the precise opposite was advised in | |
| the NFSv4.0 specification [2]). This way, if a server has | | the NFSv4.0 specification [2]). This way, if a server has | |
| multiple interfaces, the client can trunk traffic over multiple | | multiple interfaces, the client can trunk traffic over multiple | |
| | | | |
| skipping to change at page 24, line 43 | | skipping to change at page 24, line 43 | |
| * A true random number. However, since this number ought to be | | * A true random number. However, since this number ought to be | |
| the same between client incarnations, this shares the same | | the same between client incarnations, this shares the same | |
| problem as that of using the timestamp of the software | | problem as that of using the timestamp of the software | |
| installation. | | installation. | |
| | | | |
| o For a user level NFS version 4 client, it should contain | | o For a user level NFS version 4 client, it should contain | |
| additional information to distinguish the client from other user | | additional information to distinguish the client from other user | |
| level clients running on the same host, such as a process | | level clients running on the same host, such as a process | |
| identifier or other unique sequence. | | identifier or other unique sequence. | |
| | | | |
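| | | To illustrate the considerations above, here is one possible, non- | |
| | | normative way a client might build its co_ownerid string.  It | |
| | | assumes the client can obtain a stable per-installation identifier | |
| | | from somewhere other than a local file; all names are illustrative. | |
| | | | |
| | | import os | |
| | | | |
| | | def make_co_ownerid(stable_instance_id: str, | |
| | |                     user_level: bool = False) -> bytes: | |
| | |     # Unique per client, stable across restarts, and the same for | |
| | |     # every server network address the client uses. | |
| | |     parts = ["example-nfs41-client", stable_instance_id] | |
| | |     if user_level: | |
| | |         # A user-level client adds something to distinguish it | |
| | |         # from other user-level clients on the same host. | |
| | |         parts.append(str(os.getpid())) | |
| | |     return "/".join(parts).encode("utf-8") | |
| | | | |
| | | # Example: the instance identifier would be generated once (for | |
| | | # example at installation time) and preserved by the client. | |
| | | ownerid = make_co_ownerid("6f1c2e44-7e2a-4b0c-9a31-0d9f4c8a5e21") | |
| | | | |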
|
| A server may compare a client_owner4 in an EXCHANGE_ID with an | | The client ID is assigned by the server (the eir_clientid result from | |
| nfs_client_id4 established using SETCLIENTID using NFSv4 minor | | EXCHANGE_ID) and should be chosen so that it will not conflict with a | |
| version 0, so that an NFSv4.1 client is not forced to delay until | | client ID previously assigned by the server. This applies across | |
| lease expiration for locking state established by the earlier client | | server restarts. | |
| using minor version 0. This requires the client_owner4 be | | | |
| constructed the same way as the nfs_client_id4. If the latter's | | | |
| contents included the server's network address, and the NFSv4.1 | | | |
| client does not wish to use a client ID that prevents trunking, it | | | |
| should issue two EXCHANGE_ID operations. The first EXCHANGE_ID will | | | |
| have a client_owner4 equal to the nfs_client_id4. This will clear | | | |
| the state created by the NFSv4.0 client. The second EXCHANGE_ID will | | | |
| not have the server's network address. The state created for the | | | |
| second EXCHANGE_ID will not have to wait for lease expiration, | | | |
| because there will be no state to expire. | | | |
| | | | |
| Once an EXCHANGE_ID has been done, and the resulting client ID | | | |
| established as associated with a session, all requests made on that | | | |
| session implicitly identify that client ID, which in turn designates | | | |
| the client specified using the long-form client_owner4 structure. | | | |
| The shorthand client identifier (a client ID) is assigned by the | | | |
| server (the eir_clientid result from EXCHANGE_ID) and should be | | | |
| chosen so that it will not conflict with a client ID previously | | | |
| assigned by the server. This applies across server restarts or | | | |
| reboots. | | | |
| | | | |
| In the event of a server restart, a client may find out that its | | In the event of a server restart, a client may find out that its | |
|
| current client ID is no longer valid when receives a | | current client ID is no longer valid when it receives a | |
| NFS4ERR_STALE_CLIENTID error. The precise circumstances depend of | | NFS4ERR_STALE_CLIENTID error. The precise circumstances depend on | |
| the characteristics of the sessions involved, specifically whether | | the characteristics of the sessions involved, specifically whether | |
| the session is persistent (see Section 2.10.5.5). | | the session is persistent (see Section 2.10.5.5). | |
| | | | |
| When a session is not persistent, the client will need to create a | | When a session is not persistent, the client will need to create a | |
| new session. When the existing client ID is presented to a server as | | new session. When the existing client ID is presented to a server as | |
| part of creating a session and that client ID is not recognized, as | | part of creating a session and that client ID is not recognized, as | |
|
| would happen after a server reboot, the server will reject the | | would happen after a server restart, the server will reject the | |
| request with the error NFS4ERR_STALE_CLIENTID. When this happens, | | request with the error NFS4ERR_STALE_CLIENTID. When this happens, | |
| the client must obtain a new client ID by use of the EXCHANGE_ID | | the client must obtain a new client ID by use of the EXCHANGE_ID | |
|
| operation and then use that client ID as the basis of the basis of a | | operation, then use that client ID as the basis of a new session, and | |
| new session and then proceed to any other necessary recovery for the | | then proceed to any other necessary recovery for the server restart | |
| server reboot case (See Section 8.4.2). | | case (See Section 8.4.2). | |
| | | | |
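| | | A non-normative sketch of the recovery path just described for a | |
| | | non-persistent session follows; exchange_id(), create_session(), | |
| | | reclaim_state() and the NFS4Error type are hypothetical helpers, | |
| | | not a real API. | |
| | | | |
| | | def recover_after_server_restart(conn, old_clientid, client_owner): | |
| | |     try: | |
| | |         # Try to build a new session on the existing client ID. | |
| | |         return create_session(conn, clientid=old_clientid) | |
| | |     except NFS4Error as err: | |
| | |         if err.status != "NFS4ERR_STALE_CLIENTID": | |
| | |             raise | |
| | |         # The client ID did not survive the server restart: obtain | |
| | |         # a new one, create a session on it, then reclaim state | |
| | |         # within the grace period (see Section 8.4.2). | |
| | |         eir = exchange_id(conn, client_owner=client_owner) | |
| | |         session = create_session(conn, clientid=eir.clientid, | |
| | |                                  sequenceid=eir.sequenceid) | |
| | |         reclaim_state(conn, session) | |
| | |         return session | |
| | | | |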
| In the case of the session being persistent, the client will re- | | In the case of the session being persistent, the client will re- | |
|
| establish communication using the existing session after the reboot. | | establish communication using the existing session after the restart. | |
| This session will be associated with a client ID that has had state | | This session will be associated with a client ID that has had state | |
| revoked (but the persistent session is never associated with a stale | | revoked (but the persistent session is never associated with a stale | |
| client ID, because if the session is persistent, the client ID MUST | | client ID, because if the session is persistent, the client ID MUST | |
|
| persist), and the client will receive an indication of that fact in | | persist), and the client will receive an indication of that fact via | |
| the sr_status_flags field returned by the SEQUENCE operation (see | | the SEQ4_STATUS_RESTART_RECLAIM_NEEDED flag returned in the | |
| Section 18.46.4). The client can then use the existing session to do | | sr_status_flags field of the SEQUENCE operation (see Section 18.46.4). | |
| whatever operations are necessary to determine the status of requests | | The client can then use the existing session to do whatever | |
| outstanding at the time of reboot, while avoiding issuing new | | operations are necessary to determine the status of requests | |
| | | outstanding at the time of restart, while avoiding issuing new | |
| requests, particularly any involving locking on that session. Such | | requests, particularly any involving locking on that session. Such | |
| requests would fail with an NFS4ERR_STALE_STATEID error, if | | requests would fail with an NFS4ERR_STALE_STATEID error, if | |
| attempted. | | attempted. | |
| | | | |
| See the detailed descriptions of EXCHANGE_ID (Section 18.35) and | | See the detailed descriptions of EXCHANGE_ID (Section 18.35) and | |
| CREATE_SESSION (Section 18.36) for a complete specification of these | | CREATE_SESSION (Section 18.36) for a complete specification of these | |
| operations. | | operations. | |
| | | | |
|
| 2.4.1. Server Release of Client ID | | 2.4.1. Upgrade from NFSv4.0 to NFSv4.1 | |
| | | | |
| | | To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a | |
| | | client_owner4 in an EXCHANGE_ID with an nfs_client_id4 established | |
| | | using SETCLIENTID in NFSv4.0, so that an NFSv4.1 client is not | |
| | | forced to delay until lease expiration for locking state established | |
| | | by the earlier client using minor version 0. This requires the | |
| | | client_owner4 be constructed the same way as the nfs_client_id4. If | |
| | | the latter's contents included the server's network address, and the | |
| | | NFSv4.1 client does not wish to use a client ID that prevents | |
| | | trunking, it should issue two EXCHANGE_ID operations. The first | |
| | | EXCHANGE_ID will have a client_owner4 equal to the nfs_client_id4. | |
| | | This will clear the state created by the NFSv4.0 client. The second | |
| | | EXCHANGE_ID will not have the server's network address. The state | |
| | | created for the second EXCHANGE_ID will not have to wait for lease | |
| | | expiration, because there will be no state to expire. | |
| | | | |
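| | | The two-step upgrade above might look like the following non- | |
| | | normative sketch, again using a hypothetical exchange_id() wrapper. | |
| | | The first identifier is assumed to be the exact nfs_client_id4 | |
| | | string the NFSv4.0 client used (which embedded the server's network | |
| | | address); the second is the address-independent co_ownerid that the | |
| | | NFSv4.1 client will keep using. | |
| | | | |
| | | def upgrade_from_nfsv40(conn, verifier, v40_ownerid, v41_ownerid): | |
| | |     # First EXCHANGE_ID reuses the NFSv4.0 string so the server | |
| | |     # releases the state held by the minor version 0 client. | |
| | |     exchange_id(conn, client_owner=(verifier, v40_ownerid)) | |
| | |     # Second EXCHANGE_ID presents the trunking-friendly identifier; | |
| | |     # it has no prior state, so nothing waits for lease expiration. | |
| | |     return exchange_id(conn, client_owner=(verifier, v41_ownerid)) | |
| | | | |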
| | | 2.4.2. Server Release of Client ID | |
| | | | |
| NFSv4.1 introduces a new operation called DESTROY_CLIENTID | | NFSv4.1 introduces a new operation called DESTROY_CLIENTID | |
| (Section 18.50) which the client SHOULD use to destroy a client ID it | | (Section 18.50) which the client SHOULD use to destroy a client ID it | |
| no longer needs. This permits graceful, bilateral release of a | | no longer needs. This permits graceful, bilateral release of a | |
|
| client ID. | | client ID. The operation cannot be used if there are sessions | |
| | | associated with the client ID, or state with an unexpired lease. | |
| | | | |
| If the server determines that the client holds no associated state | | If the server determines that the client holds no associated state | |
| for its client ID (including sessions, opens, locks, delegations, | | for its client ID (including sessions, opens, locks, delegations, | |
| layouts, and wants), the server may choose to unilaterally release | | layouts, and wants), the server may choose to unilaterally release | |
| the client ID. The server may make this choice for an inactive | | the client ID. The server may make this choice for an inactive | |
| client so that resources are not consumed by those intermittently | | client so that resources are not consumed by those intermittently | |
| active clients. If the client contacts the server after this | | active clients. If the client contacts the server after this | |
| release, the server must ensure the client receives the appropriate | | release, the server must ensure the client receives the appropriate | |
| error so that it will use the EXCHANGE_ID/CREATE_SESSION sequence to | | error so that it will use the EXCHANGE_ID/CREATE_SESSION sequence to | |
| establish a new identity. It should be clear that the server must be | | establish a new identity. It should be clear that the server must be | |
| very hesitant to release a client ID since the resulting work on the | | very hesitant to release a client ID since the resulting work on the | |
| client to recover from such an event will be the same burden as if | | client to recover from such an event will be the same burden as if | |
| the server had failed and restarted. Typically a server would not | | the server had failed and restarted. Typically a server would not | |
| release a client ID unless there had been no activity from that | | release a client ID unless there had been no activity from that | |
| client for many minutes. As long as there are sessions, opens, | | client for many minutes. As long as there are sessions, opens, | |
| locks, delegations, layouts, or wants, the server MUST NOT release | | locks, delegations, layouts, or wants, the server MUST NOT release | |
|
| the client ID. See Section 2.10.9.1.4 for discussion on releasing | | the client ID. See Section 2.10.10.1.4 for discussion on releasing | |
| inactive sessions. | | inactive sessions. | |
| | | | |
|
| 2.4.2. Resolving Client Owner Conflicts | | 2.4.3. Resolving Client Owner Conflicts | |
| | | | |
| When the server gets an EXCHANGE_ID for a client owner that currently | | When the server gets an EXCHANGE_ID for a client owner that currently | |
|
| has no state, or if it has state, but the lease has expired, server | | has no state, or if it has state, but the lease has expired, the | |
| MUST allow the EXCHANGE_ID, and confirm the new client ID if followed | | server MUST allow the EXCHANGE_ID, and confirm the new client ID if | |
| by the appropriate CREATE_SESSION. | | followed by the appropriate CREATE_SESSION. | |
| | | | |
|
| When the server gets an EXCHANGE_ID for a client owner that currently | | When the server gets an EXCHANGE_ID for a new incarnation of a client | |
| has state and an unexpired lease, the server MUST NOT destroy any | | owner that currently has an old incarnation with state and an | |
| state that currently exists for the client owner unless one of the | | unexpired lease, the server is allowed to dispose of the state of the | |
| following are true: | | previous incarnation of the client owner if one of the following is | |
| | | true: | |
| | | | |
| o The principal that created the client ID for the client owner is | | o The principal that created the client ID for the client owner is | |
| the same as the principal that is issuing the EXCHANGE_ID. Note | | the same as the principal that is issuing the EXCHANGE_ID. Note | |
| that if the client ID was created with SP4_MACH_CRED protection | | that if the client ID was created with SP4_MACH_CRED protection | |
| (Section 18.35), the principal MUST be based on RPCSEC_GSS | | (Section 18.35), the principal MUST be based on RPCSEC_GSS | |
| authentication, the RPCSEC_GSS service used MUST be integrity or | | authentication, the RPCSEC_GSS service used MUST be integrity or | |
| privacy, and the same GSS mechanism and principal must be used as | | privacy, and the same GSS mechanism and principal must be used as | |
| that used when the client ID was created. | | that used when the client ID was created. | |
| | | | |
| o The client ID was established with SP4_SSV protection | | o The client ID was established with SP4_SSV protection | |
|
| (Section 18.35), and the client sends the EXCHANGE_ID with the | | (Section 18.35, Section 2.10.7.3) and the client sends the | |
| security flavor set to RPCSEC_GSS using the GSS SSV mechanism | | EXCHANGE_ID with the security flavor set to RPCSEC_GSS using the | |
| (Section 2.10.7.4). Note that this is possible only if the server | | GSS SSV mechanism (Section 2.10.8). | |
| and client persist the SSV. | | | |
| | | | |
| o The client ID was established with SP4_SSV protection. Because | | o The client ID was established with SP4_SSV protection. Because | |
| the SSV might not be persisted across client and server restart, | | the SSV might not be persisted across client and server restart, | |
| and because the first time a client issues EXCHANGE_ID to a server | | and because the first time a client issues EXCHANGE_ID to a server | |
| it does not have an SSV, the client MAY issue the subsequent | | it does not have an SSV, the client MAY issue the subsequent | |
| EXCHANGE_ID without an SSV RPCSEC_GSS handle. Instead, as with | | EXCHANGE_ID without an SSV RPCSEC_GSS handle. Instead, as with | |
| SP4_MACH_CRED protection, the principal MUST be based on | | SP4_MACH_CRED protection, the principal MUST be based on | |
| RPCSEC_GSS authentication, the RPCSEC_GSS service used MUST be | | RPCSEC_GSS authentication, the RPCSEC_GSS service used MUST be | |
| integrity or privacy, and the same GSS mechanism and principal | | integrity or privacy, and the same GSS mechanism and principal | |
| must be used as that used when the client ID was created. | | must be used as that used when the client ID was created. | |
| | | | |
|
| If the none of the above situations apply, the server MUST return | | If none of the above situations apply, the server MUST return | |
| NFS4ERR_CLID_INUSE. | | NFS4ERR_CLID_INUSE. | |
| | | | |
|
| Even the server accepts the principal and co_ownerid as matching that | | If the server accepts the principal and co_ownerid as matching that | |
| which created the client ID, it MUST NOT delete any state unless the | | which created the client ID, it deletes state (upon a | |
| co_verifier in the EXCHANGE_ID does not match the co_verifier used | | CREATE_SESSION confirming the client ID) if the co_verifier in the | |
| when client ID was created. If the co_verifier matches, then the | | EXCHANGE_ID differs from the co_verifier used when the client ID was | |
| client is either updating properties of the client ID, or possibly | | created. If the co_verifier values are the same, then the client is | |
| attempting trunking opportunity (Section 2.10.4). | | either updating properties of the client ID (Section 18.35), or | |
| | | possibly attempting trunking (Section 2.10.4), and the server MUST NOT | |
| | | delete state. | |
| | | | |
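| | | The rules in this section can be summarized by the following non- | |
| | | normative decision sketch.  The record fields and the principal | |
| | | comparison are simplifications; in particular the SP4_MACH_CRED and | |
| | | SP4_SSV cases also constrain the GSS mechanism and service, as | |
| | | described in the list above. | |
| | | | |
| | | def exchange_id_disposition(existing, co_verifier, principal_ok): | |
| | |     # 'existing' is None, or a dict describing the server's record | |
| | |     # of the client owner: {"co_verifier": ..., "lease_expired": ...} | |
| | |     if existing is None or existing["lease_expired"]: | |
| | |         return "allow; confirm the new client ID via CREATE_SESSION" | |
| | |     if not principal_ok: | |
| | |         return "NFS4ERR_CLID_INUSE" | |
| | |     if co_verifier != existing["co_verifier"]: | |
| | |         # A new incarnation of the client: the old incarnation's | |
| | |         # state may be disposed of once CREATE_SESSION confirms | |
| | |         # the client ID. | |
| | |         return "allow; release the old incarnation's state" | |
| | |     # Same verifier: the client is updating client ID properties or | |
| | |     # attempting trunking; the server MUST NOT delete state. | |
| | |     return "allow; keep existing state" | |
| | | | |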
| 2.5. Server Owners | | 2.5. Server Owners | |
| | | | |
| The Server Owner is somewhat similar to a Client Owner (Section 2.4), | | The Server Owner is somewhat similar to a Client Owner (Section 2.4), | |
| but unlike the Client Owner, there is no shorthand serverid. The | | but unlike the Client Owner, there is no shorthand serverid. The | |
| Server Owner is defined in the following structure: | | Server Owner is defined in the following structure: | |
| | | | |
| struct server_owner4 { | | struct server_owner4 { | |
| uint64_t so_minor_id; | | uint64_t so_minor_id; | |
| opaque so_major_id<NFS4_OPAQUE_LIMIT>; | | opaque so_major_id<NFS4_OPAQUE_LIMIT>; | |
| | | | |
| skipping to change at page 42, line 18 | | skipping to change at page 42, line 23 | |
| different EXCHANGE_ID requests, and the eir_clientid, | | different EXCHANGE_ID requests, and the eir_clientid, | |
| eir_server_owner.so_major_id, eir_server_owner.so_minor_id, and | | eir_server_owner.so_major_id, eir_server_owner.so_minor_id, and | |
| eir_server_scope results match in both EXCHANGE_ID results, then | | eir_server_scope results match in both EXCHANGE_ID results, then | |
| the client is permitted to perform session trunking. If the | | the client is permitted to perform session trunking. If the | |
| client has no session mapping to the tuple of eir_clientid, | | client has no session mapping to the tuple of eir_clientid, | |
| eir_server_owner.so_major_id, eir_server_scope, | | eir_server_owner.so_major_id, eir_server_scope, | |
| eir_server_owner.so_minor_id, then it creates the session via a | | eir_server_owner.so_minor_id, then it creates the session via a | |
| CREATE_SESSION operation over one of the connections, which | | CREATE_SESSION operation over one of the connections, which | |
| associates the connection to the session. If there is a session | | associates the connection to the session. If there is a session | |
| for the tuple, the client can issue BIND_CONN_TO_SESSION to | | for the tuple, the client can issue BIND_CONN_TO_SESSION to | |
|
| associate the connection to the session. The client can invoke | | associate the connection to the session. Alternatively, if the | |
| CREATE_SESSION regardless whether there is session for the tuple. | | client does not want to use session trunking, it can invoke | |
| The second connection is associated with the same session as the | | CREATE_SESSION on the connection. | |
| first connection via the BIND_CONN_TO_SESSION operation. | | | |
| | | | |
| Client ID Trunking If the eia_clientowner argument is the same in | | Client ID Trunking If the eia_clientowner argument is the same in | |
| two different EXCHANGE_ID requests, and the eir_clientid, | | two different EXCHANGE_ID requests, and the eir_clientid, | |
| eir_server_owner.so_major_id, and eir_server_scope results match | | eir_server_owner.so_major_id, and eir_server_scope results match | |
| in both EXCHANGE_ID results, but the eir_server_owner.so_minor_id | | in both EXCHANGE_ID results, but the eir_server_owner.so_minor_id | |
| results do not match, then the client is permitted to perform | | results do not match, then the client is permitted to perform | |
| client ID trunking. The client can associate each connection with | | client ID trunking. The client can associate each connection with | |
| different sessions, where each session is associated with the same | | different sessions, where each session is associated with the same | |
| server. Of course, even if the eir_server_owner.so_minor_id | | server. Of course, even if the eir_server_owner.so_minor_id | |
| fields do match, the client is free to employ client ID trunking | | fields do match, the client is free to employ client ID trunking | |
| | | | |
| skipping to change at page 43, line 18 | | skipping to change at page 43, line 22 | |
| SP4_MACH_CRED (Section 18.35) state protection options. For | | SP4_MACH_CRED (Section 18.35) state protection options. For | |
| SP4_SSV, reliable verification depends on a shared secret (the | | SP4_SSV, reliable verification depends on a shared secret (the | |
| SSV) that is established via the SET_SSV (Section 18.47) | | SSV) that is established via the SET_SSV (Section 18.47) | |
| operation. | | operation. | |
| | | | |
| When a new connection is associated with the session (via the | | When a new connection is associated with the session (via the | |
| BIND_CONN_TO_SESSION operation, see Section 18.34), if the client | | BIND_CONN_TO_SESSION operation, see Section 18.34), if the client | |
| specified SP4_SSV state protection for the BIND_CONN_TO_SESSION | | specified SP4_SSV state protection for the BIND_CONN_TO_SESSION | |
| operation, the client MUST issue the BIND_CONN_TO_SESSION with | | operation, the client MUST issue the BIND_CONN_TO_SESSION with | |
| RPCSEC_GSS protection, using integrity or privacy, and an | | RPCSEC_GSS protection, using integrity or privacy, and an | |
|
| RPCSEC_GSS using the GSS SSV mechanism (Section 2.10.7.4). The | | RPCSEC_GSS handle using the GSS SSV mechanism (Section 2.10.8). The | |
| RPCSEC_GSS handle is created by CREATE_SESSION (Section 18.36). | | RPCSEC_GSS handle is created by CREATE_SESSION (Section 18.36). | |
| | | | |
| If the client mistakenly tries to associate a connection to a | | If the client mistakenly tries to associate a connection to a | |
| session of a wrong server, the server will either reject the | | session of a wrong server, the server will either reject the | |
| attempt because it is not aware of the session identifier of the | | attempt because it is not aware of the session identifier of the | |
| BIND_CONN_TO_SESSION arguments, or it will reject the attempt | | BIND_CONN_TO_SESSION arguments, or it will reject the attempt | |
| because the RPCSEC_GSS authentication fails. Even if the server | | because the RPCSEC_GSS authentication fails. Even if the server | |
| mistakenly or maliciously accepts the connection association | | mistakenly or maliciously accepts the connection association | |
| attempt, the RPCSEC_GSS verifier it computes in the response will | | attempt, the RPCSEC_GSS verifier it computes in the response will | |
| not be verified by the client, the client will know it cannot use | | not be verified by the client, the client will know it cannot use | |
| | | | |
| skipping to change at page 44, line 5 | | skipping to change at page 44, line 9 | |
| EXCHANGE_ID. Each time an EXCHANGE_ID is issued with RPCSEC_GSS | | EXCHANGE_ID. Each time an EXCHANGE_ID is issued with RPCSEC_GSS | |
| authentication, the client notes the principal name of the GSS | | authentication, the client notes the principal name of the GSS | |
| target. If the EXCHANGE_ID results indicate client ID trunking is | | target. If the EXCHANGE_ID results indicate client ID trunking is | |
| possible, and the GSS targets' principal names are the same, the | | possible, and the GSS targets' principal names are the same, the | |
| servers are the same and client ID trunking is allowed. | | servers are the same and client ID trunking is allowed. | |
| | | | |
| The second option for verification is to use SP4_SSV protection. | | The second option for verification is to use SP4_SSV protection. | |
| When the client issues EXCHANGE_ID it specifies SP4_SSV | | When the client issues EXCHANGE_ID it specifies SP4_SSV | |
| protection. The first EXCHANGE_ID the client issues always has to | | protection. The first EXCHANGE_ID the client issues always has to | |
| be confirmed by a CREATE_SESSION call. The client then issues | | be confirmed by a CREATE_SESSION call. The client then issues | |
|
| SET_SSV on the sessions. Later the client issues EXCHANGE_ID to a | | SET_SSV. Later the client issues EXCHANGE_ID to a second | |
| second destination network address than the first EXCHANGE_ID was | | destination network address, other than the one the first EXCHANGE_ID | |
| issued with. The client checks that each EXCHANGE_ID reply has | | was issued to. The client checks that each EXCHANGE_ID reply has the same | |
| the same eir_clientid, eir_server_owner.so_major_id, and | | eir_clientid, eir_server_owner.so_major_id, and eir_server_scope. | |
| eir_server_scope. If so, the client verifies the claim by issuing | | If so, the client verifies the claim by issuing a CREATE_SESSION | |
| a CREATE_SESSION to the second destination address, protected with | | to the second destination address, protected with RPCSEC_GSS | |
| RPCSEC_GSS integrity using an RPCSEC_GSS handle returned by the | | integrity using an RPCSEC_GSS handle returned by the second | |
| second EXCHANGE_ID. If the server accept the CREATE_SESSION | | EXCHANGE_ID. If the server accepts the CREATE_SESSION request, and | |
| request, and if the client verifies the RPCSEC_GSS verifier and | | if the client verifies the RPCSEC_GSS verifier and integrity | |
| integrity codes, then the client has proof the second server knows | | codes, then the client has proof the second server knows the SSV, | |
| the SSV, and thus the two servers are the same for the purposes of | | and thus the two servers are the same for the purposes of client | |
| client ID trunking. | | ID trunking. | |
| | | | |
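| | | To summarize how a client compares two EXCHANGE_ID results when | |
| | | deciding what kind of trunking is permitted, here is a non- | |
| | | normative sketch; the dictionaries merely mimic the eir_* result | |
| | | fields.  As the preceding paragraphs explain, the client still has | |
| | | to verify the server's claim (for example via SP4_SSV and a | |
| | | CREATE_SESSION) before relying on it. | |
| | | | |
| | | def trunking_relationship(r1, r2): | |
| | |     # r1, r2: dicts with keys "clientid", "so_major_id", | |
| | |     # "so_minor_id" and "server_scope" from two EXCHANGE_ID replies. | |
| | |     same_server = (r1["server_scope"] == r2["server_scope"] and | |
| | |                    r1["so_major_id"] == r2["so_major_id"] and | |
| | |                    r1["clientid"] == r2["clientid"]) | |
| | |     if not same_server: | |
| | |         return "treat as distinct servers" | |
| | |     if r1["so_minor_id"] == r2["so_minor_id"]: | |
| | |         # Connections may be bound to one session | |
| | |         # (BIND_CONN_TO_SESSION): session trunking. | |
| | |         return "session trunking permitted" | |
| | |     # Same server for client ID purposes, but one session per | |
| | |     # connection: client ID trunking. | |
| | |     return "client ID trunking permitted" | |
| | | | |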
| 2.10.5. Exactly Once Semantics | | 2.10.5. Exactly Once Semantics | |
| | | | |
| Via the session, NFSv4.1 offers exactly once semantics (EOS) for | | Via the session, NFSv4.1 offers exactly once semantics (EOS) for | |
| requests sent over a channel. EOS is supported on both the fore and | | requests sent over a channel. EOS is supported on both the fore and | |
| back channels. | | back channels. | |
| | | | |
| Each COMPOUND or CB_COMPOUND request that is issued with a leading | | Each COMPOUND or CB_COMPOUND request that is issued with a leading | |
| SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | | SEQUENCE or CB_SEQUENCE operation MUST be executed by the receiver | |
| exactly once. This requirement holds regardless of whether the request is | | exactly once. This requirement holds regardless of whether the request is | |
| | | | |
| skipping to change at page 48, line 4 | | skipping to change at page 48, line 8 | |
| request), nonetheless there are considerations for the XID in NFSv4.1 | | request), nonetheless there are considerations for the XID in NFSv4.1 | |
| that are the same as in all previous versions of NFS. The RPC XID | | that are the same as in all previous versions of NFS. The RPC XID | |
| remains in each message and must be formulated in NFSv4.1 requests as | | remains in each message and must be formulated in NFSv4.1 requests as | |
| in any other ONC RPC request. The reasons include: | | in any other ONC RPC request. The reasons include: | |
| | | | |
| o The RPC layer retains its existing semantics and implementation. | | o The RPC layer retains its existing semantics and implementation. | |
| | | | |
| o The requester and replier must be able to interoperate at the RPC | | o The requester and replier must be able to interoperate at the RPC | |
| layer, prior to the NFSv4.1 decoding of the SEQUENCE or | | layer, prior to the NFSv4.1 decoding of the SEQUENCE or | |
| CB_SEQUENCE operation. | | CB_SEQUENCE operation. | |
|
| | | | |
| o If an operation is being used that does not start with SEQUENCE or | | o If an operation is being used that does not start with SEQUENCE or | |
| CB_SEQUENCE (e.g. BIND_CONN_TO_SESSION), then the RPC XID is | | CB_SEQUENCE (e.g. BIND_CONN_TO_SESSION), then the RPC XID is | |
| needed for correct operation to match the reply to the request. | | needed for correct operation to match the reply to the request. | |
| | | | |
| o The SEQUENCE or CB_SEQUENCE operation may generate an error. If | | o The SEQUENCE or CB_SEQUENCE operation may generate an error. If | |
| so, the embedded slot id, sequence id, and sessionid (if present) | | so, the embedded slot id, sequence id, and sessionid (if present) | |
| in the request will not be in the reply, and the requester has | | in the request will not be in the reply, and the requester has | |
|
| only the XID to to match the reply to the request. | | only the XID to match the reply to the request. | |
| | | | |
| Given that well formulated XIDs continue to be required, this begs | | Given that well formulated XIDs continue to be required, this begs | |
| the question: why do SEQUENCE and CB_SEQUENCE replies have a sessionid, | | the question: why do SEQUENCE and CB_SEQUENCE replies have a sessionid, | |
| slot id and sequence id? Having the sessionid in the reply means the | | slot id and sequence id? Having the sessionid in the reply means the | |
| requester does not have to use the XID to look up the sessionid, which | | requester does not have to use the XID to look up the sessionid, which | |
| would be necessary if the connection were associated with multiple | | would be necessary if the connection were associated with multiple | |
| sessions. Having the slot id and sequence id in the reply means the | | sessions. Having the slot id and sequence id in the reply means the | |
| requester does not have to use the XID to look up the slot id and | | requester does not have to use the XID to look up the slot id and | |
| sequence id. Furthermore, since the XID is only 32 bits, it is too | | sequence id. Furthermore, since the XID is only 32 bits, it is too | |
| small to guarantee the re-association of a reply with its request | | small to guarantee the re-association of a reply with its request | |
| | | | |
| skipping to change at page 59, line 38 | | skipping to change at page 59, line 38 | |
| the client need not provide a target GSS principal for the | | the client need not provide a target GSS principal for the | |
| backchannel as it did with NFSv4.0, nor does the server have to | | backchannel as it did with NFSv4.0, nor does the server have to | |
| implement an RPCSEC_GSS initiator as it did with NFSv4.0 [2]. | | implement an RPCSEC_GSS initiator as it did with NFSv4.0 [2]. | |
| | | | |
| The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | | The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL | |
| (Section 18.33) operations allow the client to specify flavor/ | | (Section 18.33) operations allow the client to specify flavor/ | |
| principal combinations. | | principal combinations. | |
| | | | |
| Also note that the SP4_SSV state protection mode (see Section 18.35 | | Also note that the SP4_SSV state protection mode (see Section 18.35 | |
| and Section 2.10.7.3) has the side benefit of providing SSV-derived | | and Section 2.10.7.3) has the side benefit of providing SSV-derived | |
|
| RPCSEC_GSS contexts (Section 2.10.7.4). | | RPCSEC_GSS contexts (Section 2.10.8). | |
| | | | |
| 2.10.7.3. Protection from Unauthorized State Changes | | 2.10.7.3. Protection from Unauthorized State Changes | |
| | | | |
| As described to this point in the specification, the state model of | | As described to this point in the specification, the state model of | |
| NFSv4.1 is vulnerable to an attacker that issues a SEQUENCE operation | | NFSv4.1 is vulnerable to an attacker that issues a SEQUENCE operation | |
| with a forged sessionid and with a slot id that it expects the | | with a forged sessionid and with a slot id that it expects the | |
| legitimate client to use next. When the legitimate client uses the | | legitimate client to use next. When the legitimate client uses the | |
| slot id with the same sequence number, the server returns the | | slot id with the same sequence number, the server returns the | |
| attacker's result from the reply cache which disrupts the legitimate | | attacker's result from the reply cache which disrupts the legitimate | |
| client and thus denies service to it. Similarly an attacker could | | client and thus denies service to it. Similarly an attacker could | |
| | | | |
| skipping to change at page 61, line 31 | | skipping to change at page 61, line 31 | |
| 3. The physical client has multiple users, but the client | | 3. The physical client has multiple users, but the client | |
| implementation has a unique client ID for each user. This is | | implementation has a unique client ID for each user. This is | |
| effectively the same as the second scenario, but a disadvantage | | effectively the same as the second scenario, but a disadvantage | |
| is that each user must be allocated at least one session each, so | | is that each user must be allocated at least one session each, so | |
| the approach suffers from lack of economy. | | the approach suffers from lack of economy. | |
| | | | |
| The SP4_SSV protection option uses a Secret State Verifier (SSV) | | The SP4_SSV protection option uses a Secret State Verifier (SSV) | |
| which is shared between a client and server. The SSV serves as the | | which is shared between a client and server. The SSV serves as the | |
| secret key for an internal (that is, internal to NFSv4.1) GSS | | secret key for an internal (that is, internal to NFSv4.1) GSS | |
| mechanism that uses the secret key for Message Integrity Code (MIC) | | mechanism that uses the secret key for Message Integrity Code (MIC) | |
|
| and Wrap tokens (Section 2.10.7.4). The SP4_SSV protection option is | | and Wrap tokens (Section 2.10.8). The SP4_SSV protection option is | |
| intended for the client that has multiple users, and the system | | intended for the client that has multiple users, and the system | |
| administrator does not wish to configure a permanent machine | | administrator does not wish to configure a permanent machine | |
| credential for each client. The SSV is established on the server via | | credential for each client. The SSV is established on the server via | |
| SET_SSV (see Section 18.47). To prevent eavesdropping, a client | | SET_SSV (see Section 18.47). To prevent eavesdropping, a client | |
| SHOULD issue SET_SSV via RPCSEC_GSS with the privacy service. | | SHOULD issue SET_SSV via RPCSEC_GSS with the privacy service. | |
| Several aspects of the SSV make it intractable for an attacker to | | Several aspects of the SSV make it intractable for an attacker to | |
| guess the SSV, and thus associate rogue connections with a session, | | guess the SSV, and thus associate rogue connections with a session, | |
| and rogue sessions with a client ID: | | and rogue sessions with a client ID: | |
| | | | |
| o The arguments to and results of SET_SSV include digests of the old | | o The arguments to and results of SET_SSV include digests of the old | |
| and new SSV, respectively. | | and new SSV, respectively. | |
| | | | |
| o Because the initial value of the SSV is zero, therefore known, the | | o Because the initial value of the SSV is zero, therefore known, the | |
| client that opts for SP4_SSV protection and opts to apply SP4_SSV | | client that opts for SP4_SSV protection and opts to apply SP4_SSV | |
| protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST issue | | protection to BIND_CONN_TO_SESSION and CREATE_SESSION MUST issue | |
| at least one SET_SSV operation before the first | | at least one SET_SSV operation before the first | |
| BIND_CONN_TO_SESSION operation or before the second CREATE_SESSION | | BIND_CONN_TO_SESSION operation or before the second CREATE_SESSION | |
| operation on a client ID. If it does not, the SSV mechanism will | | operation on a client ID. If it does not, the SSV mechanism will | |
|
| not generate tokens (Section 2.10.7.4). A client SHOULD issue | | not generate tokens (Section 2.10.8). A client SHOULD issue | |
| SET_SSV as soon as a session is created. | | SET_SSV as soon as a session is created. | |
| | | | |
| o A SET_SSV does not replace the SSV with the argument to SET_SSV. | | o A SET_SSV does not replace the SSV with the argument to SET_SSV. | |
| Instead, the current SSV on the server is logically exclusive ORed | | Instead, the current SSV on the server is logically exclusive ORed | |
|
| (XORed) with the argument to SET_SSV. SET_SSV MUST NOT be called | | (XORed) with the argument to SET_SSV. Each time a new principal | |
| with an SSV value that is zero. For this reason, each time a new | | uses a client ID for the first time, the client SHOULD issue a | |
| principal uses a client ID for the first time, the client SHOULD | | SET_SSV with that principal's RPCSEC_GSS credentials, with | |
| issue a SET_SSV with that principal's RPCSEC_GSS credentials, with | | | |
| RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY. | | RPCSEC_GSS service set to RPC_GSS_SVC_PRIVACY. | |
| | | | |
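| | | The XOR update rule mentioned in the list above is easy to show | |
| | | concretely.  This non-normative sketch covers only the arithmetic, | |
| | | not the digests carried in the SET_SSV arguments and results; the | |
| | | 16-octet length is purely illustrative. | |
| | | | |
| | | def apply_set_ssv(current_ssv: bytes, ssa_ssv: bytes) -> bytes: | |
| | |     # New SSV = current SSV XORed with the SET_SSV argument. | |
| | |     assert len(current_ssv) == len(ssa_ssv) | |
| | |     return bytes(a ^ b for a, b in zip(current_ssv, ssa_ssv)) | |
| | | | |
| | | # The initial SSV is all zeroes, so the first SET_SSV effectively | |
| | | # installs the value chosen by the client: | |
| | | ssv = bytes(16) | |
| | | ssv = apply_set_ssv(ssv, b"\x5a" * 16) | |
| | | | |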
| Here are the types of attacks that can be attempted by an attacker | | Here are the types of attacks that can be attempted by an attacker | |
| named Eve on a victim named Bob, and how SP4_SSV protection foils | | named Eve on a victim named Bob, and how SP4_SSV protection foils | |
| each attack: | | each attack: | |
| | | | |
| o Suppose Eve is the first user to log into a legitimate client. | | o Suppose Eve is the first user to log into a legitimate client. | |
| Eve's use of an NFSv4.1 file system will cause an SSV to be | | Eve's use of an NFSv4.1 file system will cause an SSV to be | |
| created via the legitimate client's NFSv4.1 implementation. The | | created via the legitimate client's NFSv4.1 implementation. The | |
| SET_SSV that creates the SSV will be protected by the RPCSEC_GSS | | SET_SSV that creates the SSV will be protected by the RPCSEC_GSS | |
| | | | |
| skipping to change at page 64, line 21 | | skipping to change at page 64, line 19 | |
| If the goal of a counter threat strategy is to prevent a connection | | If the goal of a counter threat strategy is to prevent a connection | |
| hijacker from making unauthorized state changes, then the | | hijacker from making unauthorized state changes, then the | |
| SP4_MACH_CRED protection approach can be used with a client ID per | | SP4_MACH_CRED protection approach can be used with a client ID per | |
| user (i.e. the aforementioned third scenario for machine credential | | user (i.e. the aforementioned third scenario for machine credential | |
| state protection). Each EXCHANGE_ID can specify the all operations | | state protection). Each EXCHANGE_ID can specify the all operations | |
| MUST be protected with the machine credential. The server will then | | MUST be protected with the machine credential. The server will then | |
| reject any subsequent operations on the client ID that do not use | | reject any subsequent operations on the client ID that do not use | |
| RPCSEC_GSS with privacy or integrity and do not use the same | | RPCSEC_GSS with privacy or integrity and do not use the same | |
| credential that created the client ID. | | credential that created the client ID. | |
| | | | |
|
| 2.10.7.4. The SSV GSS Mechanism | | 2.10.8. The SSV GSS Mechanism | |
| | | | |
| The SSV provides the secret key for a mechanism that NFSv4.1 uses for | | The SSV provides the secret key for a mechanism that NFSv4.1 uses for | |
| state protection. Contexts for this mechanism are not established | | state protection. Contexts for this mechanism are not established | |
| via the RPCSEC_GSS protocol. Instead, the contexts are automatically | | via the RPCSEC_GSS protocol. Instead, the contexts are automatically | |
| created when EXCHANGE_ID specifies SP4_SSV protection. The only | | created when EXCHANGE_ID specifies SP4_SSV protection. The only | |
| tokens defined are the PerMsgToken (emitted by GSS_GetMIC) and the | | tokens defined are the PerMsgToken (emitted by GSS_GetMIC) and the | |
| SealedMessage (emitted by GSS_Wrap). | | SealedMessage (emitted by GSS_Wrap). | |
| | | | |
| The mechanism OID for the SSV mechanism is: | | The mechanism OID for the SSV mechanism is: | |
| iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech | | iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech | |
| (1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define | | (1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define | |
| any initial context tokens, the OID can be used to let servers | | any initial context tokens, the OID can be used to let servers | |
| indicate that the SSV mechanism is acceptable whenever the client | | indicate that the SSV mechanism is acceptable whenever the client | |
| issues a SECINFO or SECINFO_NO_NAME operation (see Section 2.6). | | issues a SECINFO or SECINFO_NO_NAME operation (see Section 2.6). | |
| | | | |
|
| | | The SSV mechanism defines four subkeys derived from the SSV value. | |
| | | Each time SET_SSV is invoked, the subkeys are recalculated by the | |
| | | client and server. The four subkeys are calculated from each of | |
| | | the valid ssv_subkey4 enumerated values. The calculation uses the | |
| | | HMAC ([12]) algorithm, using the current SSV as the key, the one-way | |
| | | hash algorithm as negotiated by EXCHANGE_ID, and the input text as | |
| | | represented by the XDR encoded enumeration of type ssv_subkey4. | |
| | | | |
| | | /* Input for computing subkeys */ | |
| | | enum ssv_subkey4 { | |
| | | SSV4_SUBKEY_MIC_I2T = 1, | |
| | | SSV4_SUBKEY_MIC_T2I = 2, | |
| | | SSV4_SUBKEY_SEAL_I2T = 3, | |
| | | SSV4_SUBKEY_SEAL_T2I = 4 | |
| | | }; | |
| | | The subkey derived from SSV4_SUBKEY_MIC_I2T is used for calculating | |
| | | message integrity codes (MICs) that originate from the NFSv4.1 | |
| | | client, whether as part of a request over the fore channel, or a | |
| | | response over the backchannel. The subkey derived from | |
| | | SSV4_SUBKEY_MIC_T2I is used for MICs originating from the NFSv4.1 | |
| | | server. The subkey derived from SSV4_SUBKEY_SEAL_I2T is used for | |
| | | encrypting text originating from the NFSv4.1 client, and the subkey | |
| | | derived from SSV4_SUBKEY_SEAL_T2I is used for encrypting text | |
| | | originating from the NFSv4.1 server. | |
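
As an illustration of the derivation just described, the following
sketch (Python, illustrative only, not part of the protocol) computes
the four subkeys under the assumption that the one way hash algorithm
negotiated by EXCHANGE_ID is SHA-256.

   # Illustrative sketch: derive the four SSV subkeys.  The HMAC key is
   # the current SSV and the input text is the XDR encoding of the
   # ssv_subkey4 enum value (a 4-octet big-endian integer).  SHA-256 is
   # assumed to be the negotiated one way hash algorithm.
   import hmac, hashlib, struct

   SSV4_SUBKEY_MIC_I2T  = 1   # MICs originating from the client
   SSV4_SUBKEY_MIC_T2I  = 2   # MICs originating from the server
   SSV4_SUBKEY_SEAL_I2T = 3   # encryption originating from the client
   SSV4_SUBKEY_SEAL_T2I = 4   # encryption originating from the server

   def derive_subkey(ssv, subkey_enum):
       xdr_enum = struct.pack(">I", subkey_enum)   # XDR enum encoding
       return hmac.new(ssv, xdr_enum, hashlib.sha256).digest()

   def derive_all_subkeys(ssv):
       return {e: derive_subkey(ssv, e)
               for e in (SSV4_SUBKEY_MIC_I2T, SSV4_SUBKEY_MIC_T2I,
                         SSV4_SUBKEY_SEAL_I2T, SSV4_SUBKEY_SEAL_T2I)}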
| | | | |
| | | The field smt_hmac is an HMAC calculated by using the subkey derived | |
| | | from SSV4_SUBKEY_MIC_I2T or SSV4_SUBKEY_MIC_T2I as the key, the one | |
| | | way hash algorithm as negotiated by EXCHANGE_ID, and the input text | |
| | | as represented by data of type ssv_mic_plain_tkn4. The field | |
| | | smpt_ssv_seq is the same as smt_ssv_seq. The field smpt_orig_plain | |
| | | is the input text as passed into GSS_GetMIC(). | |
| | | | |
| The PerMsgToken description is based on an XDR definition: | | The PerMsgToken description is based on an XDR definition: | |
| | | | |
| /* Input for computing smt_hmac */ | | /* Input for computing smt_hmac */ | |
| struct ssv_mic_plain_tkn4 { | | struct ssv_mic_plain_tkn4 { | |
| uint32_t smpt_ssv_seq; | | uint32_t smpt_ssv_seq; | |
| opaque smpt_orig_plain<>; | | opaque smpt_orig_plain<>; | |
| }; | | }; | |
| | | | |
| /* SSV GSS PerMsgToken token */ | | /* SSV GSS PerMsgToken token */ | |
| struct ssv_mic_tkn4 { | | struct ssv_mic_tkn4 { | |
| uint32_t smt_ssv_seq; | | uint32_t smt_ssv_seq; | |
| opaque smt_hmac<>; | | opaque smt_hmac<>; | |
| }; | | }; | |
| | | | |
| The token emitted by GSS_GetMIC() is XDR encoded and of XDR data type | | The token emitted by GSS_GetMIC() is XDR encoded and of XDR data type | |
| ssv_mic_tkn4. The field smt_ssv_seq comes from the SSV sequence | | ssv_mic_tkn4. The field smt_ssv_seq comes from the SSV sequence | |
|
| number which is equal to 1 after SET_SSV is called the first time on | | number which is equal to 1 after SET_SSV (Section 18.47) is called | |
| a client ID. Thereafter, it is incremented on each SET_SSV. Thus | | the first time on a client ID. Thereafter, it is incremented on each | |
| smt_ssv_seq represents the version of the SSV at the time | | SET_SSV. Thus smt_ssv_seq represents the version of the SSV at the | |
| GSS_GetMIC() was called. This allows the SSV to be changed without | | time GSS_GetMIC() was called. As noted in Section 18.35, the client | |
| serializing all RPC calls that use the SSV mechanism with SET_SSV | | and server can maintain multiple concurrent versions of the SSV. | |
| operations. | | This allows the SSV to be changed without serializing all RPC calls | |
| | | that use the SSV mechanism with SET_SSV operations. | |
| The field smt_hmac is an HMAC ([12]), calculated by using the current | | | |
| SSV as the key, the one way hash algorithm as negotiated by | | | |
| EXCHANGE_ID, and the input text as represented by data of type | | | |
| ssv_mic_plain_tkn4. The field smpt_ssv_seq is the same as | | | |
| smt_ssv_seq. The field smt_orig_plain is the input text as passed | | | |
| into GSS_GetMIC(). | | | |
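
Under the same assumptions as the earlier subkey sketch (SHA-256,
illustrative names), a PerMsgToken for a client-originated message
could be assembled from the XDR definitions above roughly as follows.
The xdr_opaque() helper is a hand-rolled encoder for XDR variable-
length opaque data; none of these names are part of the protocol.

   # Illustrative sketch: build a PerMsgToken (the GSS_GetMIC() output).
   # derive_subkey() and SSV4_SUBKEY_MIC_I2T come from the earlier
   # sketch; a server-originated message would use SSV4_SUBKEY_MIC_T2I.
   import hmac, hashlib, struct

   def xdr_opaque(data):
       # XDR variable-length opaque: 4-octet length, data, zero-fill to
       # a multiple of 4 octets.
       pad = (4 - len(data) % 4) % 4
       return struct.pack(">I", len(data)) + data + b"\x00" * pad

   def per_msg_token(ssv, ssv_seq, orig_plain):
       mic_key = derive_subkey(ssv, SSV4_SUBKEY_MIC_I2T)
       # ssv_mic_plain_tkn4: smpt_ssv_seq followed by smpt_orig_plain
       plain_tkn = struct.pack(">I", ssv_seq) + xdr_opaque(orig_plain)
       smt_hmac = hmac.new(mic_key, plain_tkn, hashlib.sha256).digest()
       # ssv_mic_tkn4: smt_ssv_seq followed by smt_hmac
       return struct.pack(">I", ssv_seq) + xdr_opaque(smt_hmac)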
| | | | |
| The SealedMessage description is based on an XDR definition: | | The SealedMessage description is based on an XDR definition: | |
| | | | |
| /* Input for computing ssct_encr_data and ssct_hmac */ | | /* Input for computing ssct_encr_data and ssct_hmac */ | |
| struct ssv_seal_plain_tkn4 { | | struct ssv_seal_plain_tkn4 { | |
| opaque sspt_confounder<>; | | opaque sspt_confounder<>; | |
| uint32_t sspt_ssv_seq; | | uint32_t sspt_ssv_seq; | |
| opaque sspt_orig_plain<>; | | opaque sspt_orig_plain<>; | |
| opaque sspt_pad<>; | | opaque sspt_pad<>; | |
| }; | | }; | |
| | | | |
| skipping to change at page 66, line 7 | | skipping to change at page 66, line 28 | |
| opaque ssct_hmac<>; | | opaque ssct_hmac<>; | |
| }; | | }; | |
| | | | |
| The token emitted by GSS_Wrap() is XDR encoded and of XDR data type | | The token emitted by GSS_Wrap() is XDR encoded and of XDR data type | |
| ssv_seal_cipher_tkn4. | | ssv_seal_cipher_tkn4. | |
| | | | |
| The ssct_ssv_seq field has the same meaning as smt_ssv_seq. | | The ssct_ssv_seq field has the same meaning as smt_ssv_seq. | |
| | | | |
| The ssct_encr_data field is the result of encrypting a value of the | | The ssct_encr_data field is the result of encrypting a value of the | |
| XDR encoded data type ssv_seal_plain_tkn4. The encryption key is the | | XDR encoded data type ssv_seal_plain_tkn4. The encryption key is the | |
|
| SSV, and the encryption algorithm is that negotiated by EXCHANGE_ID. | | subkey derived from SSV4_SUBKEY_SEAL_I2T or SSV4_SUBKEY_SEAL_T2I, and | |
| | | the encryption algorithm is that negotiated by EXCHANGE_ID. | |
| | | | |
| The ssct_iv field is the initialization vector (IV) for the | | The ssct_iv field is the initialization vector (IV) for the | |
| encryption algorithm (if applicable) and is sent in clear text. The | | encryption algorithm (if applicable) and is sent in clear text. The | |
| content and size of the IV MUST comply with specification of the | | content and size of the IV MUST comply with specification of the | |
| encryption algorithm. For example, the id-aes256-CBC algorithm MUST | | encryption algorithm. For example, the id-aes256-CBC algorithm MUST | |
| use a 16 octet initialization vector (IV) which MUST be unpredictable | | use a 16 octet initialization vector (IV) which MUST be unpredictable | |
| for each instance of a value of type ssv_seal_plain_tkn4 that is | | for each instance of a value of type ssv_seal_plain_tkn4 that is | |
| encrypted with a particular SSV key. | | encrypted with a particular SSV key. | |
| | | | |
| The ssct_hmac field is the result of computing an HMAC using value of | | The ssct_hmac field is the result of computing an HMAC using value of | |
| the XDR encoded data type ssv_seal_plain_tkn4 as the input text. The | | the XDR encoded data type ssv_seal_plain_tkn4 as the input text. The | |
|
| key is the SSV, and the one way hash algorithm is that negotiated by | | key is the subkey derived from SSV4_SUBKEY_MIC_I2T or | |
| EXCHANGE_ID. | | SSV4_SUBKEY_MIC_T2I, and the one way hash algorithm is that | |
| | | negotiated by EXCHANGE_ID. | |
| | | | |
| The sspt_confounder field is a random value. | | The sspt_confounder field is a random value. | |
| | | | |
| The sspt_ssv_seq field is the same as ssct_ssv_seq. | | The sspt_ssv_seq field is the same as ssct_ssv_seq. | |
| | | | |
| The sspt_orig_plain field is the original plaintext as passed to | | The sspt_orig_plain field is the original plaintext as passed to | |
| GSS_Wrap(). | | GSS_Wrap(). | |
| | | | |
| The sspt_pad field is present to support encryption algorithms that | | The sspt_pad field is present to support encryption algorithms that | |
| require inputs to be in fixed sized blocks. The content of sspt_pad | | require inputs to be in fixed sized blocks. The content of sspt_pad | |
| | | | |
| skipping to change at page 67, line 6 | | skipping to change at page 67, line 34 | |
| total encoding of 16 octets. The total number of XDR encoded octets | | total encoding of 16 octets. The total number of XDR encoded octets | |
| is thus 8 + 4 + 20 + 16 = 48. | | is thus 8 + 4 + 20 + 16 = 48. | |
| | | | |
| GSS_Wrap() emits a token that is an XDR encoding of a value of data | | GSS_Wrap() emits a token that is an XDR encoding of a value of data | |
| type ssv_seal_cipher_tkn4. Note that regardless whether the caller | | type ssv_seal_cipher_tkn4. Note that regardless whether the caller | |
| of GSS_Wrap() requests confidentiality or not, the token always has | | of GSS_Wrap() requests confidentiality or not, the token always has | |
| confidentiality. This is because the SSV mechanism is for | | confidentiality. This is because the SSV mechanism is for | |
| RPCSEC_GSS, and RPCSEC_GSS never produces GSS_Wrap() tokens without | | RPCSEC_GSS, and RPCSEC_GSS never produces GSS_Wrap() tokens without | |
| confidentiality. | | confidentiality. | |
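
A corresponding sketch for the SealedMessage fields is given below.
It is illustrative only: it assumes id-aes256-CBC as the negotiated
encryption algorithm, SHA-256 as the negotiated one way hash, a
client-originated message, and the third-party pyca/cryptography
package for AES; xdr_opaque() and derive_subkey() come from the
earlier sketches, and the fields are returned individually rather
than XDR encoded.

   # Illustrative sketch: compute the SealedMessage (GSS_Wrap()) fields.
   import hmac, hashlib, os, struct
   from cryptography.hazmat.primitives.ciphers import (
       Cipher, algorithms, modes)

   BLOCK = 16   # id-aes256-CBC block size in octets

   def sealed_message_fields(ssv, ssv_seq, orig_plain):
       seal_key = derive_subkey(ssv, SSV4_SUBKEY_SEAL_I2T)
       mic_key  = derive_subkey(ssv, SSV4_SUBKEY_MIC_I2T)

       # ssv_seal_plain_tkn4: sspt_confounder, sspt_ssv_seq,
       # sspt_orig_plain, sspt_pad.  The pad is sized so that the whole
       # XDR encoding is a multiple of the cipher block size (zero fill
       # is an assumption here).
       confounder = os.urandom(8)
       body = (xdr_opaque(confounder) + struct.pack(">I", ssv_seq) +
               xdr_opaque(orig_plain))
       pad_len = (-(len(body) + 4)) % BLOCK   # +4 for the pad length field
       plain_tkn = body + xdr_opaque(b"\x00" * pad_len)

       iv = os.urandom(BLOCK)                 # unpredictable per token
       enc = Cipher(algorithms.AES(seal_key), modes.CBC(iv)).encryptor()
       ssct_encr_data = enc.update(plain_tkn) + enc.finalize()
       ssct_hmac = hmac.new(mic_key, plain_tkn, hashlib.sha256).digest()

       return {"ssct_ssv_seq": ssv_seq, "ssct_iv": iv,
               "ssct_encr_data": ssct_encr_data, "ssct_hmac": ssct_hmac}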
| | | | |
|
| Effectively there is a single GSS context for all RPCSEC_GSS handles | | Effectively there is a single GSS context for a single client ID. | |
| that have been created on a session. And all sessions associated | | All RPCSEC_GSS handles share the same GSS context. SSV GSS contexts | |
| with a a client ID share the same SSV. SSV GSS contexts do not | | do not expire except when the SSV is destroyed (causes would include | |
| expire except when the SSV is destroyed (causes would include the | | the client ID being destroyed or a server restart). Since one | |
| client ID being destroyed or a server restart). Since one purpose of | | purpose of context expiration is to replace keys that have been in | |
| context expiration is to replace keys that have been in use for "too | | use for "too long" hence vulnerable to compromise by brute force or | |
| long" hence vulnerable to compromise by brute force or accident, the | | accident, the client can issue periodic SET_SSV operations, by | |
| client can issue periodic SET_SSV operations, by cycling through | | cycling through different users' RPCSEC_GSS credentials. This way | |
| different users' RPCSEC_GSS credentials. This way the SSV is | | the SSV is replaced without destroying the SSV's GSS contexts. | |
| replaced without destroying the SSV's GSS contexts. If for some | | | |
| reason SSV RPCSEC_GSS handles expire, the EXCHANGE_ID operation can | | | |
| be used to create more SSV RPCSEC_GSS handles. | | | |
| | | | |
|
| The client MUST establish an SSV via SET_SSV before the GSS context | | SSV RPCSEC_GSS handles can be expired or deleted by the server at any | |
| can be used to emit tokens from GSS_Wrap() and GSS_GetMIC(). If | | time and the EXCHANGE_ID operation can be used to create more SSV | |
| SET_SSV has not been successfully called, attempts to emit tokens | | RPCSEC_GSS handles. | |
| | | | |
| | | The client MUST establish an SSV via SET_SSV before the SSV GSS | |
| | | context can be used to emit tokens from GSS_Wrap() and GSS_GetMIC(). | |
| | | If SET_SSV has not been successfully called, attempts to emit tokens | |
| MUST fail. | | MUST fail. | |
| | | | |
| The SSV mechanism does not support replay detection and sequencing in | | The SSV mechanism does not support replay detection and sequencing in | |
|
| its tokens because RPCSEC_GSS does not use those features (Section | | its tokens because RPCSEC_GSS does not use those features (See | |
| 5.2.2 "Context Creation Requests" in [5]). | | Section 5.2.2 "Context Creation Requests" in [5]). | |
| | | | |
|
| 2.10.8. Session Mechanics - Steady State | | 2.10.9. Session Mechanics - Steady State | |
| | | | |
|
| 2.10.8.1. Obligations of the Server | | 2.10.9.1. Obligations of the Server | |
| | | | |
| The server has the primary obligation to monitor the state of | | The server has the primary obligation to monitor the state of | |
| backchannel resources that the client has created for the server | | backchannel resources that the client has created for the server | |
| (RPCSEC_GSS contexts and backchannel connections). If these | | (RPCSEC_GSS contexts and backchannel connections). If these | |
| resources vanish, the server takes action as specified in | | resources vanish, the server takes action as specified in | |
|
| Section 2.10.9.2. | | Section 2.10.10.2. | |
| | | | |
|
| 2.10.8.2. Obligations of the Client | | 2.10.9.2. Obligations of the Client | |
| | | | |
| The client SHOULD honor the following obligations in order to utilize | | The client SHOULD honor the following obligations in order to utilize | |
| the session: | | the session: | |
| | | | |
| o Keep a necessary session from going idle on the server. A client | | o Keep a necessary session from going idle on the server. A client | |
| that requires a session, but nonetheless is not sending operations | | that requires a session, but nonetheless is not sending operations | |
| risks having the session be destroyed by the server. This is | | risks having the session be destroyed by the server. This is | |
| because sessions consume resources, and resource limitations may | | because sessions consume resources, and resource limitations may | |
| force the server to cull a session that has not been used for a long | | force the server to cull a session that has not been used for a long | |
| time. [[Comment.6: Tom Talpey disagrees and thinks a server can | | time. [[Comment.6: Tom Talpey disagrees and thinks a server can | |
| | | | |
| skipping to change at page 68, line 4 | | skipping to change at page 68, line 33 | |
| | | | |
| o Keep a necessary session from going idle on the server. A client | | o Keep a necessary session from going idle on the server. A client | |
| that requires a session, but nonetheless is not sending operations | | that requires a session, but nonetheless is not sending operations | |
| risks having the session be destroyed by the server. This is | | risks having the session be destroyed by the server. This is | |
| because sessions consume resources, and resource limitations may | | because sessions consume resources, and resource limitations may | |
| force the server to cull a session that has not been used for a long | | force the server to cull a session that has not been used for a long | |
| time. [[Comment.6: Tom Talpey disagrees and thinks a server can | | time. [[Comment.6: Tom Talpey disagrees and thinks a server can | |
| never cull a session. Mike Eisler doesn't know what the server is | | never cull a session. Mike Eisler doesn't know what the server is | |
| supposed to do when it accumulates a zillion reply caches that no | | supposed to do when it accumulates a zillion reply caches that no | |
| client has touched in a century. :-)]] | | client has touched in a century. :-)]] | |
|
| | | | |
| o Destroy the session when not needed. If a client has multiple | | o Destroy the session when not needed. If a client has multiple | |
| sessions and one of them has no requests waiting for replies, and | | sessions and one of them has no requests waiting for replies, and | |
| has been idle for some period of time, it SHOULD destroy the | | has been idle for some period of time, it SHOULD destroy the | |
| session. | | session. | |
| | | | |
|
| o Maintain GSS contexts for the callback channel. If the client | | o Maintain GSS contexts for the backchannel. If the client requires | |
| requires the server to use the RPCSEC_GSS security flavor for | | the server to use the RPCSEC_GSS security flavor for callbacks, | |
| callbacks, then it needs to be sure the contexts handed to the | | then it needs to be sure the contexts handed to the server via | |
| server via BACKCHANNEL_CTL are unexpired. | | BACKCHANNEL_CTL are unexpired. | |
| | | | |
| o Preserve a connection for a backchannel. The server requires a | | o Preserve a connection for a backchannel. The server requires a | |
| backchannel in order to gracefully recall recallable state, or | | backchannel in order to gracefully recall recallable state, or | |
| notify the client of certain events. Note that if the connection | | notify the client of certain events. Note that if the connection | |
| is not being used for the fore channel, there is no way the client | | is not being used for the fore channel, there is no way the client | |
|
| can tell if the connection is still alive (e.g., the server rebooted | | can tell if the connection is still alive (e.g., the server restarted | |
| without sending a disconnect). The onus is on the server, not the | | without sending a disconnect). The onus is on the server, not the | |
| client, to determine if the backchannel's connection is alive, and | | client, to determine if the backchannel's connection is alive, and | |
| to indicate in the response to a SEQUENCE operation when the last | | to indicate in the response to a SEQUENCE operation when the last | |
| connection associated with a session's backchannel has | | connection associated with a session's backchannel has | |
| disconnected. | | disconnected. | |
| | | | |
|
| 2.10.8.3. Steps the Client Takes To Establish a Session | | 2.10.9.3. Steps the Client Takes To Establish a Session | |
| | | | |
| If the client does not have a client ID, the client issues | | If the client does not have a client ID, the client issues | |
| EXCHANGE_ID to establish a client ID. If it opts for SP4_MACH_CRED | | EXCHANGE_ID to establish a client ID. If it opts for SP4_MACH_CRED | |
| or SP4_SSV protection, in the spo_must_enforce list of operations, it | | or SP4_SSV protection, in the spo_must_enforce list of operations, it | |
| SHOULD at minimum specify: CREATE_SESSION, DESTROY_SESSION, | | SHOULD at minimum specify: CREATE_SESSION, DESTROY_SESSION, | |
| BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it | | BIND_CONN_TO_SESSION, BACKCHANNEL_CTL, and DESTROY_CLIENTID. If it | |
| for SP4_SSV protection, the client needs to ask for SSV-based | | for SP4_SSV protection, the client needs to ask for SSV-based | |
| RPCSEC_GSS handles. | | RPCSEC_GSS handles. | |
| | | | |
| The client uses the client ID to issue a CREATE_SESSION on a | | The client uses the client ID to issue a CREATE_SESSION on a | |
| connection to the server. The results of CREATE_SESSION indicate | | connection to the server. The results of CREATE_SESSION indicate | |
| whether the server will persist the session reply cache through a | | whether the server will persist the session reply cache through a | |
|
| server reboot or not, and the client notes this for future reference. | | server restart or not, and the client notes this for future | |
| | | reference. | |
| | | | |
| If the client specified SP4_SSV state protection when the client ID | | If the client specified SP4_SSV state protection when the client ID | |
| was created, then it SHOULD issue SET_SSV in the first COMPOUND after | | was created, then it SHOULD issue SET_SSV in the first COMPOUND after | |
| the session is created. Each time a new principal goes to use the | | the session is created. Each time a new principal goes to use the | |
| client ID, it SHOULD issue a SET_SSV again. | | client ID, it SHOULD issue a SET_SSV again. | |
| | | | |
| If the client wants to use delegations, layouts, directory | | If the client wants to use delegations, layouts, directory | |
| notifications, or any other state that requires a backchannel, then | | notifications, or any other state that requires a backchannel, then | |
| it must add a connection to the backchannel if CREATE_SESSION did not | | it must add a connection to the backchannel if CREATE_SESSION did not | |
| already do so. The client creates a connection, and calls | | already do so. The client creates a connection, and calls | |
| | | | |
| skipping to change at page 69, line 18 | | skipping to change at page 69, line 48 | |
| | | | |
| If the client wants to use additional connections for the | | If the client wants to use additional connections for the | |
| backchannel, then it must call BIND_CONN_TO_SESSION on each | | backchannel, then it must call BIND_CONN_TO_SESSION on each | |
| connection it wants to use with the session. If the client wants to | | connection it wants to use with the session. If the client wants to | |
| use additional connections for the fore channel, then it must call | | use additional connections for the fore channel, then it must call | |
| BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | | BIND_CONN_TO_SESSION if it specified SP4_SSV or SP4_MACH_CRED state | |
| protection when the client ID was created. | | protection when the client ID was created. | |
| | | | |
| At this point the session has reached steady state. | | At this point the session has reached steady state. | |
| | | | |
|
| 2.10.9. Session Mechanics - Recovery | | 2.10.10. Session Mechanics - Recovery | |
| | | | |
|
| 2.10.9.1. Events Requiring Client Action | | 2.10.10.1. Events Requiring Client Action | |
| | | | |
| The following events require client action to recover. | | The following events require client action to recover. | |
| | | | |
|
| 2.10.9.1.1. RPCSEC_GSS Context Loss by Callback Path | | 2.10.10.1.1. RPCSEC_GSS Context Loss by Callback Path | |
| | | | |
| If all RPCSEC_GSS contexts granted by the client to the server for | | If all RPCSEC_GSS contexts granted by the client to the server for | |
| callback use have expired, the client MUST establish a new context | | callback use have expired, the client MUST establish a new context | |
| via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | | via BACKCHANNEL_CTL. The sr_status_flags field of the SEQUENCE | |
| results indicates when callback contexts are nearly expired, or fully | | results indicates when callback contexts are nearly expired, or fully | |
| expired (see Section 18.46.4). | | expired (see Section 18.46.4). | |
| | | | |
|
| 2.10.9.1.2. Connection Loss | | 2.10.10.1.2. Connection Loss | |
| | | | |
| If the client loses the last connection of the session, and if it | | If the client loses the last connection of the session, and if it | |
| to retain the session, then it must create a new connection, and if, | | to retain the session, then it must create a new connection, and if, | |
| when the client ID was created, BIND_CONN_TO_SESSION was specified in | | when the client ID was created, BIND_CONN_TO_SESSION was specified in | |
| the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | | the spo_must_enforce list, the client MUST use BIND_CONN_TO_SESSION | |
| to associate the connection with the session. | | to associate the connection with the session. | |
| | | | |
| If there was a request outstanding at the time of connection loss, | | If there was a request outstanding at the time of connection loss, | |
| then if the client wants to continue to use the session it MUST | | then if the client wants to continue to use the session it MUST | |
| retry the request, as described in Section 2.10.5.2. Note that it is | | retry the request, as described in Section 2.10.5.2. Note that it is | |
| | | | |
| skipping to change at page 70, line 10 | | skipping to change at page 70, line 43 | |
| disconnect. | | disconnect. | |
| | | | |
| If the connection that was lost was the last one associated with the | | If the connection that was lost was the last one associated with the | |
| backchannel, and the client wants to retain the backchannel and/or | | backchannel, and the client wants to retain the backchannel and/or | |
| not put recallable state subject to revocation, the client must | | not put recallable state subject to revocation, the client must | |
| reconnect, and if it does, it MUST associate the connection to the | | reconnect, and if it does, it MUST associate the connection to the | |
| session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | | session and backchannel via BIND_CONN_TO_SESSION. The server SHOULD | |
| indicate when it has no callback connection via the sr_status_flags | | indicate when it has no callback connection via the sr_status_flags | |
| result from SEQUENCE. | | result from SEQUENCE. | |
| | | | |
|
| 2.10.9.1.3. Backchannel GSS Context Loss | | 2.10.10.1.3. Backchannel GSS Context Loss | |
| | | | |
| Via the sr_status_flags result of the SEQUENCE operation or other | | Via the sr_status_flags result of the SEQUENCE operation or other | |
| means, the client will learn if some or all of the RPCSEC_GSS | | means, the client will learn if some or all of the RPCSEC_GSS | |
| contexts it assigned to the backchannel have been lost. If the | | contexts it assigned to the backchannel have been lost. If the | |
| client wants to retain the backchannel and/or not put recallable | | client wants to retain the backchannel and/or not put recallable | |
| state subject to revocation, the client must use BACKCHANNEL_CTL | | state subject to revocation, the client must use BACKCHANNEL_CTL | |
| to assign new contexts. | | to assign new contexts. | |
| | | | |
|
| 2.10.9.1.4. Loss of Session | | 2.10.10.1.4. Loss of Session | |
| | | | |
| The replier might lose a record of the session. Causes include: | | The replier might lose a record of the session. Causes include: | |
| | | | |
|
| o Replier crash and reboot | | o Replier failure and restart | |
| | | | |
| o A catastrophe that causes the reply cache to be corrupted or lost | | o A catastrophe that causes the reply cache to be corrupted or lost | |
| on the media it was stored on. This applies even if the replier | | on the media it was stored on. This applies even if the replier | |
| indicated in the CREATE_SESSION results that it would persist the | | indicated in the CREATE_SESSION results that it would persist the | |
| cache. | | cache. | |
| | | | |
| o The server purges the session of a client that has been inactive | | o The server purges the session of a client that has been inactive | |
| for a very extended period of time. | | for a very extended period of time. | |
| | | | |
| Loss of reply cache is equivalent to loss of session. The replier | | Loss of reply cache is equivalent to loss of session. The replier | |
| indicates loss of session to the requester by returning | | indicates loss of session to the requester by returning | |
| NFS4ERR_BADSESSION on the next operation that uses the sessionid that | | NFS4ERR_BADSESSION on the next operation that uses the sessionid that | |
| refers to the lost session. | | refers to the lost session. | |
| | | | |
|
| After an event like a server reboot, the client may have lost its | | After an event like a server restart, the client may have lost its | |
| connections. The client assumes for the moment that the session has | | connections. The client assumes for the moment that the session has | |
| not been lost. It reconnects, and if it specified connection | | not been lost. It reconnects, and if it specified connection | |
| association enforcement when the session was created, it invokes | | association enforcement when the session was created, it invokes | |
| BIND_CONN_TO_SESSION using the sessionid. Otherwise, it invokes | | BIND_CONN_TO_SESSION using the sessionid. Otherwise, it invokes | |
| SEQUENCE. If BIND_CONN_TO_SESSION or SEQUENCE returns | | SEQUENCE. If BIND_CONN_TO_SESSION or SEQUENCE returns | |
| NFS4ERR_BADSESSION, the client knows the session was lost. If the | | NFS4ERR_BADSESSION, the client knows the session was lost. If the | |
| connection survives session loss, then the next SEQUENCE operation | | connection survives session loss, then the next SEQUENCE operation | |
| the client issues over the connection will get back | | the client issues over the connection will get back | |
| NFS4ERR_BADSESSION. The client again knows the session was lost. | | NFS4ERR_BADSESSION. The client again knows the session was lost. | |
| | | | |
| | | | |
| skipping to change at page 71, line 13 | | skipping to change at page 71, line 47 | |
| have been performed on the server at the time of session loss. The | | have been performed on the server at the time of session loss. The | |
| client has no general way to recover from this. | | client has no general way to recover from this. | |
| | | | |
| Note that loss of session does not imply loss of lock, open, | | Note that loss of session does not imply loss of lock, open, | |
| delegation, or layout state because locks, opens, delegations, and | | delegation, or layout state because locks, opens, delegations, and | |
| layouts are tied to the client ID and depend on the client ID, not | | layouts are tied to the client ID and depend on the client ID, not | |
| the session. Nor does loss of lock, open, delegation, or layout | | the session. Nor does loss of lock, open, delegation, or layout | |
| state imply loss of session state, because the session depends on the | | state imply loss of session state, because the session depends on the | |
| client ID; loss of client ID however does imply loss of session, | | client ID; loss of client ID however does imply loss of session, | |
| lock, open, delegation, and layout state. See Section 8.4.2. A | | lock, open, delegation, and layout state. See Section 8.4.2. A | |
|
| session can survive a server reboot, but lock recovery may still be | | session can survive a server restart, but lock recovery may still be | |
| needed. | | needed. | |
| | | | |
| It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | | It is possible CREATE_SESSION will fail with NFS4ERR_STALE_CLIENTID | |
|
| (for example the server reboots and does not preserve client ID | | (for example the server restarts and does not preserve client ID | |
| state). If so, the client needs to call EXCHANGE_ID, followed by | | state). If so, the client needs to call EXCHANGE_ID, followed by | |
| CREATE_SESSION. | | CREATE_SESSION. | |
| | | | |
|
| 2.10.9.2. Events Requiring Server Action | | 2.10.10.2. Events Requiring Server Action | |
| | | | |
| The following events require server action to recover. | | The following events require server action to recover. | |
| | | | |
|
| 2.10.9.2.1. Client Crash and Reboot | | 2.10.10.2.1. Client Crash and Restart | |
| | | | |
|
| As described in Section 18.35, a rebooted client issues EXCHANGE_ID | | As described in Section 18.35, a restarted client issues EXCHANGE_ID | |
| in such a way that it causes the server to delete any sessions it had. | | in such a way that it causes the server to delete any sessions it had. | |
| | | | |
|
| 2.10.9.2.2. Client Crash with No Reboot | | 2.10.10.2.2. Client Crash with No Restart | |
| | | | |
| If a client crashes and never comes back, it will never issue | | If a client crashes and never comes back, it will never issue | |
| EXCHANGE_ID with its old client owner. Thus the server has session | | EXCHANGE_ID with its old client owner. Thus the server has session | |
| state that will never be used again. After an extended period of | | state that will never be used again. After an extended period of | |
| time and if the server has resource constraints, it MAY destroy the | | time and if the server has resource constraints, it MAY destroy the | |
| old session as well as locking state. | | old session as well as locking state. | |
| | | | |
|
| 2.10.9.2.3. Extended Network Partition | | 2.10.10.2.3. Extended Network Partition | |
| | | | |
| To the server, the extended network partition may be no different | | To the server, the extended network partition may be no different | |
|
| from a client crash with no reboot (see Section 2.10.9.2.2). Unless | | from a client crash with no restart (see Section 2.10.10.2.2). | |
| the server can discern that there is a network partition, it is free | | Unless the server can discern that there is a network partition, it | |
| to treat the situation as if the client has crashed permanently. | | is free to treat the situation as if the client has crashed | |
| | | permanently. | |
| | | | |
|
| 2.10.9.2.4. Backchannel Connection Loss | | 2.10.10.2.4. Backchannel Connection Loss | |
| | | | |
| If there were callback requests outstanding at the time of a | | If there were callback requests outstanding at the time of a | |
| connection loss, then the server MUST retry the request, as described | | connection loss, then the server MUST retry the request, as described | |
| in Section 2.10.5.2. Note that it is not necessary to retry requests | | in Section 2.10.5.2. Note that it is not necessary to retry requests | |
| over a connection with the same source network address or the same | | over a connection with the same source network address or the same | |
| destination network address as the lost connection. As long as the | | destination network address as the lost connection. As long as the | |
| sessionid, slot id, and sequence id in the retry match that of the | | sessionid, slot id, and sequence id in the retry match that of the | |
| original request, the callback target will recognize the request as a | | original request, the callback target will recognize the request as a | |
| retry even if it did see the request prior to disconnect. | | retry even if it did see the request prior to disconnect. | |
| | | | |
| If the connection lost is the last one associated with the | | If the connection lost is the last one associated with the | |
| backchannel, then the server MUST indicate that in the | | backchannel, then the server MUST indicate that in the | |
| sr_status_flags field of every SEQUENCE reply until the backchannel | | sr_status_flags field of every SEQUENCE reply until the backchannel | |
| is reestablished. There are two situations each of which use | | is reestablished. There are two situations each of which use | |
| different status flags: no connectivity for the session's | | different status flags: no connectivity for the session's | |
| backchannel, and no connectivity for any session backchannel of the | | backchannel, and no connectivity for any session backchannel of the | |
| client. See Section 18.46 for a description of the appropriate flags | | client. See Section 18.46 for a description of the appropriate flags | |
| in sr_status_flags. | | in sr_status_flags. | |
| | | | |
|
| 2.10.9.2.5. GSS Context Loss | | 2.10.10.2.5. GSS Context Loss | |
| | | | |
| The server SHOULD monitor when the number of RPCSEC_GSS contexts | | The server SHOULD monitor when the number of RPCSEC_GSS contexts | |
| assigned to the backchannel reaches one, and that one context is near | | assigned to the backchannel reaches one, and that one context is near | |
| expiry (i.e. between one and two periods of lease time), and indicate | | expiry (i.e. between one and two periods of lease time), and indicate | |
| so in the sr_status_flags field of all SEQUENCE replies. The server | | so in the sr_status_flags field of all SEQUENCE replies. The server | |
| MUST indicate when all of the backchannel's assigned RPCSEC_GSS | | MUST indicate when all of the backchannel's assigned RPCSEC_GSS | |
| contexts have expired in the sr_status_flags field of all SEQUENCE | | contexts have expired in the sr_status_flags field of all SEQUENCE | |
| replies. | | replies. | |
| | | | |
|
| 2.10.10. Parallel NFS and Sessions | | 2.10.11. Parallel NFS and Sessions | |
| | | | |
| A client and server can potentially be a non-pNFS implementation, a | | A client and server can potentially be a non-pNFS implementation, a | |
| metadata server implementation, a data server implementation, or two | | metadata server implementation, a data server implementation, or two | |
| or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | | or three types of implementations. The EXCHGID4_FLAG_USE_NON_PNFS, | |
| EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | | EXCHGID4_FLAG_USE_PNFS_MDS, and EXCHGID4_FLAG_USE_PNFS_DS flags (not | |
| mutually exclusive) are passed in the EXCHANGE_ID arguments and | | mutually exclusive) are passed in the EXCHANGE_ID arguments and | |
| results to allow the client to indicate how it wants to use sessions | | results to allow the client to indicate how it wants to use sessions | |
| created under the client ID, and to allow the server to indicate how | | created under the client ID, and to allow the server to indicate how | |
| it will allow the sessions to be used. See Section 14.1 for pNFS | | it will allow the sessions to be used. See Section 14.1 for pNFS | |
| sessions considerations. | | sessions considerations. | |
| | | | |
| skipping to change at page 76, line 12 | | skipping to change at page 76, line 42 | |
| This data type represents additional information for the device file | | This data type represents additional information for the device file | |
| types NF4CHR and NF4BLK. | | types NF4CHR and NF4BLK. | |
| | | | |
| 3.2.5. fsid4 | | 3.2.5. fsid4 | |
| | | | |
| struct fsid4 { | | struct fsid4 { | |
| uint64_t major; | | uint64_t major; | |
| uint64_t minor; | | uint64_t minor; | |
| }; | | }; | |
| | | | |
|
| 3.2.6. fs_location4 | | 3.2.6. change_policy4 | |
| | | | |
| | | struct change_policy4 { | |
| | | uint64_t cp_major; | |
| | | uint64_t cp_minor; | |
| | | }; | |
| | | | |
| | | The change_policy4 data type is used for the change_policy recommended | |
| | | attribute. It provides change sequencing indication analogous to the | |
| | | change attribute. To enable the server to present a value valid | |
| | | across server re-initialization without requiring persistent storage, | |
| | | two 64-bit quantities are used, allowing one to be a server instance | |
| | | id and the second to be incremented non-persistently, within a given | |
| | | server instance. | |
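
A minimal sketch of how a server might maintain such a value follows;
using the boot time as the instance id and an in-memory counter are
illustrative assumptions, not requirements of the protocol.

   # Illustrative sketch: a non-persistent change_policy value.
   # cp_major is fixed for the life of a server instance (here taken
   # from the start time) and cp_minor is bumped in memory whenever the
   # relevant policy changes.
   import time

   class ChangePolicy:
       def __init__(self):
           self.cp_major = int(time.time())   # server instance id
           self.cp_minor = 0

       def note_policy_change(self):
           self.cp_minor += 1

       def value(self):
           return (self.cp_major, self.cp_minor)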
| | | | |
| | | 3.2.7. fs_location4 | |
| | | | |
| struct fs_location4 { | | struct fs_location4 { | |
| utf8str_cis server<>; | | utf8str_cis server<>; | |
| pathname4 rootpath; | | pathname4 rootpath; | |
| }; | | }; | |
| | | | |
|
| 3.2.7. fs_locations4 | | 3.2.8. fs_locations4 | |
| | | | |
| struct fs_locations4 { | | struct fs_locations4 { | |
| pathname4 fs_root; | | pathname4 fs_root; | |
| fs_location4 locations<>; | | fs_location4 locations<>; | |
| }; | | }; | |
| | | | |
| The fs_location4 and fs_locations4 data types are used for the | | The fs_location4 and fs_locations4 data types are used for the | |
| fs_locations recommended attribute which is used for migration and | | fs_locations recommended attribute which is used for migration and | |
| replication support. | | replication support. | |
| | | | |
|
| 3.2.8. fattr4 | | 3.2.9. fattr4 | |
| | | | |
| struct fattr4 { | | struct fattr4 { | |
| bitmap4 attrmask; | | bitmap4 attrmask; | |
| attrlist4 attr_vals; | | attrlist4 attr_vals; | |
| }; | | }; | |
| | | | |
| The fattr4 structure is used to represent file and directory | | The fattr4 structure is used to represent file and directory | |
| attributes. | | attributes. | |
| | | | |
| The bitmap is a counted array of 32 bit integers used to contain bit | | The bitmap is a counted array of 32 bit integers used to contain bit | |
| values. The position of the integer in the array that contains bit n | | values. The position of the integer in the array that contains bit n | |
| can be computed from the expression (n / 32) and its bit within that | | can be computed from the expression (n / 32) and its bit within that | |
| integer is (n mod 32). | | integer is (n mod 32). | |
| | | | |
| 0 1 | | 0 1 | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
| | count | 31 .. 0 | 63 .. 32 | | | | count | 31 .. 0 | 63 .. 32 | | |
| +-----------+-----------+-----------+-- | | +-----------+-----------+-----------+-- | |
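
The bit positioning rule can be illustrated with a short sketch; the
helper names below are illustrative and not part of the protocol.

   # Illustrative sketch: locate and set attribute bit n in a bitmap4
   # counted array of 32-bit words, per the (n / 32) and (n mod 32)
   # rule above.
   def set_attr_bit(words, n):
       idx, bit = n // 32, n % 32
       while len(words) <= idx:        # grow the counted array as needed
           words.append(0)
       words[idx] |= 1 << bit

   def attr_bit_is_set(words, n):
       idx, bit = n // 32, n % 32
       return idx < len(words) and bool(words[idx] & (1 << bit))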
| | | | |
|
| 3.2.9. change_info4 | | 3.2.10. change_info4 | |
| | | | |
| struct change_info4 { | | struct change_info4 { | |
| bool atomic; | | bool atomic; | |
| changeid4 before; | | changeid4 before; | |
| changeid4 after; | | changeid4 after; | |
| }; | | }; | |
| | | | |
| This structure is used with the CREATE, LINK, REMOVE, RENAME | | This structure is used with the CREATE, LINK, REMOVE, RENAME | |
| operations to let the client know the value of the change attribute | | operations to let the client know the value of the change attribute | |
| for the directory in which the target file system object resides. | | for the directory in which the target file system object resides. | |
| | | | |
|
| 3.2.10. netaddr4 | | 3.2.11. netaddr4 | |
| | | | |
| struct netaddr4 { | | struct netaddr4 { | |
| /* see struct rpcb in RFC 1833 */ | | /* see struct rpcb in RFC 1833 */ | |
| string na_r_netid<>; /* network id */ | | string na_r_netid<>; /* network id */ | |
| string na_r_addr<>; /* universal address */ | | string na_r_addr<>; /* universal address */ | |
| }; | | }; | |
| | | | |
| The netaddr4 structure is used to identify TCP/IP based endpoints. | | The netaddr4 structure is used to identify TCP/IP based endpoints. | |
| The r_netid and r_addr fields are specified in RFC1833 [26], but they | | The r_netid and r_addr fields are specified in RFC1833 [26], but they | |
| are underspecified in RFC1833 [26] as far as what they should look | | are underspecified in RFC1833 [26] as far as what they should look | |
| | | | |
| skipping to change at page 78, line 20 | | skipping to change at page 79, line 20 | |
| representing an IPv6 address as defined in Section 2.2 of RFC1884 | | representing an IPv6 address as defined in Section 2.2 of RFC1884 | |
| [13]. Additionally, the two alternative forms specified in Section | | [13]. Additionally, the two alternative forms specified in Section | |
| 2.2 of RFC1884 [13] are also acceptable. | | 2.2 of RFC1884 [13] are also acceptable. | |
| | | | |
| For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP | | For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP | |
| over IPv6 the value of r_netid is the string "udp6". That this | | over IPv6 the value of r_netid is the string "udp6". That this | |
| document specifies the universal address and netid for UDP/IPv6 does | | document specifies the universal address and netid for UDP/IPv6 does | |
| not imply that UDP/IPv6 is a legal transport for NFSv4.1 (see | | not imply that UDP/IPv6 is a legal transport for NFSv4.1 (see | |
| Section 2.9). | | Section 2.9). | |
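
As an illustration, a client might form the netaddr4 for a TCP over
IPv6 endpoint as sketched below. This assumes the usual universal
address convention of appending the port's high-order and low-order
octets as two decimal fields; the helper name is illustrative only.

   # Illustrative sketch: build (na_r_netid, na_r_addr) for TCP/IPv6.
   def netaddr4_tcp6(ipv6_addr, port):
       r_netid = "tcp6"
       r_addr = "%s.%d.%d" % (ipv6_addr, port >> 8, port & 0xFF)
       return (r_netid, r_addr)

   # Example: netaddr4_tcp6("fe80::1", 2049) yields ("tcp6", "fe80::1.8.1")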
| | | | |
|
| 3.2.11. state_owner4 | | 3.2.12. state_owner4 | |
| | | | |
| struct state_owner4 { | | struct state_owner4 { | |
| clientid4 clientid; | | clientid4 clientid; | |
| opaque owner<NFS4_OPAQUE_LIMIT>; | | opaque owner<NFS4_OPAQUE_LIMIT>; | |
| }; | | }; | |
| | | | |
| typedef state_owner4 open_owner4; | | typedef state_owner4 open_owner4; | |
| typedef state_owner4 lock_owner4; | | typedef state_owner4 lock_owner4; | |
| | | | |
| The state_owner4 data type is the base type for the open_owner4 | | The state_owner4 data type is the base type for the open_owner4 | |
|
| (Section 3.2.11.1) and lock_owner4 (Section 3.2.11.2). NFS4_OPAQUE_LIMIT | | (Section 3.2.12.1) and lock_owner4 (Section 3.2.12.2). NFS4_OPAQUE_LIMIT | |
| is defined as 1024. | | is defined as 1024. | |
| | | | |
|
| 3.2.11.1. open_owner4 | | 3.2.12.1. open_owner4 | |
| | | | |
| This structure is used to identify the owner of open state. | | This structure is used to identify the owner of open state. | |
| | | | |
|
| 3.2.11.2. lock_owner4 | | 3.2.12.2. lock_owner4 | |
| | | | |
| This structure is used to identify the owner of file locking state. | | This structure is used to identify the owner of file locking state. | |
| | | | |
|
| 3.2.12. open_to_lock_owner4 | | 3.2.13. open_to_lock_owner4 | |
| | | | |
| struct open_to_lock_owner4 { | | struct open_to_lock_owner4 { | |
| seqid4 open_seqid; | | seqid4 open_seqid; | |
| stateid4 open_stateid; | | stateid4 open_stateid; | |
| seqid4 lock_seqid; | | seqid4 lock_seqid; | |
| lock_owner4 lock_owner; | | lock_owner4 lock_owner; | |
| }; | | }; | |
| | | | |
| This structure is used for the first LOCK operation done for an | | This structure is used for the first LOCK operation done for an | |
| open_owner4. It provides both the open_stateid and lock_owner such | | open_owner4. It provides both the open_stateid and lock_owner such | |
| that the transition is made from a valid open_stateid sequence to | | that the transition is made from a valid open_stateid sequence to | |
| that of the new lock_stateid sequence. Using this mechanism avoids | | that of the new lock_stateid sequence. Using this mechanism avoids | |
| the confirmation of the lock_owner/lock_seqid pair since it is tied | | the confirmation of the lock_owner/lock_seqid pair since it is tied | |
| to established state in the form of the open_stateid/open_seqid. | | to established state in the form of the open_stateid/open_seqid. | |
| | | | |
|
| 3.2.13. stateid4 | | 3.2.14. stateid4 | |
| | | | |
| struct stateid4 { | | struct stateid4 { | |
| uint32_t seqid; | | uint32_t seqid; | |
| opaque other[12]; | | opaque other[12]; | |
| }; | | }; | |
| | | | |
| This structure is used for the various state sharing mechanisms | | This structure is used for the various state sharing mechanisms | |
| between the client and server. For the client, this data structure | | between the client and server. For the client, this data structure | |
| is read-only. The starting value of the seqid field is undefined. | | is read-only. The starting value of the seqid field is undefined. | |
| The server is required to increment the seqid field monotonically at | | The server is required to increment the seqid field monotonically at | |
| each transition of the stateid. This is important since the client | | each transition of the stateid. This is important since the client | |
| will inspect the seqid in OPEN stateids to determine the order of | | will inspect the seqid in OPEN stateids to determine the order of | |
| OPEN processing done by the server. | | OPEN processing done by the server. | |
| | | | |
|
| 3.2.14. layouttype4 | | 3.2.15. layouttype4 | |
| | | | |
| enum layouttype4 { | | enum layouttype4 { | |
| LAYOUT4_NFSV4_1_FILES = 1, | | LAYOUT4_NFSV4_1_FILES = 1, | |
| LAYOUT4_OSD2_OBJECTS = 2, | | LAYOUT4_OSD2_OBJECTS = 2, | |
| LAYOUT4_BLOCK_VOLUME = 3 | | LAYOUT4_BLOCK_VOLUME = 3 | |
| }; | | }; | |
| | | | |
| A layout type specifies the layout being used. The implication is | | A layout type specifies the layout being used. The implication is | |
| that clients have "layout drivers" that support one or more layout | | that clients have "layout drivers" that support one or more layout | |
| types. The file server advertises the layout types it supports | | types. The file server advertises the layout types it supports | |
| | | | |
| skipping to change at page 80, line 5 | | skipping to change at page 81, line 5 | |
| globally unique and are assigned according to the description in | | globally unique and are assigned according to the description in | |
| Section 22.1; they are maintained by IANA. Types within the range | | Section 22.1; they are maintained by IANA. Types within the range | |
| 0x80000000-0xFFFFFFFF are site specific and for "private use" only. | | 0x80000000-0xFFFFFFFF are site specific and for "private use" only. | |
| | | | |
| The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file | | The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file | |
| layout type is to be used. The LAYOUT4_OSD2_OBJECTS enumeration | | layout type is to be used. The LAYOUT4_OSD2_OBJECTS enumeration | |
| specifies that the object layout, as defined in [29], is to be used. | | specifies that the object layout, as defined in [29], is to be used. | |
| Similarly, the LAYOUT4_BLOCK_VOLUME enumeration specifies that the block/volume | | Similarly, the LAYOUT4_BLOCK_VOLUME enumeration specifies that the block/volume | |
| layout, as defined in [30], is to be used. | | layout, as defined in [30], is to be used. | |
| | | | |
|
| 3.2.15. deviceid4 | | 3.2.16. deviceid4 | |
| | | | |
|
| typedef uint32_t deviceid4; | | typedef uint64_t deviceid4; | |
| | | | |
| Layout information includes device IDs that specify a storage device | | Layout information includes device IDs that specify a storage device | |
| through a compact handle. Addressing and type information is | | through a compact handle. Addressing and type information is | |
| obtained with the GETDEVICEINFO operation. A client must not assume | | obtained with the GETDEVICEINFO operation. A client must not assume | |
| that device IDs are valid across metadata server reboots. The device | | that device IDs are valid across metadata server reboots. The device | |
| ID is qualified by the layout type and is unique per file system | | ID is qualified by the layout type and is unique per file system | |
| (FSID). See Section 13.2.10 for more details. | | (FSID). See Section 13.2.10 for more details. | |
| | | | |
|
| 3.2.16. device_addr4 | | 3.2.17. device_addr4 | |
| | | | |
| struct device_addr4 { | | struct device_addr4 { | |
| layouttype4 da_layout_type; | | layouttype4 da_layout_type; | |
| opaque da_addr_body<>; | | opaque da_addr_body<>; | |
| }; | | }; | |
| | | | |
| The device address is used to set up a communication channel with the | | The device address is used to set up a communication channel with the | |
| storage device. Different layout types will require different types | | storage device. Different layout types will require different types | |
| of structures to define how they communicate with storage devices. | | of structures to define how they communicate with storage devices. | |
| The opaque da_addr_body field must be interpreted based on the | | The opaque da_addr_body field must be interpreted based on the | |
| specified da_layout_type field. | | specified da_layout_type field. | |
| | | | |
| This document defines the device address for the NFSv4.1 file layout | | This document defines the device address for the NFSv4.1 file layout | |
| ([[Comment.7: need xref]]), which identifies a storage device by | | ([[Comment.7: need xref]]), which identifies a storage device by | |
| network IP address and port number. This is sufficient for the | | network IP address and port number. This is sufficient for the | |
| clients to communicate with the NFSv4.1 storage devices, and may be | | clients to communicate with the NFSv4.1 storage devices, and may be | |
| sufficient for other layout types as well. Device types for object | | sufficient for other layout types as well. Device types for object | |
| storage devices and block storage devices (e.g., SCSI volume labels) | | storage devices and block storage devices (e.g., SCSI volume labels) | |
| will be defined by their respective layout specifications. | | will be defined by their respective layout specifications. | |
| | | | |
|
| 3.2.17. devlist_item4 | | 3.2.18. devlist_item4 | |
| | | | |
| struct devlist_item4 { | | struct devlist_item4 { | |
| deviceid4 dli_id; | | deviceid4 dli_id; | |
|
| device_addr4 dli_device_addr<>; | | device_addr4 dli_device_addr; | |
| }; | | }; | |
| | | | |
| An array of these values is returned by the GETDEVICELIST operation. | | An array of these values is returned by the GETDEVICELIST operation. | |
| They define the set of devices associated with a file system for the | | They define the set of devices associated with a file system for the | |
| layout type specified in the GETDEVICELIST4args. | | layout type specified in the GETDEVICELIST4args. | |
| | | | |
|
| 3.2.18. layout_content4 | | 3.2.19. layout_content4 | |
| | | | |
| struct layout_content4 { | | struct layout_content4 { | |
| layouttype4 loc_type; | | layouttype4 loc_type; | |
| opaque loc_body<>; | | opaque loc_body<>; | |
| }; | | }; | |
| | | | |
| The loc_body field must be interpreted based on the layout type | | The loc_body field must be interpreted based on the layout type | |
| (loc_type). This document defines the loc_body for the NFSv4.1 file | | (loc_type). This document defines the loc_body for the NFSv4.1 file | |
| layout type; see Section 14.3 for its definition. | | layout type; see Section 14.3 for its definition. | |
| | | | |
|
| 3.2.19. layout4 | | 3.2.20. layout4 | |
| | | | |
| struct layout4 { | | struct layout4 { | |
| offset4 lo_offset; | | offset4 lo_offset; | |
| length4 lo_length; | | length4 lo_length; | |
| layoutiomode4 lo_iomode; | | layoutiomode4 lo_iomode; | |
| layout_content4 lo_content; | | layout_content4 lo_content; | |
| }; | | }; | |
| | | | |
| The layout4 structure defines a layout for a file. The layout type | | The layout4 structure defines a layout for a file. The layout type | |
| specific data is opaque within lo_content. Since layouts are sub- | | specific data is opaque within lo_content. Since layouts are sub- | |
| dividable, the offset and length together with the file's filehandle, | | dividable, the offset and length together with the file's filehandle, | |
| the client ID, iomode, and layout type, identify the layout. | | the client ID, iomode, and layout type, identify the layout. | |
| | | | |
|
| 3.2.20. layoutupdate4 | | 3.2.21. layoutupdate4 | |
| | | | |
| struct layoutupdate4 { | | struct layoutupdate4 { | |
| layouttype4 lou_type; | | layouttype4 lou_type; | |
| opaque lou_body<>; | | opaque lou_body<>; | |
| }; | | }; | |
| | | | |
| The layoutupdate4 structure is used by the client to return 'updated' | | The layoutupdate4 structure is used by the client to return 'updated' | |
| layout information to the metadata server at LAYOUTCOMMIT time. This | | layout information to the metadata server at LAYOUTCOMMIT time. This | |
| structure provides a channel to pass layout type specific information | | structure provides a channel to pass layout type specific information | |
| (in field lou_body) back to the metadata server. E.g., for block/ | | (in field lou_body) back to the metadata server. E.g., for block/ | |
| volume layout types this could include the list of reserved blocks | | volume layout types this could include the list of reserved blocks | |
| that were written. The contents of the opaque lou_body argument are | | that were written. The contents of the opaque lou_body argument are | |
| determined by the layout type and are defined in their context. The | | determined by the layout type and are defined in their context. The | |
| NFSv4.1 file-based layout does not use this structure, thus the | | NFSv4.1 file-based layout does not use this structure, thus the | |
| lou_body field should have a zero length. | | lou_body field should have a zero length. | |
| | | | |
|
| 3.2.21. layouthint4 | | 3.2.22. layouthint4 | |
| | | | |
| struct layouthint4 { | | struct layouthint4 { | |
| layouttype4 loh_type; | | layouttype4 loh_type; | |
| opaque loh_body<>; | | opaque loh_body<>; | |
| }; | | }; | |
| | | | |
| The layouthint4 structure is used by the client to pass in a hint | | The layouthint4 structure is used by the client to pass in a hint | |
| about the type of layout it would like created for a particular file. | | about the type of layout it would like created for a particular file. | |
| It is the structure specified by the layout_hint attribute described | | It is the structure specified by the layout_hint attribute described | |
| in Section 5.13.4. The metadata server may ignore the hint, or may | | in Section 5.13.4. The metadata server may ignore the hint, or may | |
| selectively ignore fields within the hint. This hint should be | | selectively ignore fields within the hint. This hint should be | |
| provided at create time as part of the initial attributes within | | provided at create time as part of the initial attributes within | |
| OPEN. The loh_body field is specific to the type of layout | | OPEN. The loh_body field is specific to the type of layout | |
| (loh_type). The NFSv4.1 file-based layout uses the | | (loh_type). The NFSv4.1 file-based layout uses the | |
| nfsv4_1_file_layouthint4 structure as defined in Section 14.3. | | nfsv4_1_file_layouthint4 structure as defined in Section 14.3. | |
| | | | |
|
| 3.2.22. layoutiomode4 | | 3.2.23. layoutiomode4 | |
| | | | |
| enum layoutiomode4 { | | enum layoutiomode4 { | |
| LAYOUTIOMODE4_READ = 1, | | LAYOUTIOMODE4_READ = 1, | |
| LAYOUTIOMODE4_RW = 2, | | LAYOUTIOMODE4_RW = 2, | |
| LAYOUTIOMODE4_ANY = 3 | | LAYOUTIOMODE4_ANY = 3 | |
| }; | | }; | |
| | | | |
| The iomode specifies whether the client intends to read or write | | The iomode specifies whether the client intends to read or write | |
| (with the possibility of reading) the data represented by the layout. | | (with the possibility of reading) the data represented by the layout. | |
| The ANY iomode MUST NOT be used for LAYOUTGET; however, it can be | | The ANY iomode MUST NOT be used for LAYOUTGET; however, it can be | |
| used for LAYOUTRETURN and LAYOUTRECALL. The ANY iomode specifies | | used for LAYOUTRETURN and LAYOUTRECALL. The ANY iomode specifies | |
| that layouts pertaining to both READ and RW iomodes are being | | that layouts pertaining to both READ and RW iomodes are being | |
| returned or recalled, respectively. The metadata server's use of the | | returned or recalled, respectively. The metadata server's use of the | |
| iomode may depend on the layout type being used. The storage devices | | iomode may depend on the layout type being used. The storage devices | |
| may validate I/O accesses against the iomode and reject invalid | | may validate I/O accesses against the iomode and reject invalid | |
| accesses. | | accesses. | |
| | | | |
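| | | The following fragment is a non-normative illustration (it is not | |
| | | part of the protocol definition) of the kind of check a storage | |
| | | device might apply when validating an I/O access against the iomode | |
| | | of the layout under which the access is issued; the function and | |
| | | its arguments are hypothetical. | |
| | | | |
| | | enum layoutiomode4 { | |
| | |     LAYOUTIOMODE4_READ = 1, | |
| | |     LAYOUTIOMODE4_RW   = 2, | |
| | |     LAYOUTIOMODE4_ANY  = 3 | |
| | | }; | |
| | | | |
| | | /* Hypothetical helper: is this I/O permitted by the iomode of | |
| | |  * the layout it was issued under? */ | |
| | | static int io_valid_for_iomode(enum layoutiomode4 mode, int is_write) | |
| | | { | |
| | |     if (is_write) | |
| | |         return mode == LAYOUTIOMODE4_RW; | |
| | |     return mode == LAYOUTIOMODE4_READ || mode == LAYOUTIOMODE4_RW; | |
| | | } | |
| | | | |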
|
| 3.2.23. nfs_impl_id4 | | 3.2.24. nfs_impl_id4 | |
| | | | |
| struct nfs_impl_id4 { | | struct nfs_impl_id4 { | |
| utf8str_cis nii_domain; | | utf8str_cis nii_domain; | |
| utf8str_cs nii_name; | | utf8str_cs nii_name; | |
| nfstime4 nii_date; | | nfstime4 nii_date; | |
| }; | | }; | |
| | | | |
| This structure is used to identify client and server implementation | | This structure is used to identify client and server implementation | |
| details. The nii_domain field is the DNS domain name that the | | details. The nii_domain field is the DNS domain name that the | |
| implementer is associated with. The nii_name field is the product | | implementer is associated with. The nii_name field is the product | |
| name of the implementation and is completely free form. It is | | name of the implementation and is completely free form. It is | |
| recommended that the nii_name be used to distinguish machine | | recommended that the nii_name be used to distinguish machine | |
| architecture, machine platforms, revisions, versions, and patch | | architecture, machine platforms, revisions, versions, and patch | |
| levels. The nii_date field is the timestamp of when the software | | levels. The nii_date field is the timestamp of when the software | |
| instance was published or built. | | instance was published or built. | |
| | | | |
|
| 3.2.24. threshold_item4 | | 3.2.25. threshold_item4 | |
| | | | |
| struct threshold_item4 { | | struct threshold_item4 { | |
| layouttype4 thi_layout_type; | | layouttype4 thi_layout_type; | |
| bitmap4 thi_hintset; | | bitmap4 thi_hintset; | |
| opaque thi_hintlist<>; | | opaque thi_hintlist<>; | |
| }; | | }; | |
| | | | |
| This structure contains a list of hints specific to a layout type for | | This structure contains a list of hints specific to a layout type for | |
| helping the client determine when it should issue I/O directly | | helping the client determine when it should issue I/O directly | |
| through the metadata server vs. the data servers. The hint structure | | through the metadata server vs. the data servers. The hint structure | |
| | | | |
| skipping to change at page 83, line 45 | | skipping to change at page 84, line 45 | |
| | threshold4_read_iosize | 2 | length4 | For read I/O sizes below | | | | threshold4_read_iosize | 2 | length4 | For read I/O sizes below | | |
| | | | | this threshold it is | | | | | | | this threshold it is | | |
| | | | | recommended to read data | | | | | | | recommended to read data | | |
| | | | | through the MDS | | | | | | | through the MDS | | |
| | threshold4_write_iosize | 3 | length4 | For write I/O sizes below | | | | threshold4_write_iosize | 3 | length4 | For write I/O sizes below | | |
| | | | | this threshold it is | | | | | | | this threshold it is | | |
| | | | | recommended to write data | | | | | | | recommended to write data | | |
| | | | | through the MDS | | | | | | | through the MDS | | |
| +-------------------------+---+---------+---------------------------+ | | +-------------------------+---+---------+---------------------------+ | |
| | | | |
|
| 3.2.25. mdsthreshold4 | | 3.2.26. mdsthreshold4 | |
| | | | |
| struct mdsthreshold4 { | | struct mdsthreshold4 { | |
| threshold_item4 mth_hints<>; | | threshold_item4 mth_hints<>; | |
| }; | | }; | |
| | | | |
| This structure holds an array of threshold_item4 structures each of | | This structure holds an array of threshold_item4 structures each of | |
| which is valid for a particular layout type. An array is necessary | | which is valid for a particular layout type. An array is necessary | |
| since a server can support multiple layout types for a single file. | | since a server can support multiple layout types for a single file. | |
| | | | |
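| | | As a non-normative illustration of how a client might use these | |
| | | hints, the sketch below assumes that the client has already located | |
| | | the threshold_item4 for the file's layout type and XDR-decoded the | |
| | | read and write I/O size hints into a local structure; all of the | |
| | | names below are hypothetical. | |
| | | | |
| | | #include <stdbool.h> | |
| | | #include <stdint.h> | |
| | | | |
| | | /* Hypothetical, already-decoded view of one threshold_item4. */ | |
| | | struct decoded_thresholds { | |
| | |     bool     have_read_iosize; | |
| | |     uint64_t read_iosize;    /* threshold4_read_iosize  */ | |
| | |     bool     have_write_iosize; | |
| | |     uint64_t write_iosize;   /* threshold4_write_iosize */ | |
| | | }; | |
| | | | |
| | | /* Reads below the hinted size are recommended to go through the | |
| | |  * metadata server rather than through the data servers. */ | |
| | | static bool send_read_via_mds(const struct decoded_thresholds *t, | |
| | |                               uint64_t count) | |
| | | { | |
| | |     return t->have_read_iosize && count < t->read_iosize; | |
| | | } | |
| | | | |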
| 4. Filehandles | | 4. Filehandles | |
| | | | |
| skipping to change at page 92, line 27 | | skipping to change at page 93, line 27 | |
| | | | |
| lease_time | | lease_time | |
| | | | |
| o The per file system attributes are: | | o The per file system attributes are: | |
| | | | |
| supp_attr, fh_expire_type, link_support, symlink_support, | | supp_attr, fh_expire_type, link_support, symlink_support, | |
| unique_handles, aclsupport, cansettime, case_insensitive, | | unique_handles, aclsupport, cansettime, case_insensitive, | |
| case_preserving, chown_restricted, files_avail, files_free, | | case_preserving, chown_restricted, files_avail, files_free, | |
| files_total, fs_locations, homogeneous, maxfilesize, maxname, | | files_total, fs_locations, homogeneous, maxfilesize, maxname, | |
| maxread, maxwrite, no_trunc, space_avail, space_free, | | maxread, maxwrite, no_trunc, space_avail, space_free, | |
|
| space_total, time_delta, fs_status, fs_layout_type, | | space_total, time_delta, change_policy, fs_status, | |
| fs_locations_info | | fs_layout_type, fs_locations_info | |
| | | | |
| o The per file system object attributes are: | | o The per file system object attributes are: | |
| | | | |
| type, change, size, named_attr, fsid, rdattr_error, filehandle, | | type, change, size, named_attr, fsid, rdattr_error, filehandle, | |
| ACL, archive, fileid, hidden, maxlink, mimetype, mode, | | ACL, archive, fileid, hidden, maxlink, mimetype, mode, | |
| numlinks, owner, owner_group, rawdev, space_used, system, | | numlinks, owner, owner_group, rawdev, space_used, system, | |
| time_access, time_backup, time_create, time_metadata, | | time_access, time_backup, time_create, time_metadata, | |
| time_modify, mounted_on_fileid, dir_notif_delay, | | time_modify, mounted_on_fileid, dir_notif_delay, | |
| dirent_notif_delay, dacl, sacl, layout_type, layout_hint, | | dirent_notif_delay, dacl, sacl, layout_type, layout_hint, | |
| layout_blksize, layout_alignment, mdsthreshold, retention_get, | | layout_blksize, layout_alignment, mdsthreshold, retention_get, | |
| | | | |
| skipping to change at page 95, line 41 | | skipping to change at page 97, line 4 | |
| | | | | | specified in a | | | | | | | | specified in a | | |
| | | | | | SETATTR | | | | | | | | SETATTR | | |
| | | | | | operation. | | | | | | | | operation. | | |
| | case_insensitive | 16 | bool | READ | True, if | | | | case_insensitive | 16 | bool | READ | True, if | | |
| | | | | | filename | | | | | | | | filename | | |
| | | | | | comparisons on | | | | | | | | comparisons on | | |
| | | | | | this file | | | | | | | | this file | | |
| | | | | | system are | | | | | | | | system are | | |
| | | | | | case | | | | | | | | case | | |
| | | | | | insensitive. | | | | | | | | insensitive. | | |
|
| | | | change_policy | 60 | chg_policy4 | READ | A value | | |
| | | | | | | | created by the | | |
| | | | | | | | server that | | |
| | | | | | | | the client can | | |
| | | | | | | | use to | | |
| | | | | | | | determine if | | |
| | | | | | | | some server | | |
| | | | | | | | policy related | | |
| | | | | | | | to the current | | |
| | | | | | | | filesystem has | | |
| | | | | | | | been subject | | |
| | | | | | | | to change. If | | |
| | | | | | | | the value | | |
| | | | | | | | remains the | | |
| | | | | | | | same then the | | |
| | | | | | | | client can be | | |
| | | | | | | | sure that the | | |
| | | | | | | | values of the | | |
| | | | | | | | attributes | | |
| | | | | | | | related to fs | | |
| | | | | | | | location and | | |
| | | | | | | | the | | |
| | | | | | | | fsstat_type | | |
| | | | | | | | field of the | | |
| | | | | | | | fs_status | | |
| | | | | | | | attribute have | | |
| | | | | | | | not changed. | | |
| | | | | | | | See | | |
| | | | | | | | Section 3.2.6 | | |
| | | | | | | | for details. | | |
| | case_preserving | 17 | bool | READ | True, if | | | | case_preserving | 17 | bool | READ | True, if | | |
| | | | | | filename case | | | | | | | | filename case | | |
| | | | | | on this file | | | | | | | | on this file | | |
| | | | | system is | | | | | | | | system is | | |
| | | | | | preserved. | | | | | | | | preserved. | | |
| | chown_restricted | 18 | bool | READ | If TRUE, the | | | | chown_restricted | 18 | bool | READ | If TRUE, the | | |
| | | | | | server will | | | | | | | | server will | | |
| | | | | | reject any | | | | | | | | reject any | | |
| | | | | | request to | | | | | | | | request to | | |
| | | | | | change either | | | | | | | | change either | | |
| | | | |
| skipping to change at page 96, line 35 | | skipping to change at page 98, line 35 | |
| | dacl | 58 | nfsacl41 | R/W | Access Control | | | | dacl | 58 | nfsacl41 | R/W | Access Control | | |
| | | | | | List used for | | | | | | | | List used for | | |
| | | | | | determining | | | | | | | | determining | | |
| | | | | | access to file | | | | | | | | access to file | | |
| | | | | | system | | | | | | | | system | | |
| | | | | | objects. | | | | | | | | objects. | | |
| | dir_notif_delay | 56 | nfstime4 | READ | notification | | | | dir_notif_delay | 56 | nfstime4 | READ | notification | | |
| | | | | | delays on | | | | | | | | delays on | | |
| | | | | | directory | | | | | | | | directory | | |
| | | | | | attributes | | | | | | | | attributes | | |
|
| | dirent_ | 57 | nfstime4 | READ | notification | | | | dirent_notif_dela | 57 | nfstime4 | READ | notification | | |
| | notif_delay | | | | delays on | | | | y | | | | delays on | | |
| | | | | | child | | | | | | | | child | | |
| | | | | | attributes | | | | | | | | attributes | | |
| | fileid | 20 | uint64 | READ | A number | | | | fileid | 20 | uint64 | READ | A number | | |
| | | | | | uniquely | | | | | | | | uniquely | | |
| | | | | | identifying | | | | | | | | identifying | | |
| | | | | | the file | | | | | | | | the file | | |
| | | | | | within the | | | | | | | | within the | | |
| | | | | | file system. | | | | | | | | file system. | | |
| | files_avail | 21 | uint64 | READ | File slots | | | | files_avail | 21 | uint64 | READ | File slots | | |
| | | | | | available to | | | | | | | | available to | | |
| | | | |
| skipping to change at page 97, line 29 | | skipping to change at page 99, line 29 | |
| | | | | | this object - | | | | | | | | this object - | | |
| | | | | | this should be | | | | | | | | this should be | | |
| | | | | | the smallest | | | | | | | | the smallest | | |
| | | | | | relevant | | | | | | | | relevant | | |
| | | | | | limit. | | | | | | | | limit. | | |
| | files_total | 23 | uint64 | READ | Total file | | | | files_total | 23 | uint64 | READ | Total file | | |
| | | | | | slots on the | | | | | | | | slots on the | | |
| | | | | | file system | | | | | | | | file system | | |
| | | | | | containing | | | | | | | | containing | | |
| | | | | | this object. | | | | | | | | this object. | | |
|
| | fs_absent | 60 | bool | READ | Is current | | | | |
| | | | | | file system | | | | |
| | | | | | present or | | | | |
| | | | | | absent. | | | | |
| | fs_layout_type | 62 | layouttype4<> | READ | Layout types | | | | fs_layout_type | 62 | layouttype4<> | READ | Layout types | | |
| | | | | | available for | | | | | | | | available for | | |
| | | | | | the file | | | | | | | | the file | | |
| | | | | | system. | | | | | | | | system. | | |
| | fs_locations | 24 | fs_locations | READ | Locations | | | | fs_locations | 24 | fs_locations | READ | Locations | | |
| | | | | | where this | | | | | | | | where this | | |
| | | | | | file system | | | | | | | | file system | | |
| | | | | | may be found. | | | | | | | | may be found. | | |
| | | | | | If the server | | | | | | | | If the server | | |
| | | | | | returns | | | | | | | | returns | | |
| | | | |
| skipping to change at page 109, line 38 | | skipping to change at page 111, line 38 | |
| | | | |
| The dirent_notif_delay attribute is the minimum number of seconds the | | The dirent_notif_delay attribute is the minimum number of seconds the | |
| server will delay before notifying the client of a change to a file | | server will delay before notifying the client of a change to a file | |
| object that has an entry in the directory. | | object that has an entry in the directory. | |
| | | | |
| 5.13. PNFS Attributes | | 5.13. PNFS Attributes | |
| | | | |
| 5.13.1. fs_layout_type | | 5.13.1. fs_layout_type | |
| | | | |
| The fs_layout_type attribute (data type layouttype4, see | | The fs_layout_type attribute (data type layouttype4, see | |
|
| Section 3.2.14) applies to a file system and indicates what layout | | Section 3.2.15) applies to a file system and indicates what layout | |
| types are supported by the file system. This attribute is expected | | types are supported by the file system. This attribute is expected | |
| be queried when a client encounters a new fsid. This attribute is | | be queried when a client encounters a new fsid. This attribute is | |
| used by the client to determine if it supports the layout type. | | used by the client to determine if it supports the layout type. | |
| | | | |
| 5.13.2. layout_alignment | | 5.13.2. layout_alignment | |
| | | | |
| The layout_alignment attribute indicates the preferred alignment for | | The layout_alignment attribute indicates the preferred alignment for | |
| I/O to files on the file system the client has layouts for. Where | | I/O to files on the file system the client has layouts for. Where | |
| possible, the client should issue READ and WRITE operations with | | possible, the client should issue READ and WRITE operations with | |
| offsets that are whole multiples of the layout_alignment attribute. | | offsets that are whole multiples of the layout_alignment attribute. | |
| | | | |
| skipping to change at page 110, line 16 | | skipping to change at page 112, line 16 | |
| | | | |
| The layout_blksize attribute indicates the preferred block size for | | The layout_blksize attribute indicates the preferred block size for | |
| I/O to files on the file system the client has layouts for. Where | | I/O to files on the file system the client has layouts for. Where | |
| possible, the client should issue READ operations with a count | | possible, the client should issue READ operations with a count | |
| argument that is a whole multiple of layout_blksize, and WRITE | | argument that is a whole multiple of layout_blksize, and WRITE | |
| operations with a data argument of size that is a whole multiple of | | operations with a data argument of size that is a whole multiple of | |
| layout_blksize. | | layout_blksize. | |
| | | | |
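| | | The following non-normative sketch shows one way a client might | |
| | | shape a READ so that its offset is a whole multiple of | |
| | | layout_alignment and its count a whole multiple of layout_blksize; | |
| | | the helper and its arguments are illustrative only. | |
| | | | |
| | | #include <stdint.h> | |
| | | | |
| | | /* Hypothetical helper: widen [*offset, *offset + *count) so that | |
| | |  * the offset is aligned and the count is a whole number of | |
| | |  * blocks. */ | |
| | | static void shape_read(uint64_t *offset, uint64_t *count, | |
| | |                        uint64_t alignment, uint64_t blksize) | |
| | | { | |
| | |     uint64_t end = *offset + *count; | |
| | | | |
| | |     if (alignment) | |
| | |         *offset -= *offset % alignment; | |
| | |     *count = end - *offset; | |
| | |     if (blksize && *count % blksize) | |
| | |         *count += blksize - (*count % blksize); | |
| | | } | |
| | | | |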
| 5.13.4. layout_hint | | 5.13.4. layout_hint | |
| | | | |
|
| The layout_hint attribute (data type layouthint4, see Section 3.2.21) | | The layout_hint attribute (data type layouthint4, see Section 3.2.22) | |
| may be set on newly created files to influence the metadata server's | | may be set on newly created files to influence the metadata server's | |
| choice for the file's layout. It is suggested that this attribute is | | choice for the file's layout. It is suggested that this attribute is | |
| set as one of the initial attributes within the OPEN call. The | | set as one of the initial attributes within the OPEN call. The | |
| metadata server may ignore this attribute. This attribute is a sub- | | metadata server may ignore this attribute. This attribute is a sub- | |
| set of the layout structure returned by LAYOUTGET. For example, | | set of the layout structure returned by LAYOUTGET. For example, | |
| instead of specifying particular devices, this would be used to | | instead of specifying particular devices, this would be used to | |
| suggest the stripe width of a file. It is up to the server | | suggest the stripe width of a file. It is up to the server | |
| implementation to determine which fields within the layout it uses. | | implementation to determine which fields within the layout it uses. | |
| | | | |
| 5.13.5. layout_type | | 5.13.5. layout_type | |
| | | | |
| skipping to change at page 138, line 29 | | skipping to change at page 140, line 29 | |
| | | | |
| 7.1. Server Exports | | 7.1. Server Exports | |
| | | | |
| On a UNIX server, the namespace describes all the files reachable by | | On a UNIX server, the namespace describes all the files reachable by | |
| pathnames under the root directory or "/". On a Windows NT server | | pathnames under the root directory or "/". On a Windows NT server | |
| the namespace constitutes all the files on disks named by mapped disk | | the namespace constitutes all the files on disks named by mapped disk | |
| letters. NFS server administrators rarely make the entire server's | | letters. NFS server administrators rarely make the entire server's | |
| file system namespace available to NFS clients. More often portions | | file system namespace available to NFS clients. More often portions | |
| of the namespace are made available via an "export" feature. In | | of the namespace are made available via an "export" feature. In | |
| previous versions of the NFS protocol, the root filehandle for each | | previous versions of the NFS protocol, the root filehandle for each | |
|
| export is obtained through the MOUNT protocol; the client sends a | | export is obtained through the MOUNT protocol; the client sent a | |
| string that identifies the export of namespace and the server returns | | string that identified the export name within the namespace and the | |
| the root filehandle for it. The MOUNT protocol supports an EXPORTS | | server returned the root filehandle for that export. The MOUNT | |
| procedure that will enumerate the server's exports. | | protocol also provided an EXPORTS procedure that enumerated the | |
| | | server's exports. | |
| | | | |
| 7.2. Browsing Exports | | 7.2. Browsing Exports | |
| | | | |
| The NFS version 4 protocol provides a root filehandle that clients | | The NFS version 4 protocol provides a root filehandle that clients | |
| can use to obtain filehandles for the exports of a particular server, | | can use to obtain filehandles for the exports of a particular server, | |
| via a series of LOOKUP operations within a COMPOUND, to traverse a | | via a series of LOOKUP operations within a COMPOUND, to traverse a | |
| path. A common user experience is to use a graphical user interface | | path. A common user experience is to use a graphical user interface | |
| (perhaps a file "Open" dialog window) to find a file via progressive | | (perhaps a file "Open" dialog window) to find a file via progressive | |
| browsing through a directory tree. The client must be able to move | | browsing through a directory tree. The client must be able to move | |
| from one export to another export via single-component, progressive | | from one export to another export via single-component, progressive | |
| LOOKUP operations. | | LOOKUP operations. | |
| | | | |
| This style of browsing is not well supported by the NFS version 2 and | | This style of browsing is not well supported by the NFS version 2 and | |
|
| 3 protocols. The client expects all LOOKUP operations to remain | | 3 protocols. In these versions of NFS, the client expects all LOOKUP | |
| within a single server file system. For example, the device | | operations to remain within a single server file system. For | |
| attribute will not change. This prevents a client from taking | | example, the device attribute will not change. This prevents a | |
| namespace paths that span exports. | | client from taking namespace paths that span exports. | |
| | | | |
|
| In the case of Veriosn 2 and 3, an automounter on the client can | | In the case of Versions 2 and 3, an automounter on the client can | |
| obtain a snapshot of the server's namespace using the EXPORTS | | obtain a snapshot of the server's namespace using the EXPORTS | |
| procedure of the MOUNT protocol. If it understands the server's | | procedure of the MOUNT protocol. If it understands the server's | |
| pathname syntax, it can create an image of the server's namespace on | | pathname syntax, it can create an image of the server's namespace on | |
| the client. The parts of the namespace that are not exported by the | | the client. The parts of the namespace that are not exported by the | |
| server are filled in with directories that might be arranged similarly | | server are filled in with directories that might be arranged similarly | |
| to a version 4 "pseudo file system" that allows the user to browse | | to a version 4 "pseudo file system" that allows the user to browse | |
| from one mounted file system to another. There is a drawback to this | | from one mounted file system to another. There is a drawback to this | |
| representation of the server's namespace on the client: it is static. | | representation of the server's namespace on the client: it is static. | |
| If the server administrator adds a new export the client will be | | If the server administrator adds a new export the client will be | |
| unaware of it. | | unaware of it. | |
| | | | |
| skipping to change at page 139, line 31 | | skipping to change at page 141, line 31 | |
| a single namespace, for that server. An NFS version 4 client uses | | a single namespace, for that server. An NFS version 4 client uses | |
| LOOKUP and READDIR operations to browse seamlessly from one export to | | LOOKUP and READDIR operations to browse seamlessly from one export to | |
| another. | | another. | |
| | | | |
| Where there are portions of the server namespace that are not | | Where there are portions of the server namespace that are not | |
| exported, clients require some way of traversing those portions to | | exported, clients require some way of traversing those portions to | |
| reach actual exported file systems. A technique that servers may use | | reach actual exported file systems. A technique that servers may use | |
| to provide for this is to bridge the unexported portions of the namespace | | to provide for this is to bridge the unexported portions of the namespace | |
| via a "pseudo file system" that provides a view of exported | | via a "pseudo file system" that provides a view of exported | |
| directories only. A pseudo file system has a unique fsid and behaves | | directories only. A pseudo file system has a unique fsid and behaves | |
|
| like a normal, read only file system. | | like a normal, read-only file system. | |
| | | | |
| Based on the construction of the server's namespace, it is possible | | Based on the construction of the server's namespace, it is possible | |
| that multiple pseudo file systems may exist. For example, | | that multiple pseudo file systems may exist. For example, | |
| | | | |
| /a pseudo file system | | /a pseudo file system | |
| /a/b real file system | | /a/b real file system | |
| /a/b/c pseudo file system | | /a/b/c pseudo file system | |
| /a/b/c/d real file system | | /a/b/c/d real file system | |
| | | | |
|
| Each of the pseudo file systems are considered separate entities and | | Each of the pseudo file systems is considered a separate entity and | |
| therefore MUST have its own unique fsid. | | therefore MUST have its own fsid, unique among all the fsids for that | |
| | | server. | |
| | | | |
| 7.4. Multiple Roots | | 7.4. Multiple Roots | |
| | | | |
| Certain operating environments are sometimes described as having | | Certain operating environments are sometimes described as having | |
| "multiple roots". In such environments individual file systems are | | "multiple roots". In such environments individual file systems are | |
| commonly represented by disk or volume names. NFS version 4 servers | | commonly represented by disk or volume names. NFS version 4 servers | |
| for these platforms can construct a pseudo file system above these | | for these platforms can construct a pseudo file system above these | |
| root names so that disk letters or volume names are simply directory | | root names so that disk letters or volume names are simply directory | |
| names in the pseudo root. | | names in the pseudo root. | |
| | | | |
| | | | |
| skipping to change at page 140, line 22 | | skipping to change at page 142, line 22 | |
| which persistent filehandles could be constructed. Even though it is | | which persistent filehandles could be constructed. Even though it is | |
| preferable that the server provide persistent filehandles for the | | preferable that the server provide persistent filehandles for the | |
| pseudo file system, the NFS client should expect that pseudo file | | pseudo file system, the NFS client should expect that pseudo file | |
| system filehandles are volatile. This can be confirmed by checking | | system filehandles are volatile. This can be confirmed by checking | |
| the associated "fh_expire_type" attribute for those filehandles in | | the associated "fh_expire_type" attribute for those filehandles in | |
| question. If the filehandles are volatile, the NFS client must be | | question. If the filehandles are volatile, the NFS client must be | |
| prepared to recover a filehandle value (e.g. with a series of LOOKUP | | prepared to recover a filehandle value (e.g. with a series of LOOKUP | |
| operations) when receiving an error of NFS4ERR_FHEXPIRED. | | operations) when receiving an error of NFS4ERR_FHEXPIRED. | |
| | | | |
| Because it is quite likely that servers will implement pseudo file | | Because it is quite likely that servers will implement pseudo file | |
|
| systems using volative filehandles, clients need to be prepared for | | systems using volatile filehandles, clients need to be prepared for | |
| them, rather than assuming that all filehandles will be persistent. | | them, rather than assuming that all filehandles will be persistent. | |
| | | | |
| 7.6. Exported Root | | 7.6. Exported Root | |
| | | | |
| If the server's root file system is exported, one might conclude that | | If the server's root file system is exported, one might conclude that | |
|
| a pseudo-file system is unneeded. This is not necessarily so. Assume | | a pseudo file system is unneeded. This is not necessarily so. Assume | |
| the following file systems on a server: | | the following file systems on a server: | |
| | | | |
| / fs1 (exported) | | / fs1 (exported) | |
| /a fs2 (not exported) | | /a fs2 (not exported) | |
| /a/b fs3 (exported) | | /a/b fs3 (exported) | |
| | | | |
| Because fs2 is not exported, fs3 cannot be reached with simple | | Because fs2 is not exported, fs3 cannot be reached with simple | |
|
| LOOKUPs. The server must bridge the gap with a pseudo-file system. | | LOOKUPs. The server must bridge the gap with a pseudo file system. | |
| | | | |
| 7.7. Mount Point Crossing | | 7.7. Mount Point Crossing | |
| | | | |
| The server file system environment may be constructed in such a way | | The server file system environment may be constructed in such a way | |
| that one file system contains a directory which is 'covered' or | | that one file system contains a directory which is 'covered' or | |
| mounted upon by a second file system. For example: | | mounted upon by a second file system. For example: | |
| | | | |
| /a/b (file system 1) | | /a/b (file system 1) | |
| /a/b/c/d (file system 2) | | /a/b/c/d (file system 2) | |
| | | | |
| The pseudo file system for this server may be constructed to look | | The pseudo file system for this server may be constructed to look | |
| like: | | like: | |
| | | | |
| / (place holder/not exported) | | / (place holder/not exported) | |
| /a/b (file system 1) | | /a/b (file system 1) | |
| /a/b/c/d (file system 2) | | /a/b/c/d (file system 2) | |
| | | | |
| It is the server's responsibility to present a complete pseudo file | | It is the server's responsibility to present a complete pseudo file | |
| system to the client. If the client sends a lookup request | | system to the client. If the client sends a lookup request | |
| for the path "/a/b/c/d", the server's response is the filehandle of | | for the path "/a/b/c/d", the server's response is the filehandle of | |
|
| the file system "/a/b/c/d". In previous versions of the NFS | | the root of the file system "/a/b/c/d". In previous versions of the | |
| protocol, the server would respond with the filehandle of directory | | NFS protocol, the server would respond with the filehandle of | |
| "/a/b/c/d" within the file system "/a/b". | | directory "/a/b/c/d" within the file system "/a/b". | |
| | | | |
| The NFS client will be able to determine if it crosses a server mount | | The NFS client will be able to determine if it crosses a server mount | |
| point by a change in the value of the "fsid" attribute. | | point by a change in the value of the "fsid" attribute. | |
| | | | |
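| | | As a non-normative illustration, a client might detect such a | |
| | | crossing by comparing the fsid obtained for the parent directory | |
| | | with the fsid obtained for the looked-up object, as sketched below; | |
| | | the structure mirrors the major/minor form of the fsid attribute | |
| | | and the helper name is hypothetical. | |
| | | | |
| | | #include <stdbool.h> | |
| | | #include <stdint.h> | |
| | | | |
| | | struct fsid4 { | |
| | |     uint64_t major; | |
| | |     uint64_t minor; | |
| | | }; | |
| | | | |
| | | /* Hypothetical helper: true if the LOOKUP crossed into another | |
| | |  * file system (and thus possibly into another export). */ | |
| | | static bool crossed_mount_point(struct fsid4 parent, struct fsid4 child) | |
| | | { | |
| | |     return parent.major != child.major || parent.minor != child.minor; | |
| | | } | |
| | | | |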
| 7.8. Security Policy and Namespace Presentation | | 7.8. Security Policy and Namespace Presentation | |
| | | | |
| Because NFSv4 clients possess the ability to change the security | | Because NFSv4 clients possess the ability to change the security | |
| mechanisms used, after determining what is allowed, by using SECINFO | | mechanisms used, after determining what is allowed, by using SECINFO | |
| and SECINFO_NONAME, the server SHOULD NOT present a different view of | | and SECINFO_NONAME, the server SHOULD NOT present a different view of | |
| the namespace based on the security mechanism being used by a client. | | the namespace based on the security mechanism being used by a client. | |
| Instead, it should present a consistent view and return | | Instead, it should present a consistent view and return | |
| NFS4ERR_WRONGSEC if an attempt is made to access data with an | | NFS4ERR_WRONGSEC if an attempt is made to access data with an | |
| inappropriate security mechanism. | | inappropriate security mechanism. | |
| | | | |
| If security considerations make it necessary to hide the existence of | | If security considerations make it necessary to hide the existence of | |
| a particular file system, as opposed to all of the data within it, | | a particular file system, as opposed to all of the data within it, | |
| the server can apply the security policy of a shared resource in the | | the server can apply the security policy of a shared resource in the | |
| server's namespace to components of the resource's ancestors. For | | server's namespace to components of the resource's ancestors. For | |
| example: | | example: | |
| | | | |
|
| / | | / (place holder/not exported) | |
| /a/b | | /a/b (file system 1) | |
| /a/b/MySecretProject | | /a/b/MySecretProject (file system 2) | |
| | | | |
| The /a/b/MySecretProject directory is a real file system and is the | | The /a/b/MySecretProject directory is a real file system and is the | |
| shared resource. Suppose the security policy for /a/b/ | | shared resource. Suppose the security policy for /a/b/ | |
| MySecretProject is Kerberos with integrity and it is desired that | | MySecretProject is Kerberos with integrity and it is desired that | |
| knowledge of the existence of this file system be very limited. In | | knowledge of the existence of this file system be very limited. In | |
| this case the server should apply the same security policy to /a/b. | | this case the server should apply the same security policy to /a/b. | |
| This allows knowledge of the existence of a filesystem to be secured | | This allows knowledge of the existence of a filesystem to be secured | |
| in cases where this is desirable. | | in cases where this is desirable. | |
| | | | |
| For the case of the use of multiple, disjoint security mechanisms in | | For the case of the use of multiple, disjoint security mechanisms in | |
|
| the server's resources, the security for a particular object in the | | the server's resources, applying that sort of policy would result in | |
| server's namespace should be the union of all security mechanisms of | | the higher-level file system not being accessible using any security | |
| all direct descendants. A common and convenient practice, unless | | flavor, which would make that higher-level file system | |
| strong security requirements dictate otherwise, is to make all of the | | inaccessible. Therefore, that sort of configuration is not | |
| pseudo file system accessible by all of the valid security | | compatible with hiding the existence (as opposed to the contents) | |
| mechanisms. | | from clients using multiple disjoint sets of security flavors. | |
| | | | |
| | | In other circumstances, a desirable policy is for the security of a | |
| | | particular object in the server's namespace to include the union | |
| | | of all security mechanisms of all direct descendants. A common and | |
| | | convenient practice, unless strong security requirements dictate | |
| | | otherwise, is to make all of the pseudo file system accessible by all | |
| | | of the valid security mechanisms. | |
| | | | |
| Where there is concern about the security of data on the wire, | | Where there is concern about the security of data on the wire, | |
| clients should use strong security mechanisms to access the pseudo | | clients should use strong security mechanisms to access the pseudo | |
| file system in order to prevent man-in-the-middle attacks from | | file system in order to prevent man-in-the-middle attacks from | |
|
| directing LOOKUP's within the pseudo-fs from compromising the | | directing LOOKUPs within the pseudo file system from compromising the | |
| existence of sensitive data, or getting access to data that the | | existence of sensitive data, or getting access to data that the | |
| client is sending by directing the client to send it using weak | | client is sending by directing the client to send it using weak | |
| security mechanisms. | | security mechanisms. | |
| | | | |
| 8. State Management | | 8. State Management | |
| | | | |
| Integrating locking into the NFS protocol necessarily causes it to be | | Integrating locking into the NFS protocol necessarily causes it to be | |
| stateful. With the inclusion of such features as share reservations, | | stateful. With the inclusion of such features as share reservations, | |
| file and directory delegations, recallable layouts, and support for | | file and directory delegations, recallable layouts, and support for | |
| mandatory record locking, the protocol becomes substantially more | | mandatory record locking, the protocol becomes substantially more | |
| | | | |
| skipping to change at page 144, line 37 | | skipping to change at page 146, line 46 | |
| | | | |
| Stateids are divided into two fields, a 96-bit "other" field | | Stateids are divided into two fields, a 96-bit "other" field | |
| identifying the specific set of locks and a 32-bit "seqid" sequence | | identifying the specific set of locks and a 32-bit "seqid" sequence | |
| value. Except in the case of special stateids, to be discussed | | value. Except in the case of special stateids, to be discussed | |
| below, a particular value of the "other" field denotes a set of locks | | below, a particular value of the "other" field denotes a set of locks | |
| of the same type (for example byte-range locks, opens, delegations, or | | of the same type (for example byte-range locks, opens, delegations, or | |
| layouts), for a specific file or directory, and sharing the same | | layouts), for a specific file or directory, and sharing the same | |
| ownership characteristics. The seqid designates a specific instance | | ownership characteristics. The seqid designates a specific instance | |
| of such a set of locks, and is incremented to indicate changes in | | of such a set of locks, and is incremented to indicate changes in | |
| such a set of locks, either by the addition or deletion of locks from | | such a set of locks, either by the addition or deletion of locks from | |
|
| the, a change in the byte-range they apply to, or an upgrade or | | the set, a change in the byte-range they apply to, or an upgrade or | |
| downgrade in the type of one or more locks. | | downgrade in the type of one or more locks. | |
| | | | |
| When such a set of locks is first created the server returns a | | When such a set of locks is first created the server returns a | |
| stateid with seqid value of one. On subsequent operations which | | stateid with seqid value of one. On subsequent operations which | |
| modify the set of locks the server is required to increment the seqid | | modify the set of locks the server is required to increment the seqid | |
| field by one (1) whenever it returns a stateid for the same state | | field by one (1) whenever it returns a stateid for the same state | |
| owner/file/type combination and there is some change in the set of | | owner/file/type combination and there is some change in the set of | |
| locks actually designated. In this case the server will return a | | locks actually designated. In this case the server will return a | |
| stateid with an "other" field the same as previously used for that | | stateid with an "other" field the same as previously used for that | |
| state owner/file/type combination, with an incremented seqid field. | | state owner/file/type combination, with an incremented seqid field. | |
| | | | |
| The purpose of the incrementing of the seqid is to allow the replier | | The purpose of the incrementing of the seqid is to allow the replier | |
| to communicate to the requester the order in which operations that | | to communicate to the requester the order in which operations that | |
| modified locking state associated with a stateid have been processed | | modified locking state associated with a stateid have been processed | |
| and to make it possible for the client to issue requests that are | | and to make it possible for the client to issue requests that are | |
| conditional on the set of locks not having changed since the stateid | | conditional on the set of locks not having changed since the stateid | |
| in question was returned. | | in question was returned. | |
| | | | |
|
| When stateids are sent to the server by the client, it has two | | When a client sends a stateid to the server, it has two choices with | |
| choices with regard to the seqid sent. It may set the seqid to zero | | regard to the seqid sent. It may set the seqid to zero to indicate | |
| to indicate to the server that it wishes the most up-to-date seqid | | to the server that it wishes the most up-to-date seqid for that | |
| for that stateid's "other" field to be used. This would be the | | stateid's "other" field to be used. This would be the common choice | |
| common choice in the case of stateid sent with a READ or WRITE | | in the case of a stateid sent with a READ or WRITE operation. It | |
| operation. It also may set a non-zero value in which case the server | | also may set a non-zero value in which case the server checks if that | |
| checks if that seqid is the correct one. In that case the server is | | seqid is the correct one. In that case the server is required to | |
| required to return NFS4ERR_OLD_STATEID if the seqid is lower than the | | return NFS4ERR_OLD_STATEID if the seqid is lower than the most | |
| most current value and NFS4ERR_BAD_STATEID if the seqid is greater | | current value and NFS4ERR_BAD_STATEID if the seqid is greater than | |
| than the most current value. This would be the common choice in the | | the most current value. This would be the common choice in the case | |
| case if stateids sent with a CLOSE or OPEN_DOWNGRADE. Because OPENs | | of stateids sent with a CLOSE or OPEN_DOWNGRADE. Because OPENs may | |
| may be sent in parallel for the same owner, a client might close a | | be sent in parallel for the same owner, a client might close a file | |
| file without knowing that an OPEN upgrade had been done by the | | without knowing that an OPEN upgrade had been done by the server, | |
| server, changing the lock in question. If CLOSE were sent with a | | changing the lock in question. If CLOSE were sent with a zero seqid, | |
| zero seqid, the OPEN upgrade would be canceled before the client even | | the OPEN upgrade would be canceled before the client even received an | |
| received an indication that it had happened. | | indication that an upgrade had happened. | |
| | | | |
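| | | The sketch below is a non-normative illustration of the seqid | |
| | | comparison described above, from the server's point of view; it | |
| | | ignores seqid wraparound for brevity, and the function name is | |
| | | hypothetical. | |
| | | | |
| | | #include <stdint.h> | |
| | | | |
| | | #define NFS4_OK                 0 | |
| | | #define NFS4ERR_OLD_STATEID 10024 | |
| | | #define NFS4ERR_BAD_STATEID 10025 | |
| | | | |
| | | /* arg is the seqid the client sent; current is the server's most | |
| | |  * recent seqid for that stateid's "other" field. */ | |
| | | static int check_seqid(uint32_t arg, uint32_t current) | |
| | | { | |
| | |     if (arg == 0)      /* "use the most up-to-date seqid" */ | |
| | |         return NFS4_OK; | |
| | |     if (arg < current) | |
| | |         return NFS4ERR_OLD_STATEID; | |
| | |     if (arg > current) | |
| | |         return NFS4ERR_BAD_STATEID; | |
| | |     return NFS4_OK; | |
| | | } | |
| | | | |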
| 8.2.3. Special Stateids | | 8.2.3. Special Stateids | |
| | | | |
| Stateid values whose "other" field is either all zeros or all ones | | Stateid values whose "other" field is either all zeros or all ones | |
| are reserved. They may not be assigned by the server but have | | are reserved. They may not be assigned by the server but have | |
| special meanings defined by the protocol. The particular meaning | | special meanings defined by the protocol. The particular meaning | |
| depends on whether the "other" field is all zeros or all ones and the | | depends on whether the "other" field is all zeros or all ones and the | |
| specific value of the "seqid" field. | | specific value of the "seqid" field. | |
| | | | |
| The following combinations of "other" and "seqid" are defined in | | The following combinations of "other" and "seqid" are defined in | |
| | | | |
| skipping to change at page 147, line 37 | | skipping to change at page 149, line 44 | |
| | | | |
| o If the server has restarted resulting in loss of all leased state | | o If the server has restarted resulting in loss of all leased state | |
| but the sessionid and client Id are still valid, return | | but the sessionid and client Id are still valid, return | |
| NFS4ERR_STALE_STATEID. (If server restart has resulted in an | | NFS4ERR_STALE_STATEID. (If server restart has resulted in an | |
| invalid client ID or sessionid is invalid, SEQUENCE will return an | | invalid client ID or sessionid is invalid, SEQUENCE will return an | |
| error - not NFS4ERR_STALE_STATEID - and the operation that takes a | | error - not NFS4ERR_STALE_STATEID - and the operation that takes a | |
| stateid as an argument will never be processed.) | | stateid as an argument will never be processed.) | |
| | | | |
| o If the "other" field is all zeros or all ones, check that the | | o If the "other" field is all zeros or all ones, check that the | |
| "other" and "seqid" match a defined combination for a special | | "other" and "seqid" match a defined combination for a special | |
|
| stateid and that the stateid can be used in the current context. | | stateid and that the stateid can be used in the current context. | |
| If not, then return NFS4ERR_BAD_STATEID. | | If not, then return NFS4ERR_BAD_STATEID. | |
| | | | |
| o If the "seqid" field is not zero, and it is greater than the | | o If the "seqid" field is not zero, and it is greater than the | |
| current sequence value corresponding to the current "other" field, | | current sequence value corresponding to the current "other" field, | |
| return NFS4ERR_BAD_STATEID. | | return NFS4ERR_BAD_STATEID. | |
| | | | |
| o If the "seqid" field is not zero, and it is less than the current | | o If the "seqid" field is not zero, and it is less than the current | |
| sequence value corresponding to the current "other" field, return | | sequence value corresponding to the current "other" field, return | |
| NFS4ERR_OLD_STATEID. | | NFS4ERR_OLD_STATEID. | |
| | | | |
| | | | |
| skipping to change at page 148, line 50 | | skipping to change at page 151, line 11 | |
| otherwise unreachable. It is not a mechanism for cache consistency | | otherwise unreachable. It is not a mechanism for cache consistency | |
| and lease renewals may not be denied if the lease interval has not | | and lease renewals may not be denied if the lease interval has not | |
| expired. | | expired. | |
| | | | |
| Since each session is associated with a specific client, any | | Since each session is associated with a specific client, any | |
| operation issued on that session is an indication that the associated | | operation issued on that session is an indication that the associated | |
| client is reachable. When a request is issued for a given session, | | client is reachable. When a request is issued for a given session, | |
| successful execution of a SEQUENCE operation (or successful retrieval | | successful execution of a SEQUENCE operation (or successful retrieval | |
| of the result of SEQUENCE from the reply cache) will result in all | | of the result of SEQUENCE from the reply cache) will result in all | |
| leases for the associated client being implicitly renewed. In | | leases for the associated client being implicitly renewed. In | |
|
| addition, whenever a new stateid is created ot updated (i.e. returned | | addition, whenever a new stateid is created or updated (i.e. returned | |
| with a new seqid value), all leases for the associated client are also | | with a new seqid value), all leases for the associated client are also | |
| renewed. This approach allows for low overhead lease renewal which | | renewed. This approach allows for low overhead lease renewal which | |
| scales well. In the typical case no extra RPC calls are required for | | scales well. In the typical case no extra RPC calls are required for | |
| lease renewal and in the worst case one RPC is required every lease | | lease renewal and in the worst case one RPC is required every lease | |
| period, via a COMPOUND that consists solely of a single SEQUENCE | | period, via a COMPOUND that consists solely of a single SEQUENCE | |
| operation. The number of locks held by the client is not a factor | | operation. The number of locks held by the client is not a factor | |
| since all state for the client is involved with the lease renewal | | since all state for the client is involved with the lease renewal | |
| action. | | action. | |
| | | | |
| Since all operations that create a new lease also renew existing | | Since all operations that create a new lease also renew existing | |
| leases, the server must maintain a common lease expiration time for | | leases, the server must maintain a common lease expiration time for | |
| all valid leases for a given client. This lease time can then be | | all valid leases for a given client. This lease time can then be | |
| easily updated upon implicit lease renewal actions. | | easily updated upon implicit lease renewal actions. | |
| | | | |
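| | | The following non-normative sketch illustrates the single | |
| | | expiration time per client described above; the structure and | |
| | | helpers are hypothetical and ignore clock and concurrency concerns. | |
| | | | |
| | | #include <time.h> | |
| | | | |
| | | struct client_lease { | |
| | |     time_t expires;  /* common expiration for all the client's | |
| | |                       * leases */ | |
| | | }; | |
| | | | |
| | | /* Called on successful SEQUENCE processing and whenever a stateid | |
| | |  * is created or updated for the client. */ | |
| | | static void renew_lease(struct client_lease *cl, time_t now, | |
| | |                         time_t lease_time) | |
| | | { | |
| | |     cl->expires = now + lease_time; | |
| | | } | |
| | | | |
| | | static int lease_expired(const struct client_lease *cl, time_t now) | |
| | | { | |
| | |     return now > cl->expires; | |
| | | } | |
| | | | |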
| 8.4. Crash Recovery | | 8.4. Crash Recovery | |
| | | | |
|
| The important requirement in crash recovery is that both the client | | A critical requirement in crash recovery is that both the client and | |
| and the server know when the other has failed. Additionally, it is | | the server know when the other has failed. Additionally, it is | |
| required that a client sees a consistent view of data across server | | required that a client sees a consistent view of data across server | |
| restarts or reboots. All READ and WRITE operations that may have | | restarts or reboots. All READ and WRITE operations that may have | |
| been queued within the client or network buffers must wait until the | | been queued within the client or network buffers must wait until the | |
| client has successfully recovered the locks protecting the READ and | | client has successfully recovered the locks protecting the READ and | |
|
| WRITE operations. Any that reach the server before it can safely | | WRITE operations. Any that reach the server before the server can | |
| determine that it has re-established enough locking state to be sure | | safely determine that the client has recovered enough locking state | |
| that such requests can be safely processed must be rejected, either | | to be sure that such operations can be safely processed must be | |
| because the state presented is no longer valid | | rejected, either because the state presented is no longer valid | |
| (NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID) or because | | (NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID) or because | |
| subsequent recovery of locks may make execution of the operation | | subsequent recovery of locks may make execution of the operation | |
| inappropriate (NFS4ERR_GRACE). | | inappropriate (NFS4ERR_GRACE). | |
| | | | |
| 8.4.1. Client Failure and Recovery | | 8.4.1. Client Failure and Recovery | |
| | | | |
| In the event that a client fails, the server may release the client's | | In the event that a client fails, the server may release the client's | |
| locks when the associated leases have expired. Conflicting locks | | locks when the associated leases have expired. Conflicting locks | |
| from another client may only be granted after this lease expiration. | | from another client may only be granted after this lease expiration. | |
| When a client has not failed and re-establishes its lease before | | When a client has not failed and re-establishes its lease before | |
| | | | |
| skipping to change at page 151, line 29 | | skipping to change at page 153, line 38 | |
| example, CREATE_SESSION, DESTROY_SESSION) returns | | example, CREATE_SESSION, DESTROY_SESSION) returns | |
| NFS4ERR_STALE_CLIENTID. The client MUST establish a new client | | NFS4ERR_STALE_CLIENTID. The client MUST establish a new client | |
| ID (Section 8.1) and re-establish its lock state | | ID (Section 8.1) and re-establish its lock state | |
| (Section 8.4.2.1). | | (Section 8.4.2.1). | |
| | | | |
| 8.4.2.1. State Reclaim | | 8.4.2.1. State Reclaim | |
| | | | |
| When state information and the associated locks are lost as a result | | When state information and the associated locks are lost as a result | |
| of a server reboot, the protocol must provide a way to cause that | | of a server reboot, the protocol must provide a way to cause that | |
| state to be re-established. The approach used is to define, for most | | state to be re-established. The approach used is to define, for most | |
|
| types of locking state (layouts are an exception, a request whose | | types of locking state (layouts are an exception), a request whose | |
| function is to allow the client to re-establish on the server a lock | | function is to allow the client to re-establish on the server a lock | |
|
| first gotten on a previous instance. Generally these requests are | | first obtained from a previous instance. Generally these requests | |
| variants of the requests normally used to create locks of that type | | are variants of the requests normally used to create locks of that | |
| and are referred to as "reclaim-type" requests and the process of re- | | type and are referred to as "reclaim-type" requests and the process | |
| establishing such locks is referred to as "reclaiming" them. | | of re-establishing such locks is referred to as "reclaiming" them. | |
| | | | |
| Because each client must have an opportunity to reclaim all of the | | Because each client must have an opportunity to reclaim all of the | |
| locks that it has without the possibility that some other client will | | locks that it has without the possibility that some other client will | |
| be granted a conflicting lock, a special period called the "grace | | be granted a conflicting lock, a special period called the "grace | |
| period" is devoted to the reclaim process. During this period, only | | period" is devoted to the reclaim process. During this period, only | |
| reclaim-type locking requests are allowed, unless the server is able | | reclaim-type locking requests are allowed, unless the server is able | |
| to reliably determine (through state persistently maintained across | | to reliably determine (through state persistently maintained across | |
| reboot instances), that granting any such lock cannot possibly | | reboot instances), that granting any such lock cannot possibly | |
| conflict with a subsequent reclaim. When a request is made to obtain | | conflict with a subsequent reclaim. When a request is made to obtain | |
| a new lock (i.e. not a reclaim-type request) during the grace period | | a new lock (i.e. not a reclaim-type request) during the grace period | |
| | | | |
| skipping to change at page 152, line 4 | | skipping to change at page 154, line 13 | |
| reboot instances), that granting any such lock cannot possibly | | reboot instances), that granting any such lock cannot possibly | |
| conflict with a subsequent reclaim. When a request is made to obtain | | conflict with a subsequent reclaim. When a request is made to obtain | |
| a new lock (i.e. not a reclaim-type request) during the grace period | | a new lock (i.e. not a reclaim-type request) during the grace period | |
| and such a determination cannot be made, the server must return the | | and such a determination cannot be made, the server must return the | |
| error NFS4ERR_GRACE. | | error NFS4ERR_GRACE. | |
| | | | |
| Once a session is established using the new client ID, the client | | Once a session is established using the new client ID, the client | |
| will use reclaim-type locking requests (e.g. LOCK requests with | | will use reclaim-type locking requests (e.g. LOCK requests with | |
| reclaim set to true and OPEN operations with a claim type of | | reclaim set to true and OPEN operations with a claim type of | |
| CLAIM_PREVIOUS. See Section 9.8) to re-establish its locking state. | | CLAIM_PREVIOUS. See Section 9.8) to re-establish its locking state. | |
|
| | | | |
| Once this is done, or if there is no such locking state to reclaim, | | Once this is done, or if there is no such locking state to reclaim, | |
|
| the client does a global RECLAIM_COMPLETE operation, i.e. one with | | the client sends a global RECLAIM_COMPLETE operation, i.e. one with | |
| the one_fs argument set to false, to indicate that it has reclaimed | | the one_fs argument set to false, to indicate that it has reclaimed | |
|
| all of the locking state that it will reclaim. Once a client does | | all of the locking state that it will reclaim. Once a client sends | |
| such a RECLAIM_COMPLETE operation, it may attempt non-reclaim locking | | such a RECLAIM_COMPLETE operation, it may attempt non-reclaim locking | |
|
| operations, although it may get NFS4ERR_GRACE errors on these until | | operations, although it may get NFS4ERR_GRACE errors on these until | |
| the period of special handling is over. See Section 11.6.7 for a | | the period of special handling is over. See Section 11.6.7 for a | |
| discussion of the analogous handling of lock reclamation in the case | | discussion of the analogous handling of lock reclamation in the case | |
| of filesystems transitioning from server to server. | | of filesystems transitioning from server to server. | |
| | | | |
| Note that if the client ID persisted through a server reboot (which | | Note that if the client ID persisted through a server reboot (which | |
| will be self-evident if the client never received a | | will be self-evident if the client never received a | |
| NFS4ERR_STALE_CLIENTID error, and instead got | | NFS4ERR_STALE_CLIENTID error, and instead got | |
| SEQ4_STATUS_RESTART_RECLAIM_NEEDED status from SEQUENCE | | SEQ4_STATUS_RESTART_RECLAIM_NEEDED status from SEQUENCE | |
| (Section 18.46.4)), no client ID was re-established. See Paragraph 2 | | (Section 18.46.4)), no client ID was re-established. See Paragraph 2 | |
| of Section 9.8 for discussion of some restrictions on use of upgrade | | of Section 9.8 for discussion of some restrictions on use of upgrade | |
| semantics in connection with reclaim that are the result of some | | semantics in connection with reclaim that are the result of some | |
| issues that apply to this situation. | | issues that apply to this situation. | |
| | | | |
| | | | |
| skipping to change at page 168, line 52 | | skipping to change at page 171, line 11 | |
| When multiple open files on the client are merged into a single open | | When multiple open files on the client are merged into a single open | |
| file object on the server, the close of one of the open files (on the | | file object on the server, the close of one of the open files (on the | |
| client) may necessitate change of the access and deny status of the | | client) may necessitate change of the access and deny status of the | |
| open file on the server. This is because the union of the access and | | open file on the server. This is because the union of the access and | |
| deny bits for the remaining opens may be smaller (i.e. a proper | | deny bits for the remaining opens may be smaller (i.e. a proper | |
| subset) than previously. The OPEN_DOWNGRADE operation is used to | | subset) than previously. The OPEN_DOWNGRADE operation is used to | |
| make the necessary change and the client should use it to update the | | make the necessary change and the client should use it to update the | |
| server so that share reservation requests by other clients are | | server so that share reservation requests by other clients are | |
| handled properly. | | handled properly. | |
| | | | |
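| | | As an illustration of the computation involved, the following C | |
| | | fragment derives the access and deny bits that remain after one of | |
| | | several merged opens goes away, and checks whether they are a proper | |
| | | subset of what the server currently holds, i.e. whether an | |
| | | OPEN_DOWNGRADE is called for. The structure and function names are | |
| | | illustrative only. | |
| | | | |
| | |    #include <stdint.h> | |
| | |    #include <stdbool.h> | |
| | | | |
| | |    struct open_ref { uint32_t access; uint32_t deny; }; | |
| | | | |
| | |    /* Union of the share bits of the opens remaining on the client. */ | |
| | |    static void remaining_bits(const struct open_ref *refs, int n, | |
| | |                               uint32_t *access, uint32_t *deny) | |
| | |    { | |
| | |        *access = 0; | |
| | |        *deny   = 0; | |
| | |        for (int i = 0; i < n; i++) { | |
| | |            *access |= refs[i].access; | |
| | |            *deny   |= refs[i].deny; | |
| | |        } | |
| | |    } | |
| | | | |
| | |    /* True when the remaining bits are a proper subset of what the  */ | |
| | |    /* server currently has, so that OPEN_DOWNGRADE should be sent.  */ | |
| | |    static bool should_downgrade(uint32_t cur_access, uint32_t cur_deny, | |
| | |                                 uint32_t new_access, uint32_t new_deny) | |
| | |    { | |
| | |        bool subset  = (new_access & ~cur_access) == 0 && | |
| | |                       (new_deny   & ~cur_deny)   == 0; | |
| | |        bool smaller = new_access != cur_access || | |
| | |                       new_deny   != cur_deny; | |
| | |        return subset && smaller; | |
| | |    } | |
| | | | |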
|
| Because of the possibility that the client will issue multiple open | | Because of the possibility that the client will issue multiple opens | |
| for the same owner in parallel, it may be the case that an open | | for the same owner in parallel, it may be the case that an open | |
|
| upgrade ay happen without the client knowing beforehand that this | | upgrade may happen without the client knowing beforehand that this | |
| could happen. Because of this possibility, CLOSEs and | | could happen. Because of this possibility, CLOSEs and | |
| OPEN_DOWNGRADEs should generally be issued with a non-zero seqid in | | OPEN_DOWNGRADEs should generally be issued with a non-zero seqid in | |
| the stateid, to avoid the possibility that the status change | | the stateid, to avoid the possibility that the status change | |
|
| associated with an open upgrade is not inadvertanyly lost. | | associated with an open upgrade is inadvertently lost. | |
| | | | |
| 9.8. Reclaim of Open and Byte-range Locks | | 9.8. Reclaim of Open and Byte-range Locks | |
| | | | |
| Special forms of the LOCK and OPEN operations are provided when it is | | Special forms of the LOCK and OPEN operations are provided when it is | |
| necessary to re-establish byte-range locks or opens after a server | | necessary to re-establish byte-range locks or opens after a server | |
| failure. | | failure. | |
| | | | |
| o To reclaim existing opens, an OPEN operation is performed using a | | o To reclaim existing opens, an OPEN operation is performed using a | |
| CLAIM_PREVIOUS. Because the client, in this type of situation, | | CLAIM_PREVIOUS. Because the client, in this type of situation, | |
| will have already opened the file and have the filehandle of the | | will have already opened the file and have the filehandle of the | |
| | | | |
| skipping to change at page 186, line 47 | | skipping to change at page 189, line 27 | |
| delegation voluntarily. The following items of state need to be | | delegation voluntarily. The following items of state need to be | |
| dealt with: | | dealt with: | |
| | | | |
| o If the file associated with the delegation is no longer open and | | o If the file associated with the delegation is no longer open and | |
| no previous CLOSE operation has been sent to the server, a CLOSE | | no previous CLOSE operation has been sent to the server, a CLOSE | |
| operation must be sent to the server. | | operation must be sent to the server. | |
| | | | |
| o If a file has other open references at the client, then OPEN | | o If a file has other open references at the client, then OPEN | |
| operations must be sent to the server. The appropriate stateids | | operations must be sent to the server. The appropriate stateids | |
| will be provided by the server for subsequent use by the client | | will be provided by the server for subsequent use by the client | |
|
| since the delegation stateid will not longer be valid. These OPEN | | since the delegation stateid will no longer be valid. These OPEN | |
| requests are done with the claim type of CLAIM_DELEGATE_CUR. This | | requests are done with the claim type of CLAIM_DELEGATE_CUR. This | |
| will allow the presentation of the delegation stateid so that the | | will allow the presentation of the delegation stateid so that the | |
| client can establish the appropriate rights to perform the OPEN. | | client can establish the appropriate rights to perform the OPEN. | |
| (See Section 18.16, which describes the OPEN operation, for | | (See Section 18.16, which describes the OPEN operation, for | |
| details.) | | details.) | |
|
| | | | |
| o If there are granted file locks, the corresponding LOCK operations | | o If there are granted file locks, the corresponding LOCK operations | |
| need to be performed. This applies to the write open delegation | | need to be performed. This applies to the write open delegation | |
| case only. | | case only. | |
| | | | |
| o For a write open delegation, if at the time of recall the file is | | o For a write open delegation, if at the time of recall the file is | |
| not open for write, all modified data for the file must be flushed | | not open for write, all modified data for the file must be flushed | |
| to the server. If the delegation had not existed, the client | | to the server. If the delegation had not existed, the client | |
| would have done this data flush before the CLOSE operation. | | would have done this data flush before the CLOSE operation. | |
| | | | |
| o For a write open delegation when a file is still open at the time | | o For a write open delegation when a file is still open at the time | |
| | | | |
| skipping to change at page 197, line 10 | | skipping to change at page 199, line 32 | |
| operation change attribute values atomically. When the server is | | operation change attribute values atomically. When the server is | |
| unable to report the before and after values atomically with respect | | unable to report the before and after values atomically with respect | |
| to the directory operation, the server must indicate that fact in the | | to the directory operation, the server must indicate that fact in the | |
| change_info4 return value. When the information is not atomically | | change_info4 return value. When the information is not atomically | |
| reported, the client should not assume that other clients have not | | reported, the client should not assume that other clients have not | |
| changed the directory. | | changed the directory. | |
| | | | |
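| | | The following C sketch shows one way a client might apply the | |
| | | change_info4 result of a directory-modifying operation to its | |
| | | directory cache; it is illustrative only, and the cache structure is | |
| | | an assumption of the sketch rather than anything defined by the | |
| | | protocol. | |
| | | | |
| | |    #include <stdint.h> | |
| | |    #include <stdbool.h> | |
| | | | |
| | |    struct change_info4 { bool atomic; uint64_t before; uint64_t after; }; | |
| | | | |
| | |    struct dir_cache { bool valid; uint64_t change; }; | |
| | | | |
| | |    /* Apply the change_info4 returned by a directory-modifying      */ | |
| | |    /* operation that this client itself performed.                  */ | |
| | |    static void update_dir_cache(struct dir_cache *dc, | |
| | |                                 const struct change_info4 *ci) | |
| | |    { | |
| | |        if (ci->atomic && dc->valid && dc->change == ci->before) { | |
| | |            /* No other change intervened: the cached view plus this */ | |
| | |            /* client's own modification is still accurate.          */ | |
| | |            dc->change = ci->after; | |
| | |        } else { | |
| | |            /* Not reported atomically, or another client may have   */ | |
| | |            /* modified the directory: revalidate before trusting.   */ | |
| | |            dc->valid = false; | |
| | |        } | |
| | |    } | |
| | | | |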
| 11. Multi-Server Namespace | | 11. Multi-Server Namespace | |
| | | | |
| NFSv4.1 supports attributes that allow a namespace to extend beyond | | NFSv4.1 supports attributes that allow a namespace to extend beyond | |
|
| the boundaries of a single server. Use of such multi-server | | the boundaries of a single server. It is recommended that clients | |
| namespaces is optional, and for many purposes, single-server | | and servers support construction of such multi-server namespaces. | |
| namespace are perfectly acceptable. Use of multi-server namespaces | | purposes, single-server namespaces are perfectly acceptable. Use of | |
| can provide many advantages, however, by separating a file system's | | purposes, single-server namespace are perfectly acceptable. Use of | |
| logical position in a namespace from the (possibly changing) | | multi-server namespaces can provide many advantages, however, by | |
| logistical and administrative considerations that result in | | separating a file system's logical position in a namespace from the | |
| particular file systems being located on particular servers. | | (possibly changing) logistical and administrative considerations that | |
| | | result in particular file systems being located on particular | |
| | | servers. | |
| | | | |
|
| 11.1. Location attributes | | 11.1. Location Attributes | |
| | | | |
| NFSv4 contains recommended attributes that allow file systems on one | | NFSv4 contains recommended attributes that allow file systems on one | |
| server to be associated with one or more instances of that file | | server to be associated with one or more instances of that file | |
| system on other servers. These attributes specify such file systems | | system on other servers. These attributes specify such file systems | |
| by specifying a server name (either a DNS name or an IP address) | | by specifying a server name (either a DNS name or an IP address) | |
| together with the path of that file system within that server's | | together with the path of that file system within that server's | |
| single-server namespace. | | single-server namespace. | |
| | | | |
|
| The fs_locations_info recommended attribute allows specification of | | The fs_locations_info RECOMMENDED attribute allows specification of | |
| one more file systems instance locations where the data corresponding | | one or more filesystem instance locations where the data | |
| to a given file system may be found. This attribute provides to the | | corresponding to a given file system may be found. This attribute | |
| client, in addition to information about file system instance | | provides to the client, in addition to information about file system | |
| locations, extensive information about the various file system | | instance locations, significant information about the various file | |
| instance choices (e.g. priority for use, writability, currency, etc.) | | system instance choices (e.g. priority for use, writability, | |
| as well as information to help the client efficiently effect as | | currency, etc.). It also includes information to help the client | |
| seamless a transition as possible among multiple file system | | efficiently effect as seamless a transition as possible among | |
| instances, when and if that should be necessary. | | multiple file system instances, when and if that should be necessary. | |
| | | | |
|
| The fs_locations recommended attribute is inherited from NFSv4.0 and | | The fs_locations RECOMMENDED attribute is inherited from NFSv4.0 and | |
| only allows specification of the file system locations where the data | | only allows specification of the file system locations where the data | |
|
| corresponding to a given file system may be found. Servers should | | corresponding to a given file system may be found. Servers SHOULD | |
| make this attribute available whenever fs_locations_info is | | make this attribute available whenever fs_locations_info is | |
| supported, but client use of fs_locations_info is to be preferred. | | supported, but client use of fs_locations_info is to be preferred. | |
| | | | |
| 11.2. File System Presence or Absence | | 11.2. File System Presence or Absence | |
| | | | |
| A given location in an NFSv4 namespace (typically but not necessarily | | A given location in an NFSv4 namespace (typically but not necessarily | |
| a multi-server namespace) can have a number of file system instance | | a multi-server namespace) can have a number of file system instance | |
| locations associated with it (via the fs_locations or | | locations associated with it (via the fs_locations or | |
| fs_locations_info attribute). There may also be an actual current | | fs_locations_info attribute). There may also be an actual current | |
| file system at that location, accessible via normal namespace | | file system at that location, accessible via normal namespace | |
| operations (e.g. LOOKUP). In this case, the file system is said to | | operations (e.g. LOOKUP). In this case, the file system is said to | |
| be "present" at that position in the namespace and clients will | | be "present" at that position in the namespace and clients will | |
| typically use it, reserving use of additional locations specified via | | typically use it, reserving use of additional locations specified via | |
| the location-related attributes to situations in which the principal | | the location-related attributes to situations in which the principal | |
| location is no longer available. | | location is no longer available. | |
| | | | |
| When there is no actual file system at the namespace location in | | When there is no actual file system at the namespace location in | |
| question, the file system is said to be "absent". An absent file | | question, the file system is said to be "absent". An absent file | |
|
| system contains no files or directories other than the root and any | | system contains no files or directories other than the root. Any | |
| reference to it, except to access a small set of attributes useful in | | reference to it, except to access a small set of attributes useful in | |
| determining alternate locations, will result in an error, | | determining alternate locations, will result in an error, | |
|
| NFS4ERR_MOVED. Note that if the server ever returns NFS4ERR_MOVED | | NFS4ERR_MOVED. Note that if the server ever returns the error | |
| (i.e. file systems may be absent), it MUST support the fs_locations | | NFS4ERR_MOVED, it MUST support the fs_locations attribute and SHOULD | |
| attribute and SHOULD support the fs_locations_info and fs_absent | | support the fs_locations_info and fs_status attributes. | |
| attributes. | | | |
| | | | |
| While the error name suggests that we have a case of a file system | | While the error name suggests that we have a case of a file system | |
| which once was present, and has only become absent later, this is | | which once was present, and has only become absent later, this is | |
| only one possibility. A position in the namespace may be permanently | | only one possibility. A position in the namespace may be permanently | |
|
| absent with the file system(s) designated by the location attributes | | absent with the set of file system(s) designated by the location | |
| the only realization. The name NFS4ERR_MOVED reflects an earlier, | | attributes being the only realization. The name NFS4ERR_MOVED | |
| more limited conception of its function, but this error will be | | reflects an earlier, more limited conception of its function, but | |
| returned whenever the referenced file system is absent, whether it | | this error will be returned whenever the referenced file system is | |
| has moved or not. | | absent, whether it has moved or not. | |
| | | | |
| Except in the case of GETATTR-type operations (to be discussed | | Except in the case of GETATTR-type operations (to be discussed | |
| later), when the current filehandle at the start of an operation is | | later), when the current filehandle at the start of an operation is | |
| within an absent file system, that operation is not performed and the | | within an absent file system, that operation is not performed and the | |
| error NFS4ERR_MOVED is returned, to indicate that the file system is | | error NFS4ERR_MOVED is returned, to indicate that the file system is | |
| absent on the current server. | | absent on the current server. | |
| | | | |
| Because a GETFH cannot succeed if the current filehandle is within an | | Because a GETFH cannot succeed if the current filehandle is within an | |
| absent file system, filehandles within an absent file system cannot | | absent file system, filehandles within an absent file system cannot | |
| be transferred to the client. When a client does have filehandles | | be transferred to the client. When a client does have filehandles | |
| within an absent file system, it is the result of obtaining them when | | within an absent file system, it is the result of obtaining them when | |
| the file system was present, and having the file system become absent | | the file system was present, and having the file system become absent | |
| subsequently. | | subsequently. | |
| | | | |
| It should be noted that because the check for the current filehandle | | It should be noted that because the check for the current filehandle | |
| being within an absent file system happens at the start of every | | being within an absent file system happens at the start of every | |
|
| operation, operations which change the current filehandle so that it | | operation, operations that change the current filehandle so that it | |
| is within an absent file system will not result in an error. This | | is within an absent file system will not result in an error. This | |
| allows such combinations as PUTFH-GETATTR and LOOKUP-GETATTR to be | | allows such combinations as PUTFH-GETATTR and LOOKUP-GETATTR to be | |
| used to get attribute information, particularly location attribute | | used to get attribute information, particularly location attribute | |
| information, as discussed below. | | information, as discussed below. | |
| | | | |
|
| The recommended file system attribute fs_absent can used to | | The recommended file system attribute fs_status can be used to | |
| interrogate the present/absent status of a given file system. | | interrogate the present/absent status of a given file system. | |
| | | | |
| 11.3. Getting Attributes for an Absent File System | | 11.3. Getting Attributes for an Absent File System | |
| | | | |
| When a file system is absent, most attributes are not available, but | | When a file system is absent, most attributes are not available, but | |
| it is necessary to allow the client access to the small set of | | it is necessary to allow the client access to the small set of | |
| attributes that are available, and most particularly those that give | | attributes that are available, and most particularly those that give | |
| information about the correct current locations for this file system, | | information about the correct current locations for this file system, | |
| fs_locations and fs_locations_info. | | fs_locations and fs_locations_info. | |
| | | | |
| 11.3.1. GETATTR Within an Absent File System | | 11.3.1. GETATTR Within an Absent File System | |
| | | | |
| As mentioned above, an exception is made for GETATTR in that | | As mentioned above, an exception is made for GETATTR in that | |
| attributes may be obtained for a filehandle within an absent file | | attributes may be obtained for a filehandle within an absent file | |
| system. This exception only applies if the attribute mask contains | | system. This exception only applies if the attribute mask contains | |
| at least one attribute bit that indicates the client is interested in | | at least one attribute bit that indicates the client is interested in | |
| a result regarding an absent file system: fs_locations, | | a result regarding an absent file system: fs_locations, | |
|
| fs_locations_info, or fs_absent. If none of these attributes is | | fs_locations_info, or fs_status. If none of these attributes is | |
| requested, GETATTR will result in an NFS4ERR_MOVED error. | | requested, GETATTR will result in an NFS4ERR_MOVED error. | |
| | | | |
| When a GETATTR is done on an absent file system, the set of supported | | When a GETATTR is done on an absent file system, the set of supported | |
| attributes is very limited. Many attributes, including those that | | attributes is very limited. Many attributes, including those that | |
|
| are normally mandatory will not be available on an absent file | | are normally mandatory, will not be available on an absent file | |
| system. In addition to the attributes mentioned above (fs_locations, | | system. In addition to the attributes mentioned above (fs_locations, | |
|
| fs_locations_info, fs_absent), the following attributes SHOULD be | | fs_locations_info, fs_status), the following attributes SHOULD be | |
| available on absent file systems, in the case of recommended | | available on absent file systems, in the case of recommended | |
| attributes at least to the same degree that they are available on | | attributes at least to the same degree that they are available on | |
| present file systems. | | present file systems. | |
| | | | |
|
| change: This attribute is useful for absent file systems and can be | | change_policy: This attribute is useful for absent file systems and | |
| helpful in summarizing to the client when any of the location- | | can be helpful in summarizing to the client when any of the | |
| related attributes changes. | | location-related attributes changes. | |
| | | | |
| fsid: This attribute should be provided so that the client can | | fsid: This attribute should be provided so that the client can | |
| determine file system boundaries, including, in particular, the | | determine file system boundaries, including, in particular, the | |
|
| boundary between present and absent file systems. | | boundary between present and absent file systems. This value must | |
| | | be different from any other fsid on the current server and need | |
| | | have no particular relationship to fsids on any particular | |
| | | destination to which the client might be directed. | |
| | | | |
| mounted_on_fileid: For objects at the top of an absent file system | | mounted_on_fileid: For objects at the top of an absent file system | |
| this attribute needs to be available. Since the fileid is one | | this attribute needs to be available. Since the fileid is one | |
| which is within the present parent file system, there should be no | | which is within the present parent file system, there should be no | |
| need to reference the absent file system to provide this | | need to reference the absent file system to provide this | |
| information. | | information. | |
| | | | |
| Other attributes SHOULD NOT be made available for absent file | | Other attributes SHOULD NOT be made available for absent file | |
| systems, even when it is possible to provide them. The server should | | systems, even when it is possible to provide them. The server should | |
| not assume that more information is always better and should avoid | | not assume that more information is always better and should avoid | |
| gratuitously providing additional information. | | gratuitously providing additional information. | |
| | | | |
| When a GETATTR operation includes a bit mask for one of the | | When a GETATTR operation includes a bit mask for one of the | |
|
| attributes fs_locations, fs_locations_info, or absent, but where the | | attributes fs_locations, fs_locations_info, or fs_status, but where | |
| bit mask includes attributes which are not supported, GETATTR will | | the bit mask includes attributes which are not supported, GETATTR | |
| not return an error, but will return the mask of the actual | | will not return an error, but will return the mask of the actual | |
| attributes supported with the results. | | attributes supported with the results. | |
| | | | |
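| | | The decision described above can be summarized by the following | |
| | | illustrative C fragment; the booleans stand for the presence of the | |
| | | corresponding bits in the request mask, and only the error code | |
| | | value is taken from the protocol. | |
| | | | |
| | |    #include <stdbool.h> | |
| | | | |
| | |    #define NFS4_OK           0 | |
| | |    #define NFS4ERR_MOVED 10019 | |
| | | | |
| | |    static int getattr_absent_fs(bool wants_fs_locations, | |
| | |                                 bool wants_fs_locations_info, | |
| | |                                 bool wants_fs_status) | |
| | |    { | |
| | |        /* The exception applies only if the client asked for at     */ | |
| | |        /* least one attribute meaningful for an absent file system. */ | |
| | |        if (!(wants_fs_locations || wants_fs_locations_info || | |
| | |              wants_fs_status)) | |
| | |            return NFS4ERR_MOVED; | |
| | | | |
| | |        /* Otherwise proceed, returning only the attributes actually */ | |
| | |        /* supported; unsupported bits are simply omitted from the   */ | |
| | |        /* result mask rather than causing an error.                 */ | |
| | |        return NFS4_OK; | |
| | |    } | |
| | | | |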
| Handling of VERIFY/NVERIFY is similar to GETATTR in that if the | | Handling of VERIFY/NVERIFY is similar to GETATTR in that if the | |
| attribute mask does not include fs_locations, fs_locations_info, or | | attribute mask does not include fs_locations, fs_locations_info, or | |
|
| fs_absent, the error NFS4ERR_MOVED will result. It differs in that | | fs_status, the error NFS4ERR_MOVED will result. It differs in that | |
| any appearance in the attribute mask of an attribute not supported | | any appearance in the attribute mask of an attribute not supported | |
| for an absent file system (and note that this will include some | | for an absent file system (and note that this will include some | |
| normally mandatory attributes), will also cause an NFS4ERR_MOVED | | normally mandatory attributes), will also cause an NFS4ERR_MOVED | |
| result. | | result. | |
| | | | |
| 11.3.2. READDIR and Absent File Systems | | 11.3.2. READDIR and Absent File Systems | |
| | | | |
| A READDIR performed when the current filehandle is within an absent | | A READDIR performed when the current filehandle is within an absent | |
| file system will result in an NFS4ERR_MOVED error, since, unlike the | | file system will result in an NFS4ERR_MOVED error, since, unlike the | |
| case of GETATTR, no such exception is made for READDIR. | | case of GETATTR, no such exception is made for READDIR. | |
| | | | |
| Attributes for an absent file system may be fetched via a READDIR for | | Attributes for an absent file system may be fetched via a READDIR for | |
| a directory in a present file system, when that directory contains | | a directory in a present file system, when that directory contains | |
| the root directories of one or more absent file systems. In this | | the root directories of one or more absent file systems. In this | |
| case, the handling is as follows: | | case, the handling is as follows: | |
| | | | |
| o If the attribute set requested includes one of the attributes | | o If the attribute set requested includes one of the attributes | |
|
| fs_locations, fs_locations_info, or fs_absent, then fetching of | | fs_locations, fs_locations_info, or fs_status, then fetching of | |
| attributes proceeds normally and no NFS4ERR_MOVED indication is | | attributes proceeds normally and no NFS4ERR_MOVED indication is | |
| returned, even when the rdattr_error attribute is requested. | | returned, even when the rdattr_error attribute is requested. | |
| | | | |
| o If the attribute set requested does not include one of the | | o If the attribute set requested does not include one of the | |
|
| attributes fs_locations, fs_locations_info, or fs_absent, then if | | attributes fs_locations, fs_locations_info, or fs_status, then if | |
| the rdattr_error attribute is requested, each directory entry for | | the rdattr_error attribute is requested, each directory entry for | |
| the root of an absent file system, will report NFS4ERR_MOVED as | | the root of an absent file system, will report NFS4ERR_MOVED as | |
| the value of the rdattr_error attribute. | | the value of the rdattr_error attribute. | |
| | | | |
| o If the attribute set requested does not include any of the | | o If the attribute set requested does not include any of the | |
|
| attributes fs_locations, fs_locations_info, fs_absent, or | | attributes fs_locations, fs_locations_info, fs_status, or | |
| rdattr_error then the occurrence of the root of an absent file | | rdattr_error then the occurrence of the root of an absent file | |
| system within the directory will result in the READDIR failing | | system within the directory will result in the READDIR failing | |
|
| with an NFSERR_MOVED error. | | with an NFS4ERR_MOVED error. | |
| | | | |
| o The unavailability of an attribute because of a file system's | | o The unavailability of an attribute because of a file system's | |
| absence, even one that is ordinarily mandatory, does not result in | | absence, even one that is ordinarily mandatory, does not result in | |
| any error indication. The set of attributes returned for the root | | any error indication. The set of attributes returned for the root | |
| directory of the absent file system in that case is simply | | directory of the absent file system in that case is simply | |
| restricted to those actually available. | | restricted to those actually available. | |
| | | | |
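| | | The rules above for a READDIR entry that is the root of an absent | |
| | | file system amount to the following illustrative decision logic, | |
| | | written here in C with placeholder types; it is not the normative | |
| | | XDR. | |
| | | | |
| | |    #include <stdbool.h> | |
| | | | |
| | |    #define NFS4_OK           0 | |
| | |    #define NFS4ERR_MOVED 10019 | |
| | | | |
| | |    struct entry_result { | |
| | |        int  readdir_status;      /* status of the READDIR itself   */ | |
| | |        bool report_rdattr_error; /* rdattr_error = NFS4ERR_MOVED?  */ | |
| | |    }; | |
| | | | |
| | |    /* wants_location: fs_locations, fs_locations_info or fs_status  */ | |
| | |    /* was requested; wants_rdattr_error: rdattr_error was requested.*/ | |
| | |    static struct entry_result | |
| | |    absent_root_entry(bool wants_location, bool wants_rdattr_error) | |
| | |    { | |
| | |        struct entry_result r = { NFS4_OK, false }; | |
| | | | |
| | |        if (wants_location)          /* attributes fetched normally  */ | |
| | |            return r; | |
| | |        if (wants_rdattr_error) {    /* flag the entry, do not fail  */ | |
| | |            r.report_rdattr_error = true; | |
| | |            return r; | |
| | |        } | |
| | |        r.readdir_status = NFS4ERR_MOVED; /* otherwise READDIR fails */ | |
| | |        return r; | |
| | |    } | |
| | | | |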
| 11.4. Uses of Location Information | | 11.4. Uses of Location Information | |
| | | | |
| The location-bearing attributes (fs_locations and fs_locations_info) | | The location-bearing attributes (fs_locations and fs_locations_info) | |
| provide, together with the possibility of absent file systems, a | | provide, together with the possibility of absent file systems, a | |
| number of important facilities in providing reliable, manageable, and | | number of important facilities in providing reliable, manageable, and | |
| scalable data access. | | scalable data access. | |
| | | | |
|
| When a file system is present, these attribute can provide | | When a file system is present, these attributes can provide | |
| alternative locations, to be used to access the same data, in the | | alternative locations, to be used to access the same data, in the | |
|
| event that server failures, communications problems, or other | | event of server failures, communications problems, or other | |
| difficulties, make continued access to the current file system | | difficulties that make continued access to the current file system | |
| impossible or otherwise impractical. Under some circumstances | | impossible or otherwise impractical. Under some circumstances | |
| multiple alternative locations may be used simultaneously to provide | | multiple alternative locations may be used simultaneously to provide | |
| higher performance access to the file system in question. Provision | | higher performance access to the file system in question. Provision | |
| of such alternate locations is referred to as "replication" although | | of such alternate locations is referred to as "replication" although | |
| there are cases in which replicated sets of data are not in fact | | there are cases in which replicated sets of data are not in fact | |
| present, and the replicas are instead different paths to the same | | present, and the replicas are instead different paths to the same | |
| data. | | data. | |
| | | | |
| When a file system is present and becomes absent, clients can be | | When a file system is present and becomes absent, clients can be | |
| given the opportunity to have continued access to their data, at an | | given the opportunity to have continued access to their data, at an | |
| alternate location. In this case, a continued attempt to use the | | alternate location. In this case, a continued attempt to use the | |
|
| data in the now-absent file system will result in an NFSERR_MOVED | | data in the now-absent file system will result in an NFS4ERR_MOVED | |
| error and at that point the successor locations (typically only one | | error and at that point the successor locations (typically only one | |
| but multiple choices are possible) can be fetched and used to | | but multiple choices are possible) can be fetched and used to | |
| continue access. Transfer of the file system contents to the new | | continue access. Transfer of the file system contents to the new | |
| location is referred to as "migration", but it should be kept in mind | | location is referred to as "migration", but it should be kept in mind | |
| that there are cases in which this term can be used, like | | that there are cases in which this term can be used, like | |
| "replication", when there is no actual data migration per se. | | "replication", when there is no actual data migration per se. | |
| | | | |
| Where a file system was not previously present, specification of file | | Where a file system was not previously present, specification of file | |
| system location provides a means by which file systems located on one | | system location provides a means by which file systems located on one | |
| server can be associated with a namespace defined by another server, | | server can be associated with a namespace defined by another server, | |
| thus allowing a general multi-server namespace facility. Designation | | thus allowing a general multi-server namespace facility. Designation | |
| of such a location, in place of an absent file system, is called | | of such a location, in place of an absent file system, is called | |
| "referral". | | "referral". | |
| | | | |
|
| | | Because client support for location-related attributes is OPTIONAL, a | |
| | | server may (but is not required to) take action to hide migration and | |
| | | referral events from such clients, by acting as a proxy, for example. | |
| | | The server can determine the presence of client support from data | |
| | | passed in the EXCHANGE_ID operation (See Section 18.35.4). | |
| | | | |
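| | | As a sketch of how a server might use that information, the | |
| | | following C fragment tests the EXCHANGE_ID flags announcing client | |
| | | support for referral and migration. The flag names follow the | |
| | | EXCHANGE_ID description; the numeric values and helper functions are | |
| | | assumptions of this sketch. | |
| | | | |
| | |    #include <stdint.h> | |
| | |    #include <stdbool.h> | |
| | | | |
| | |    /* Values assumed here; see the EXCHANGE_ID description.         */ | |
| | |    #define EXCHGID4_FLAG_SUPP_MOVED_REFER 0x00000001 | |
| | |    #define EXCHGID4_FLAG_SUPP_MOVED_MIGR  0x00000002 | |
| | | | |
| | |    /* True when the client did not announce referral support, so    */ | |
| | |    /* the server would have to hide referral events (for example,   */ | |
| | |    /* by acting as a proxy) for this client.                        */ | |
| | |    static bool client_lacks_referral_support(uint32_t eia_flags) | |
| | |    { | |
| | |        return (eia_flags & EXCHGID4_FLAG_SUPP_MOVED_REFER) == 0; | |
| | |    } | |
| | | | |
| | |    static bool client_lacks_migration_support(uint32_t eia_flags) | |
| | |    { | |
| | |        return (eia_flags & EXCHGID4_FLAG_SUPP_MOVED_MIGR) == 0; | |
| | |    } | |
| | | | |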
| 11.4.1. File System Replication | | 11.4.1. File System Replication | |
| | | | |
| The fs_locations and fs_locations_info attributes provide alternative | | The fs_locations and fs_locations_info attributes provide alternative | |
|
| locations, to be used to access data in place of or in a addition to | | locations, to be used to access data in place of or in addition to | |
| the current file system instance. On first access to a file system, | | the current file system instance. On first access to a file system, | |
|
| the client should obtain the value of the set alternate locations by | | the client should obtain the value of the set of alternate locations | |
| interrogating the fs_locations or fs_locations_info attribute, with | | by interrogating the fs_locations or fs_locations_info attribute, | |
| the latter being preferred. | | with the latter being preferred. | |
| | | | |
| In the event that server failures, communications problems, or other | | In the event that server failures, communications problems, or other | |
|
| difficulties, make continued access to the current file system | | difficulties make continued access to the current file system | |
| impossible or otherwise impractical, the client can use the alternate | | impossible or otherwise impractical, the client can use the alternate | |
| locations as a way to get continued access to its data. Depending on | | locations as a way to get continued access to its data. Depending on | |
| specific attributes of these alternate locations, as indicated within | | specific attributes of these alternate locations, as indicated within | |
| the fs_locations_info attribute, multiple locations may be used | | the fs_locations_info attribute, multiple locations may be used | |
| simultaneously, to provide higher performance through the | | simultaneously, to provide higher performance through the | |
| exploitation of multiple paths between client and target file system. | | exploitation of multiple paths between client and target file system. | |
| | | | |
| The alternate locations may be physical replicas of the (typically | | The alternate locations may be physical replicas of the (typically | |
| read-only) file system data, or they may reflect alternate paths to | | read-only) file system data, or they may reflect alternate paths to | |
|
| the same server or provide for the use of various form of server | | the same server or provide for the use of various forms of server | |
| clustering in which multiple servers provide alternate ways of | | clustering in which multiple servers provide alternate ways of | |
| accessing the same physical file system. How these different modes | | accessing the same physical file system. How these different modes | |
| of file system transition are represented within the fs_locations and | | of file system transition are represented within the fs_locations and | |
| fs_locations_info attributes and how the client deals with file | | fs_locations_info attributes and how the client deals with file | |
| system transition issues will be discussed in detail below. | | system transition issues will be discussed in detail below. | |
| | | | |
|
| When multiple server addresses correspond to the same actual server, | | Multiple server addresses may correspond to the same actual server, | |
| as shown by a common so_major_id field within the eir_server_owner | | as shown by a common so_major_id field within the eir_server_owner | |
|
| field returned by EXCHANGE_ID, the client may assume that for each | | field returned by EXCHANGE_ID (see Section 18.35.4). When such | |
| file system in the namespace of a given server network address, there | | server addresses exist, the client may assume that for each file | |
| | | system in the namespace of a given server network address, there | |
| exist file systems at corresponding namespace locations for each of | | exist file systems at corresponding namespace locations for each of | |
|
| the other server network addresses, even in the absence of explicit | | the other server network addresses. It may do this even in the | |
| listing in fs_locations and fs_locations_info. Such corresponding | | absence of explicit listing in fs_locations and fs_locations_info. | |
| file system locations can be used as alternate locations, just as | | Such corresponding file system locations can be used as alternate | |
| those explicitly specified via the fs_locations and fs_locations_info | | locations, just as those explicitly specified via the fs_locations | |
| attributes. Where these specific locations are designated in the | | and fs_locations_info attributes. Where these specific locations are | |
| fs_locations_info attribute, the conditions of use specified in this | | designated in the fs_locations_info attribute, the conditions of use | |
| attribute (e.g. priorities, specification of simultaneous use) may | | specified in this attribute (e.g. priorities, specification of | |
| limit the clients use of these alternate locations. | | simultaneous use) may limit the client's use of these alternate | |
| | | locations. | |
| When multiple replicas exist and are used simultaneously or in | | | |
| succession by a client, they must designate the same data (with | | | |
| metadata being the same to the degree indicated by the | | | |
| fs_locations_info attribute). Where file systems are writable, a | | | |
| change made on one instance must be visible on all instances, | | | |
| immediately upon the earlier of the return of the modifying request | | | |
| or the visibility of that change on any of the associated replicas. | | | |
| Where a file system is not writable but represents a read-only copy | | | |
| (possibly periodically updated) of a writable file system, similar | | | |
| requirements apply to the propagation of updates. It must be | | | |
| guaranteed that any change visible on the original file system | | | |
| instance must be immediately visible on any replica before the client | | | |
| transitions access to that replica, to avoid any possibility, that a | | | |
| client in effecting a transition to a replica, will see any reversion | | | |
| in file system state. The specific means by which this will be | | | |
| prevented varies based on fs4_status_type reported as part of the | | | |
| fs_status attribute. (See Section 11.11). | | | |
| | | | |
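| | | A client acting on the above has to decide whether two network | |
| | | addresses belong to the same server by comparing the so_major_id | |
| | | values returned by EXCHANGE_ID. The following C fragment shows the | |
| | | comparison; the structure is a simplification used for illustration, | |
| | | not the eir_server_owner XDR. | |
| | | | |
| | |    #include <stdbool.h> | |
| | |    #include <stddef.h> | |
| | |    #include <string.h> | |
| | | | |
| | |    struct major_id { size_t len; const unsigned char *data; }; | |
| | | | |
| | |    static bool same_server_major_id(const struct major_id *a, | |
| | |                                     const struct major_id *b) | |
| | |    { | |
| | |        /* Opaque values match only if they have the same length and */ | |
| | |        /* identical contents.                                       */ | |
| | |        return a->len == b->len && | |
| | |               memcmp(a->data, b->data, a->len) == 0; | |
| | |    } | |
| | | | |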
| 11.4.2. File System Migration | | 11.4.2. File System Migration | |
| | | | |
| When a file system is present and becomes absent, clients can be | | When a file system is present and becomes absent, clients can be | |
| given the opportunity to have continued access to their data, at an | | given the opportunity to have continued access to their data, at an | |
| alternate location, as specified by the fs_locations or | | alternate location, as specified by the fs_locations or | |
| fs_locations_info attribute. Typically, a client will be accessing | | fs_locations_info attribute. Typically, a client will be accessing | |
| the file system in question, get an NFS4ERR_MOVED error, and then use | | the file system in question, get an NFS4ERR_MOVED error, and then use | |
| the fs_locations or fs_locations_info attribute to determine the new | | the fs_locations or fs_locations_info attribute to determine the new | |
| location of the data. When fs_locations_info is used, additional | | location of the data. When fs_locations_info is used, additional | |
| | | | |
| skipping to change at page 203, line 34 | | skipping to change at page 205, line 46 | |
| | | | |
| The new location may be an alternate communication path to the same | | The new location may be an alternate communication path to the same | |
| server, or, in the case of various forms of server clustering, | | server, or, in the case of various forms of server clustering, | |
| another server providing access to the same physical file system. | | another server providing access to the same physical file system. | |
| The client's responsibilities in dealing with this transition depend | | The client's responsibilities in dealing with this transition depend | |
| on the specific nature of the new access path and how and whether | | on the specific nature of the new access path and how and whether | |
| data was in fact migrated. These issues will be discussed in detail | | data was in fact migrated. These issues will be discussed in detail | |
| below. | | below. | |
| | | | |
| When multiple server addresses correspond to the same actual server, | | When multiple server addresses correspond to the same actual server, | |
|
| as shown by a common value for so_major_id field of the | | as shown by a common value for the so_major_id field of the | |
| eir_server_owner field returned by EXCHANGE_ID, the location or | | eir_server_owner field returned by EXCHANGE_ID, the location or | |
| locations may designate alternate server addresses in the form of | | locations may designate alternate server addresses in the form of | |
|
| specific server network addresses, when the file system in question | | specific server network addresses. These could be used to access the | |
| is available at those addresses, and no longer accessible at the | | file system in question at those addresses and when it is no longer | |
| original address. | | accessible at the original address. | |
| | | | |
| Although a single successor location is typical, multiple locations | | Although a single successor location is typical, multiple locations | |
| may be provided, together with information that allows priority among | | may be provided, together with information that allows priority among | |
| the choices to be indicated, via information in the fs_locations_info | | the choices to be indicated, via information in the fs_locations_info | |
| attribute. Where suitable clustering mechanisms make it possible to | | attribute. Where suitable clustering mechanisms make it possible to | |
| provide multiple identical file systems or paths to them, this allows | | provide multiple identical file systems or paths to them, this allows | |
| the client the opportunity to deal with any resource or | | the client the opportunity to deal with any resource or | |
| communications issues that might limit data availability. | | communications issues that might limit data availability. | |
| | | | |
| When an alternate location is designated as the target for migration, | | When an alternate location is designated as the target for migration, | |
| | | | |
| skipping to change at page 204, line 14 | | skipping to change at page 206, line 27 | |
| be visible on all migration targets. Where a file system is not | | be visible on all migration targets. Where a file system is not | |
| writable but represents a read-only copy (possibly periodically | | writable but represents a read-only copy (possibly periodically | |
| updated) of a writable file system, similar requirements apply to the | | updated) of a writable file system, similar requirements apply to the | |
| propagation of updates. Any change visible in the original file | | propagation of updates. Any change visible in the original file | |
| system must already be effected on all migration targets, to avoid | | system must already be effected on all migration targets, to avoid | |
| any possibility that a client, in effecting a transition to the | | any possibility that a client, in effecting a transition to the | |
| migration target, will see any reversion in file system state. | | migration target, will see any reversion in file system state. | |
| | | | |
| 11.4.3. Referrals | | 11.4.3. Referrals | |
| | | | |
|
| Referrals provide a way of placing a file system in a location | | Referrals provide a way of placing a file system in a location within | |
| essentially without respect to its physical location on a given | | the namespace essentially without respect to its physical location on | |
| server. This allows a single server of a set of servers to present a | | a given server. This allows a single server or a set of servers to | |
| multi-server namespace that encompasses file systems located on | | present a multi-server namespace that encompasses file systems | |
| multiple servers. Some likely uses of this include establishment of | | located on multiple servers. Some likely uses of this include | |
| site-wide or organization-wide namespaces, or even knitting such | | establishment of site-wide or organization-wide namespaces, or even | |
| together into a truly global namespace. | | knitting such together into a truly global namespace. | |
| | | | |
| Referrals occur when a client determines, upon first referencing a | | Referrals occur when a client determines, upon first referencing a | |
| position in the current namespace, that it is part of a new file | | position in the current namespace, that it is part of a new file | |
| system and that that file system is absent. When this occurs, | | system and that that file system is absent. When this occurs, | |
| typically by receiving the error NFS4ERR_MOVED, the actual location | | typically by receiving the error NFS4ERR_MOVED, the actual location | |
| or locations of the file system can be determined by fetching the | | or locations of the file system can be determined by fetching the | |
| fs_locations or fs_locations_info attribute. | | fs_locations or fs_locations_info attribute. | |
| | | | |
| The locations-related attribute may designate a single file system | | The locations-related attribute may designate a single file system | |
| location or multiple file system locations, to be selected based on | | location or multiple file system locations, to be selected based on | |
| the needs of the client. The server, in the fs_locations_info | | the needs of the client. The server, in the fs_locations_info | |
| attribute, may specify priorities to be associated with various file | | attribute, may specify priorities to be associated with various file | |
| system location choices. The server may assign different priorities | | system location choices. The server may assign different priorities | |
| to different locations as reported to individual clients, in order to | | to different locations as reported to individual clients, in order to | |
| adapt to client physical location or to effect load balancing. When | | adapt to client physical location or to effect load balancing. When | |
| both read-only and read-write file systems are present, some of the | | both read-only and read-write file systems are present, some of the | |
|
| read-only locations may not absolutely up-to-date (as they would have | | read-only locations may not be absolutely up-to-date (as they would | |
| to be in the case of replication and migration). Servers may also | | have to be in the case of replication and migration). Servers may | |
| specify file system locations that include client-substituted | | also specify file system locations that include client-substituted | |
| variable so that different clients are referred to different file | | variables so that different clients are referred to different file | |
| systems (with different data contents) based on client attributes | | systems (with different data contents) based on client attributes | |
|
| such as cpu architecture. | | such as CPU architecture. | |
| | | | |
| | | When the fs_locations_info attribute indicates that there are | |
| | | multiple possible targets listed, the relationships among them may be | |
| | | important to the client in selecting the one to use. The same rules | |
| | | specified in Section 11.4.1, defining the appropriate standards for | |
| | | data propagation, apply to these multiple replicas as well. For | |
| | | example, the client might prefer a writable file system that has | |
| | | additional writable replicas to which it subsequently might switch. | |
| | | Note that, as distinguished from the case of replication, there is | |
| | | no need to deal with the case of propagation of updates made by the | |
| | | current client, since the current client has not accessed the file | |
| | | system in question. | |
| | | | |
| Use of multi-server namespaces is enabled by NFSv4 but is not | | Use of multi-server namespaces is enabled by NFSv4 but is not | |
| required. The use of multi-server namespaces and their scope will | | required. The use of multi-server namespaces and their scope will | |
| depend on the applications used, and system administration | | depend on the applications used, and system administration | |
| preferences. | | preferences. | |
| | | | |
| Multi-server namespaces can be established by a single server | | Multi-server namespaces can be established by a single server | |
| providing a large set of referrals to all of the included file | | providing a large set of referrals to all of the included file | |
| systems. Alternatively, a single multi-server namespace may be | | systems. Alternatively, a single multi-server namespace may be | |
| administratively segmented with separate referral file systems (on | | administratively segmented with separate referral file systems (on | |
| separate servers) for each separately-administered section of the | | separate servers) for each separately-administered section of the | |
| namespace. Any segment or the top-level referral file system may use | | namespace. Any segment or the top-level referral file system may use | |
| replicated referral file systems for higher availability. | | replicated referral file systems for higher availability. | |
| | | | |
| Generally, multi-server namespaces are for the most part uniform, in | | Generally, multi-server namespaces are for the most part uniform, in | |
| that the same data made available to one client at a given location | | that the same data made available to one client at a given location | |
|
| in the namespace is made availably to all clients at that location. | | in the namespace is made available to all clients at that location. | |
| There are however facilities provided which allow different client to | | There are however facilities provided which allow different clients | |
| be directed to different sets of data, so as to adapt to such client | | to be directed to different sets of data, so as to adapt to such | |
| characteristics as cpu architecture. | | client characteristics as CPU architecture. | |
| | | | |
| 11.5. Additional Client-side Considerations | | 11.5. Additional Client-side Considerations | |
| | | | |
| When clients make use of servers that implement referrals, | | When clients make use of servers that implement referrals, | |
| replication, and migration, care should be taken so that a user who | | replication, and migration, care should be taken so that a user who | |
| mounts a given file system that includes a referral or a relocated | | mounts a given file system that includes a referral or a relocated | |
|
| file system continue to see a coherent picture of that user-side file | | file system continues to see a coherent picture of that user-side | |
| system despite the fact that it contains a number of server-side file | | file system despite the fact that it contains a number of server-side | |
| systems which may be on different servers. | | file systems which may be on different servers. | |
| | | | |
| One important issue is upward navigation from the root of a server- | | One important issue is upward navigation from the root of a server- | |
|
| side file system to its parent (specified as ".." in UNIX). The | | side file system to its parent (specified as ".." in UNIX), in the | |
| client needs to determine when it hits an fsid root going up the file | | case in which it transitions to that filesystem as a result of | |
| tree. When at such a point, and needs to ascend to the parent, it | | referral, migration, or a transition as a result of replication. | |
| must do so locally instead of sending a LOOKUPP call to the server. | | When at such a point, and it needs to ascend to the parent, it must | |
| The LOOKUPP would normally return the ancestor of the target file | | go back to the parent as seen within the multi-server namespace | |
| system on the target server, which may not be part of the space that | | rather issuing a LOOKUPP call to the server, which would result in | |
| the client mounted. | | the parent within that server's single-server namespace. In order to | |
| | | do this, the client needs to remember the filehandles that represent | |
| A related issue is upward navigation from named attribute | | such filesystem roots, and use these instead of issuing a LOOKUPP to | |
| directories. The named attribute directories are essentially | | the current server. This will allow the client to present to | |
| detached from the namespace and this property should be safely | | applications a consistent namespace, where upward navigation and | |
| represented in the client operating environment. LOOKUPP on a named | | downward navigation are consistent. | |
| attribute directory may return the filehandle of the associated file | | | |
| and conveying this to applications might be unsafe as many | | | |
| applications expect the parent of a directory to be a directory by | | | |
| itself. Therefore the client may want to hide the parent of named | | | |
| attribute directories (represented as ".." in UNIX) or represent the | | | |
| named attribute directory as its own parent (as typically done for | | | |
| the file system root directory in UNIX) | | | |
| | | | |
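| | | One possible shape for the client-side handling described above is | |
| | | sketched below in C: the client keeps a table of the filehandles | |
| | | that are roots of server-side file systems it has crossed into, and | |
| | | answers ".." from that table instead of sending LOOKUPP. All names | |
| | | here are placeholders, not part of the protocol. | |
| | | | |
| | |    #include <stddef.h> | |
| | | | |
| | |    struct filehandle;                   /* opaque to this sketch    */ | |
| | | | |
| | |    struct mount_node { | |
| | |        const struct filehandle *fh;        /* root of server-side fs */ | |
| | |        const struct filehandle *parent_fh; /* parent as seen in the  */ | |
| | |                                            /* multi-server namespace */ | |
| | |    }; | |
| | | | |
| | |    extern const struct mount_node * | |
| | |    lookup_mount_node(const struct filehandle *fh); | |
| | |    extern const struct filehandle * | |
| | |    send_lookupp(const struct filehandle *fh); | |
| | | | |
| | |    static const struct filehandle * | |
| | |    resolve_dotdot(const struct filehandle *dir) | |
| | |    { | |
| | |        const struct mount_node *m = lookup_mount_node(dir); | |
| | | | |
| | |        if (m != NULL) | |
| | |            /* At a remembered file system root: answer locally.     */ | |
| | |            return m->parent_fh; | |
| | | | |
| | |        /* Ordinary directory: LOOKUPP on the current server is fine.*/ | |
| | |        return send_lookupp(dir); | |
| | |    } | |
| | | | |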
| Another issue concerns refresh of referral locations. When referrals | | Another issue concerns refresh of referral locations. When referrals | |
| are used extensively, they may change as server configurations | | are used extensively, they may change as server configurations | |
| change. It is expected that clients will cache information related | | change. It is expected that clients will cache information related | |
| to traversing referrals so that future client side requests are | | to traversing referrals so that future client side requests are | |
| resolved locally without server communication. This is usually | | resolved locally without server communication. This is usually | |
| rooted in client-side name lookup caching. Clients should | | rooted in client-side name lookup caching. Clients should | |
| periodically purge this data for referral points in order to detect | | periodically purge this data for referral points in order to detect | |
|
| changes in location information. When the change attribute changes | | changes in location information. When the change_policy attribute | |
| for directories that hold referral entries or for the referral | | changes for directories that hold referral entries or for the | |
| entries themselves, clients should consider any associated cached | | referral entries themselves, clients should consider any associated | |
| referral information to be out of date. | | cached referral information to be out of date. | |
| | | | |
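| | | A minimal illustration of such revalidation, assuming a cached | |
| | | referral entry that records the change_policy value observed when it | |
| | | was created, might look as follows; the types and the fetch helper | |
| | | are assumptions of the sketch, and the value is treated simply as | |
| | | something to compare. | |
| | | | |
| | |    #include <stdbool.h> | |
| | |    #include <stdint.h> | |
| | | | |
| | |    struct referral_cache_entry { | |
| | |        bool     valid; | |
| | |        uint64_t change_policy;  /* value seen when entry was cached */ | |
| | |    }; | |
| | | | |
| | |    extern uint64_t fetch_change_policy(const char *directory); | |
| | | | |
| | |    static void revalidate_referral(struct referral_cache_entry *e, | |
| | |                                    const char *directory) | |
| | |    { | |
| | |        uint64_t now = fetch_change_policy(directory); | |
| | | | |
| | |        if (!e->valid || now != e->change_policy) { | |
| | |            /* Location information may have changed: discard and    */ | |
| | |            /* refetch fs_locations / fs_locations_info as needed.   */ | |
| | |            e->valid = false; | |
| | |            e->change_policy = now; | |
| | |        } | |
| | |    } | |
| | | | |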
| 11.6. Effecting File System Transitions | | 11.6. Effecting File System Transitions | |
| | | | |
| Transitions between file system instances, whether due to switching | | Transitions between file system instances, whether due to switching | |
|
| between replicas upon server unavailability, or in response to a | | between replicas upon server unavailability, or in response to | |
| server-initiated migration events are best dealt with together. Even | | server-initiated migration events are best dealt with together. This | |
| though the prototypical use cases of replication and migration | | is so even though for the server pragmatic considerations will | |
| contain distinctive sets of features, when all possibilities for | | normally force different implementation strategies for planned and | |
| these operations are considered, the underlying unity of these | | unplanned transitions. Even though the prototypical use cases of | |
| operations, from the client's point of view is clear, even though for | | replication and migration contain distinctive sets of features, when | |
| the server pragmatic considerations will normally force different | | all possibilities for these operations are considered, there is an | |
| implementation strategies for planned and unplanned transitions. | | underlying unity of these operations, from the client's point of | |
| | | view, that makes treating them together desirable. | |
| | | | |
| A number of methods are possible for servers to replicate data and to | | A number of methods are possible for servers to replicate data and to | |
| track client state in order to allow clients to transition between | | track client state in order to allow clients to transition between | |
| file system instances with a minimum of disruption. Such methods | | file system instances with a minimum of disruption. Such methods | |
| vary between those that use inter-server clustering techniques to | | vary between those that use inter-server clustering techniques to | |
| limit the changes seen by the client, to those that are less | | limit the changes seen by the client, to those that are less | |
| aggressive, use more standard methods of replicating data, and impose | | aggressive, use more standard methods of replicating data, and impose | |
| a greater burden on the client to adapt to the transition. | | a greater burden on the client to adapt to the transition. | |
| | | | |
| The NFSv4.1 protocol does not impose choices on clients and servers | | The NFSv4.1 protocol does not impose choices on clients and servers | |
| | | | |
| skipping to change at page 207, line 9 | | skipping to change at page 209, line 27 | |
| types. Two file systems that belong to such a class share some | | types. Two file systems that belong to such a class share some | |
| important aspect of file system behavior that clients may depend upon | | important aspect of file system behavior that clients may depend upon | |
| when present, to easily effect a seamless transition between file | | when present, to easily effect a seamless transition between file | |
| system instances. Conversely, where the file systems do not belong | | system instances. Conversely, where the file systems do not belong | |
| to such a common class, the client has to deal with various sorts of | | to such a common class, the client has to deal with various sorts of | |
| implementation discontinuities which may cause performance or other | | implementation discontinuities which may cause performance or other | |
| issues in effecting a transition. | | issues in effecting a transition. | |
| | | | |
| Where the fs_locations_info attribute is available, such file system | | Where the fs_locations_info attribute is available, such file system | |
| classification data will be made directly available to the client. | | classification data will be made directly available to the client. | |
|
| See Section 11.10 for details. When only fs_locations is available, | | See Section 11.9 for details. When only fs_locations is available, | |
| default assumptions with regard to such classifications have to be | | default assumptions with regard to such classifications have to be | |
|
| inferred. See Section 11.9 for details. | | inferred. See Section 11.8 for details. | |
| | | | |
| In cases in which one server is expected to accept opaque values from | | In cases in which one server is expected to accept opaque values from | |
|
| the client that originated from another server, it is a wise | | the client that originated from another server, the servers SHOULD | |
| implementation practice for the servers to encode the "opaque" values | | encode the "opaque" values in big endian octet order. If this is | |
| in big endian octet order. If this is done, servers acting as | | done, servers acting as replicas or immigrating file systems will be | |
| replicas or immigrating file systems will be able to parse values | | able to parse values like stateids, directory cookies, filehandles, | |
| like stateids, directory cookies, filehandles, etc. even if their | | etc. even if their native octet order is different from that of other | |
| native octet order is different from that of other servers | | servers cooperating in the replication and migration of the file | |
| cooperating in the replication and migration of the file system. | | system. | |
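
A minimal sketch of the recommended encoding, assuming a server whose directory cookies are 64-bit integers (the helper names are illustrative, not part of the protocol): the value is serialized in big endian octet order before being handed to clients, so that a cooperating server with a different native byte order can still reconstruct it.

   #include <stdint.h>
   #include <stdio.h>

   /* Serialize a 64-bit cookie in big endian (network) octet order. */
   static void cookie_to_wire(uint64_t cookie, unsigned char wire[8])
   {
       for (int i = 0; i < 8; i++)
           wire[i] = (unsigned char)(cookie >> (56 - 8 * i));
   }

   /* Reconstruct the cookie regardless of the host's native byte order. */
   static uint64_t cookie_from_wire(const unsigned char wire[8])
   {
       uint64_t cookie = 0;
       for (int i = 0; i < 8; i++)
           cookie = (cookie << 8) | wire[i];
       return cookie;
   }

   int main(void)
   {
       unsigned char wire[8];
       uint64_t c = 0x0123456789abcdefULL;

       cookie_to_wire(c, wire);
       printf("round trip ok: %d\n", cookie_from_wire(wire) == c);
       return 0;
   }
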
| | | | |
| 11.6.1. File System Transitions and Simultaneous Access | | 11.6.1. File System Transitions and Simultaneous Access | |
| | | | |
| When a single file system may be accessed at multiple locations, | | When a single file system may be accessed at multiple locations, | |
| whether this is because of an indication of file system identity as | | whether this is because of an indication of file system identity as | |
| reported by the fs_locations or fs_locations_info attributes or | | reported by the fs_locations or fs_locations_info attributes or | |
| because two file system instances have corresponding locations on | | because two file system instances have corresponding locations on | |
| server addresses which connect to the same server as indicated by a | | server addresses which connect to the same server as indicated by a | |
| common so_major_id field in the eir_server_owner field returned by | | common so_major_id field in the eir_server_owner field returned by | |
| EXCHANGE_ID, the client will, depending on specific circumstances as | | EXCHANGE_ID, the client will, depending on specific circumstances as | |
| discussed below, either: | | discussed below, either: | |
| | | | |
|
| o Access multiple instances simultaneously, as representing | | o The client accesses multiple instances simultaneously, as | |
| alternate paths to the same data and metadata. | | representing alternate paths to the same data and metadata. | |
| | | | |
| o The client accesses one instance (or set of instances) and then | | o The client accesses one instance (or set of instances) and then | |
| transitions to an alternative instance (or set of instances) as a | | transitions to an alternative instance (or set of instances) as a | |
| result of network issues, server unresponsiveness, or server- | | result of network issues, server unresponsiveness, or server- | |
| directed migration. The transition may involve changes in | | directed migration. The transition may involve changes in | |
|
| filehandles, fileids, the change attribute, and or locking state, | | filehandles, fileids, the change attribute, and/or locking state, | |
| depending on the attributes of the source and destination file | | depending on the attributes of the source and destination file | |
| system instances, as specified in the fs_locations_info attribute. | | system instances, as specified in the fs_locations_info attribute. | |
| | | | |
|
| Which of these choices is possible, and how a transition is effected | | Which of these choices is possible, and how a transition is effected, | |
| is governed by equivalence classes of file system instances as | | is governed by equivalence classes of file system instances as | |
| reported by the fs_locations_info attribute, and, for file system | | reported by the fs_locations_info attribute, and, for file system | |
| instances in the same location within multiple single-server | | instances in the same location within multiple single-server | |
|
| namespace, by the so_major_id field in the eir_server_owner field | | namespaces, as indicated by the so_major_id field in the | |
| returned by EXCHANGE_ID. | | eir_server_owner field returned by EXCHANGE_ID. | |
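
The second of these tests can be sketched as follows; the structure is a simplified stand-in for the server_owner4 value returned by EXCHANGE_ID, with the opaque so_major_id held in a fixed-size buffer for brevity. Two network addresses are treated as reaching the same server only when their so_major_id values are identical as opaque octet strings.

   #include <stdbool.h>
   #include <string.h>

   /* Simplified stand-in for the eir_server_owner (server_owner4) value. */
   struct server_owner {
       unsigned int  so_major_id_len;
       unsigned char so_major_id[64];   /* opaque major identifier */
   };

   static bool same_server(const struct server_owner *a,
                           const struct server_owner *b)
   {
       return a->so_major_id_len == b->so_major_id_len &&
              memcmp(a->so_major_id, b->so_major_id,
                     a->so_major_id_len) == 0;
   }
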
| | | | |
| 11.6.2. Simultaneous Use and Transparent Transitions | | 11.6.2. Simultaneous Use and Transparent Transitions | |
| | | | |
| When two file system instances have the same location within their | | When two file system instances have the same location within their | |
|
| respective single-server namespaces and those two server IP addresses | | respective single-server namespaces and those two server network | |
| return the so_major_id value in the eir_server_owner value returned | | addresses return the same so_major_id value in the eir_server_owner | |
| in response to EXCHANGE_ID, those file systems instances can be | | value returned in response to EXCHANGE_ID, those file system | |
| treated as the same, and either used together simultaneously or | | instances can be treated as the same, and either used together | |
| serially with no transition activity required on the part of the | | simultaneously or serially with no transition activity required on | |
| client. | | the part of the client. In this case we refer to the transition as | |
| | | "transparent" and the client in transferring access from to the other | |
| | | is acting as it would in the event that communication is interrupted, | |
| | | with a new connection and possibly a new session being established to | |
| | | continue access to the same filesystem. | |
| | | | |
| Whether simultaneous use of the two file system instances is valid is | | Whether simultaneous use of the two file system instances is valid is | |
| controlled by whether the fs_locations_info attribute shows the two | | controlled by whether the fs_locations_info attribute shows the two | |
|
| instances as having the same _simultaneous-use_ class. | | instances as having the same _simultaneous-use_ class. See | |
| | | Section 11.9.1 for information about the definition of the various | |
| | | use classes, including the _simultaneous-use_ class. | |
| | | | |
| Note that for two such file systems, any information within the | | Note that for two such file systems, any information within the | |
| fs_locations_info attribute that indicates the need for special | | fs_locations_info attribute that indicates the need for special | |
| transition activity, i.e. the appearance of the two file system | | transition activity, i.e. the appearance of the two file system | |
|
| instances with different _handle_, _fileid_, _verifier_, _change_ | | instances with different _handle_, _fileid_, _write-verifier_, | |
| classes, MUST be ignored by the client. The server SHOULD not | | _change_, _readdir_ classes, indicates a serious problem and the | |
| indicate that these instances belong to different _handle_, _fileid_, | | client, if it allows transition to the filesystem instance at all, | |
| _verifier_, _change_ classes, whether the two instances are shown | | must not treat this as a transparent transition. The server SHOULD | |
| belonging to the same _simultaneous-use_ class or not. | | NOT indicate that these instances belong to different _handle_, | |
| | | _fileid_, _write-verifier_, _change_, _readdir_ classes, whether the | |
| | | two instances are shown belonging to the same _simultaneous-use_ | |
| | | class or not. | |
| | | | |
| Where these conditions do not apply, a non-transparent file system | | Where these conditions do not apply, a non-transparent file system | |
| instance transition is required with the details depending on the | | instance transition is required with the details depending on the | |
|
| respective _handle_, _fileid_, _verifier_, _change_ classes of the | | respective _handle_, _fileid_, _verifier_, _change_, _readdir_ | |
| two file system instances and whether the two servers in question | | classes of the two file system instances and whether the two servers | |
| have the same eir_server_scope value as reported by EXCHANGE_ID. | | in question have the same eir_server_scope value as reported by | |
| | | EXCHANGE_ID. | |
| | | | |
| 11.6.2.1. Simultaneous Use of File System Instances | | 11.6.2.1. Simultaneous Use of File System Instances | |
| | | | |
| When the conditions above hold, in either of the following two cases, | | When the conditions above hold, in either of the following two cases, | |
| the client may use the two file system instances simultaneously. | | the client may use the two file system instances simultaneously. | |
| | | | |
|
| o The fs_locations_info attribute does not contain separate per-IP | | o The fs_locations_info attribute does not contain separate per- | |
| address entries for file systems instances at the distinct IP | | network-address entries for file system instances at the distinct | |
| addresses. This includes the case in which the fs_locations_info | | network addresses. This includes the case in which the | |
| attribute is unavailable. | | fs_locations_info attribute is unavailable. In this case, the | |
| | | fact that the eir_server_owner values share an so_major_id value | |
| | | justifies simultaneous use, and there is no fs_locations_info | |
| | | attribute information contradicting that. | |
| | | | |
| o The fs_locations_info attribute indicates that two file system | | o The fs_locations_info attribute indicates that two file system | |
| instances belong to the same _simultaneous-use_ class. | | instances belong to the same _simultaneous-use_ class. | |
| | | | |
| In this case, the client may use both file system instances | | In this case, the client may use both file system instances | |
| simultaneously, as representations of the same file system, whether | | simultaneously, as representations of the same file system, whether | |
|
| that happens because the two IP addresses connect to the same | | that happens because the two network addresses connect to the same | |
| physical server or because different servers connect to clustered | | physical server or because different servers connect to clustered | |
| file systems and export their data in common. When simultaneous use | | file systems and export their data in common. When simultaneous use | |
| is in effect, any change made to one file system instance must be | | is in effect, any change made to one file system instance must be | |
| immediately reflected in the other file system instance(s). Locks | | immediately reflected in the other file system instance(s). Locks | |
| are treated as part of a common lease, associated with a common | | are treated as part of a common lease, associated with a common | |
| client ID. Depending on the details of the eir_server_owner returned | | client ID. Depending on the details of the eir_server_owner returned | |
| by EXCHANGE_ID, the two server instances may be accessed by different | | by EXCHANGE_ID, the two server instances may be accessed by different | |
| sessions or a single session in common. | | sessions or a single session in common. | |
| | | | |
| 11.6.2.2. Transparent File System Transitions | | 11.6.2.2. Transparent File System Transitions | |
| | | | |
| When the conditions above hold and the fs_locations_info attribute | | When the conditions above hold and the fs_locations_info attribute | |
|
| explicitly shows the file system instances for these distinct IP | | explicitly shows the file system instances for these distinct network | |
| addresses as belonging to different _simultaneous-use_ classes, the | | addresses as belonging to different _simultaneous-use_ classes, the | |
| file system instances should not be used by the client | | file system instances should not be used by the client | |
| simultaneously, but rather serially with one being used unless and | | simultaneously, but rather serially with one being used unless and | |
| until communication difficulties, lack of responsiveness, or an | | until communication difficulties, lack of responsiveness, or an | |
| explicit migration event causes another file system instance (or set | | explicit migration event causes another file system instance (or set | |
|
| of file system instances sharing a common _simultaneous-use_ class to | | of file system instances sharing a common _simultaneous-use_ class) | |
| be used. | | to be used. | |
| | | | |
|
| When a change in file system instance is to be done, the client will | | When a change of file system instance is to be done, the client will | |
| use the same client ID already in effect. If it already has | | use the same client ID already in effect. If it already has | |
| connections to the new server address, these will be used. Otherwise | | connections to the new server address, these will be used. Otherwise | |
| new connections to existing sessions or new sessions associated with | | new connections to existing sessions or new sessions associated with | |
| the existing client ID are established as indicated by the | | the existing client ID are established as indicated by the | |
| eir_server_owner returned by EXCHANGE_ID. | | eir_server_owner returned by EXCHANGE_ID. | |
| | | | |
| In all such transparent transition cases, the following apply: | | In all such transparent transition cases, the following apply: | |
| | | | |
| o File handles stay the same if persistent and if volatile are only | | o File handles stay the same if persistent and if volatile are only | |
| subject to expiration, if they would be in the absence of file | | subject to expiration, if they would be in the absence of file | |
| | | | |
| skipping to change at page 209, line 44 | | skipping to change at page 212, line 27 | |
| | | | |
| o Fileid values do not change across the transition. | | o Fileid values do not change across the transition. | |
| | | | |
| o The file system will have the same fsid in both the old and new | | o The file system will have the same fsid in both the old and new | |
| locations. | | locations. | |
| | | | |
| o Change attribute values are consistent across the transition and | | o Change attribute values are consistent across the transition and | |
| do not have to be refetched. When change attributes indicate that | | do not have to be refetched. When change attributes indicate that | |
| a cached object is still valid, it can remain cached. | | a cached object is still valid, it can remain cached. | |
| | | | |
|
| o Client, and state identifier retain their validity across the | | o Client and state identifiers retain their validity across the | |
| transition, except where their staleness is recognized and | | transition, except where their staleness is recognized and | |
| reported by the new server. Except where such staleness requires | | reported by the new server. Except where such staleness requires | |
|
| it, no lock reclamation is needed. | | it, no lock reclamation is needed. Any such staleness is an | |
| | | indication that the server should be considered to have rebooted | |
| | | and is reported as discussed in Section 8.4.2. | |
| | | | |
| o Write verifiers are presumed to retain their validity and can be | | o Write verifiers are presumed to retain their validity and can be | |
|
| presented to COMMIT, with the expectation that if COMMIT on the | | used to compare with verifiers returned by COMMIT on the new | |
| new server accept them as valid, then that server has all of the | | server, with the expectation that if COMMIT on the new server | |
| | | returns an identical verifier, then that server has all of the | |
| data unstably written to the original server and has committed it | | data unstably written to the original server and has committed it | |
| to stable storage as requested. | | to stable storage as requested. | |
| | | | |
|
| | | o Readdir cookies are presumed to retain their validity and can be | |
| | | presented to subsequent READDIR requests together with the readdir | |
| | | verifier with which they are associated. When the verifier is | |
| | | accepted as valid, the cookie will continue the READDIR operation | |
| | | so that the entire directory can be obtained by the client. | |
| | | | |
| 11.6.3. Filehandles and File System Transitions | | 11.6.3. Filehandles and File System Transitions | |
| | | | |
| There are a number of ways in which filehandles can be handled across | | There are a number of ways in which filehandles can be handled across | |
| a file system transition. These can be divided into two broad | | a file system transition. These can be divided into two broad | |
| classes depending upon whether the two file systems across which the | | classes depending upon whether the two file systems across which the | |
| transition happens share sufficient state to effect some sort of | | transition happens share sufficient state to effect some sort of | |
| continuity of file system handling. | | continuity of file system handling. | |
| | | | |
| When there is no such co-operation in filehandle assignment, the two | | When there is no such co-operation in filehandle assignment, the two | |
| file systems are reported as being in different _handle_ classes. In | | file systems are reported as being in different _handle_ classes. In | |
| this case, all filehandles are assumed to expire as part of the file | | this case, all filehandles are assumed to expire as part of the file | |
| system transition. Note that this behavior does not depend on | | system transition. Note that this behavior does not depend on | |
| fh_expire_type attribute and supersedes the specification of | | fh_expire_type attribute and supersedes the specification of | |
| FH4_VOL_MIGRATION bit, which only affects behavior when | | FH4_VOL_MIGRATION bit, which only affects behavior when | |
| fs_locations_info is not available. | | fs_locations_info is not available. | |
| | | | |
| When there is co-operation in filehandle assignment, the two file | | When there is co-operation in filehandle assignment, the two file | |
| systems are reported as being in the same _handle_ classes. In this | | systems are reported as being in the same _handle_ classes. In this | |
|
| case, persistent filehandle remain valid after the file system | | case, persistent filehandles remain valid after the file system | |
| transition, while volatile filehandles (excluding those which are | | transition, while volatile filehandles (excluding those which are | |
| only volatile due to the FH4_VOL_MIGRATION bit) are subject to | | only volatile due to the FH4_VOL_MIGRATION bit) are subject to | |
| expiration on the target server. | | expiration on the target server. | |
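
The filehandle rule can be summarized in a small decision function, sketched below with hypothetical client-side types: different _handle_ classes invalidate every cached filehandle, while within a common class only volatile filehandles remain subject to expiration on the target.

   #include <stdbool.h>

   /* Hypothetical record for one cached filehandle. */
   struct cached_fh {
       bool persistent;   /* persistent (vs. volatile) filehandle        */
       bool valid;        /* whether the cached handle may still be used */
   };

   static void fh_after_transition(struct cached_fh *fh,
                                   bool same_handle_class)
   {
       if (!same_handle_class) {
           fh->valid = false;   /* all filehandles expire with the transition */
       } else if (!fh->persistent) {
           /* Volatile handles stay usable but remain subject to expiration
            * on the target server; be prepared for NFS4ERR_FHEXPIRED. */
       }
       /* Persistent handles in the same _handle_ class remain valid. */
   }
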
| | | | |
|
| 11.6.4. Fileid's and File System Transitions | | 11.6.4. Fileids and File System Transitions | |
| | | | |
|
| In NFSv4.0, the issue of continuity of fileid's in the event of a | | In NFSv4.0, the issue of continuity of fileids in the event of a file | |
| file system transition was not addressed. The general expectation | | system transition was not addressed. The general expectation had | |
| had been that in situations in which the two file system instances | | been that in situations in which the two file system instances are | |
| are created by a single vendor using some sort of file system image | | created by a single vendor using some sort of file system image copy, | |
| copy, fileid's will be consistent across the transition while in the | | fileids will be consistent across the transition while in the | |
| analogous multi-vendor transitions they will not. This poses | | analogous multi-vendor transitions they will not. This poses | |
| difficulties, especially for the client without special knowledge of | | difficulties, especially for the client without special knowledge of | |
|
| the of the transition mechanisms adopted by the server. | | the transition mechanisms adopted by the server. Note that although | |
| | | fileid is not a mandatory attribute, many servers provide it and | |
| | | many clients provide APIs that depend on it. | |
| | | | |
| It is important to note that while clients themselves may have no | | It is important to note that while clients themselves may have no | |
| trouble with a fileid changing as a result of a file system | | trouble with a fileid changing as a result of a file system | |
| transition event, applications do typically have access to the fileid | | transition event, applications do typically have access to the fileid | |
| (e.g. via stat), and the result of this is that an application may | | (e.g. via stat), and the result of this is that an application may | |
| work perfectly well if there is no file system instance transition or | | work perfectly well if there is no file system instance transition or | |
| if any such transition is among instances created by a single vendor, | | if any such transition is among instances created by a single vendor, | |
| yet be unable to deal with the situation in which a multi-vendor | | yet be unable to deal with the situation in which a multi-vendor | |
| transition occurs, at the wrong time. | | transition occurs, at the wrong time. | |
| | | | |
|
| Providing the same fileid's in a multi-vendor (multiple server | | Providing the same fileids in a multi-vendor (multiple server | |
| vendors) environment has generally been held to be quite difficult. | | vendors) environment has generally been held to be quite difficult. | |
|
| | | | |
| While there is work to be done, it needs to be pointed out that this | | While there is work to be done, it needs to be pointed out that this | |
| difficulty is partly self-imposed. Servers have typically identified | | difficulty is partly self-imposed. Servers have typically identified | |
| fileid with inode number, i.e. with a quantity used to find the file | | fileid with inode number, i.e. with a quantity used to find the file | |
| in question. This identification poses special difficulties for | | in question. This identification poses special difficulties for | |
|
| migration of an fs between vendors where assigning the same index to | | migration of a filesystem between vendors where assigning the same | |
| a given file may not be possible. Note here that a fileid does not | | index to a given file may not be possible. Note here that a fileid | |
| require that it be useful to find the file in question, only that it | | is not required to be useful to find the file in question, only that | |
| is unique within the given fs. Servers prepared to accept a fileid | | it is unique within the given filesystem. Servers prepared to accept | |
| as a single piece of metadata and store it apart from the value used | | a fileid as a single piece of metadata and store it apart from the | |
| to index the file information can relatively easily maintain a fileid | | value used to index the file information can relatively easily | |
| value across a migration event, allowing a truly transparent | | maintain a fileid value across a migration event, allowing a truly | |
| migration event. | | transparent migration event. | |
| | | | |
| In any case, where servers can provide continuity of fileids, they | | In any case, where servers can provide continuity of fileids, they | |
|
| should and the client should be able to find out that such continuity | | should, and the client should be able to find out that such | |
| is available, and take appropriate action. Information about the | | continuity is available and take appropriate action. Information | |
| continuity (or lack thereof) of fileid's across a file system is | | about the continuity (or lack thereof) of fileids across a file | |
| represented by specifying whether the file systems in question are of | | system transition is represented by specifying whether the file | |
| the same _fileid_ class. | | systems in question are of the same _fileid_ class. | |
| | | | |
| | | Note that when consistent fileids do not exist across a transition | |
| | | (either because there is no continuity of fileids or because fileid | |
| | | is not a supported attribute on one of the instances involved), and | |
| | | there are no reliable filehandles across a transition event (either | |
| | | because there is no filehandle continuity or because the filehandles | |
| | | are volatile), the client is in a position where it cannot verify | |
| | | that files it was accessing before the transition are the same | |
| | | objects. It is forced to assume that no object has been renamed, | |
| | | and, unless there are guarantees that this is so (e.g. the | |
| | | filesystem is read-only), problems for applications may occur. | |
| | | Therefore, use of such configurations should be limited to | |
| | | situations where the problems that this may cause can be tolerated. | |
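
The client-side consequence can be expressed as a small check, sketched here with hypothetical parameters: a fileid comparison is meaningful only when the two instances are reported as being in the same _fileid_ class; otherwise the identity of an object across the transition simply cannot be verified by this means.

   #include <stdbool.h>
   #include <stdint.h>

   /* Can the object reached after the transition be taken to be the same
    * file the application was using before it?  Returns false whenever
    * that cannot be verified via fileids. */
   static bool same_object_after_transition(bool same_fileid_class,
                                            uint64_t fileid_before,
                                            uint64_t fileid_after)
   {
       if (!same_fileid_class)
           return false;       /* fileids are not comparable across classes */
       return fileid_before == fileid_after;
   }
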
| | | | |
| 11.6.5. Fsids and File System Transitions | | 11.6.5. Fsids and File System Transitions | |
| | | | |
|
| Since fsids are only unique within a per-server basis, it is to be | | Since fsids are generally only unique on a per-server basis, it | |
| expected that they will change during a file system transition. | | is likely that they will change during a file system transition. One | |
| Clients should not make the fsid's received from the server visible | | exception is the case of transparent transitions, but in that case we | |
| to application since they may not be globally unique, and because | | have multiple network addresses that are defined as the same server | |
| they may change during a file system transition event. Applications | | (as specified by so_major_id field of eir_server_owner). Clients | |
| are best served if they are isolated from such transitions to the | | should not make the fsids received from the server visible to | |
| extent possible. | | applications since they may not be globally unique, and because they | |
| | | may change during a file system transition event. Applications are | |
| | | best served if they are isolated from such transitions to the extent | |
| | | possible. | |
| | | | |
| | | Although normally a single source filesystem will transition to a | |
| | | single target filesystem, there is a provision for splitting a single | |
| | | source filesystem into multiple target filesystems, by specifying the | |
| | | FSLI4F_MULTI_FS flag. | |
| | | | |
| | | 11.6.5.1. File System Splitting | |
| | | | |
| When a file system transition is made and the fs_locations_info | | When a file system transition is made and the fs_locations_info | |
|
| indicates that file system in question may be split into multiple | | indicates that the file system in question may be split into multiple | |
| file systems (via the FSLI4F_MULTI_FS flag), client should do | | file systems (via the FSLI4F_MULTI_FS flag), the client SHOULD do | |
| GETATTR's on all known objects within the file system undergoing | | GETATTRs of the fsid attribute on all known objects within the file | |
| transition, to determine the new file system boundaries. Clients may | | system undergoing transition in order to determine the new file | |
| maintain the fsid's passed to existing applications by mapping all of | | system boundaries. | |
| the fsid for the descendent file systems to a the common fsid used | | | |
| for the original file system. | | Clients may maintain the fsids passed to existing applications by | |
| | | mapping all of the fsids for the descendent file systems to the | |
| | | common fsid used for the original file system. | |
| | | | |
| | | Splitting a filesystem may be done on a transition between | |
| | | filesystems of the same _fileid_ class, since the fact that fileids | |
| | | are unique within the source filesystem ensures that they will be | |
| | | unique in each of the target filesystems. | |
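
One way a client might keep the fsid seen by applications stable across such a split, sketched with hypothetical structures: each descendant fsid discovered after the transition is mapped back to the original, application-visible fsid.

   #include <stdint.h>

   struct fsid { uint64_t major, minor; };

   /* Hypothetical mapping from a descendant file system's fsid (seen
    * after an FSLI4F_MULTI_FS split) to the fsid already presented to
    * applications for the original file system. */
   struct fsid_map {
       struct fsid descendant;
       struct fsid presented;
   };

   static struct fsid present_fsid(const struct fsid_map *map, int n,
                                   struct fsid server_fsid)
   {
       for (int i = 0; i < n; i++)
           if (map[i].descendant.major == server_fsid.major &&
               map[i].descendant.minor == server_fsid.minor)
               return map[i].presented;
       return server_fsid;   /* not part of a split: pass through unchanged */
   }
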
| | | | |
| 11.6.6. The Change Attribute and File System Transitions | | 11.6.6. The Change Attribute and File System Transitions | |
| | | | |
| Since the change attribute is defined as a server-specific one, | | Since the change attribute is defined as a server-specific one, | |
| change attributes fetched from one server are normally presumed to be | | change attributes fetched from one server are normally presumed to be | |
| invalid on another server. Such a presumption is troublesome since | | invalid on another server. Such a presumption is troublesome since | |
| it would invalidate all cached change attributes, requiring | | it would invalidate all cached change attributes, requiring | |
| refetching. Even more disruptive, the absence of any assured | | refetching. Even more disruptive, the absence of any assured | |
| continuity for the change attribute means that even if the same value | | continuity for the change attribute means that even if the same value | |
| is gotten on refetch, no conclusions can be drawn as to whether the | | is gotten on refetch, no conclusions can be drawn as to whether the | |
| object in question has changed. The identical change attribute could | | object in question has changed. The identical change attribute could | |
|
| be merely an artifact, of a modified file with a different change | | be merely an artifact of a modified file with a different change | |
| attribute construction algorithm, with that new algorithm just | | attribute construction algorithm, with that new algorithm just | |
| happening to result in an identical change value. | | happening to result in an identical change value. | |
| | | | |
| When the two file systems have consistent change attribute formats, | | When the two file systems have consistent change attribute formats, | |
| and this fact is communicated to the client by reporting them as in the | | and this fact is communicated to the client by reporting them as in the | |
| same _change_ class, the client may assume a continuity of change | | same _change_ class, the client may assume a continuity of change | |
| attribute construction and handle this situation just as it would be | | attribute construction and handle this situation just as it would be | |
| handled without any file system transition. | | handled without any file system transition. | |
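
A sketch of the resulting caching decision (parameter names are illustrative): cached change attribute values are only comparable across the transition when the two instances share a _change_ class; otherwise the attribute, and any cached data whose validity rests on it, must be refetched.

   #include <stdbool.h>
   #include <stdint.h>

   static bool cache_still_valid_after_transition(bool same_change_class,
                                                  uint64_t cached_change,
                                                  uint64_t change_on_target)
   {
       if (!same_change_class)
           return false;   /* change values are not comparable: refetch */
       return cached_change == change_on_target;
   }
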
| | | | |
| 11.6.7. Lock State and File System Transitions | | 11.6.7. Lock State and File System Transitions | |
| | | | |
| In a file system transition, the client needs to handle cases in | | In a file system transition, the client needs to handle cases in | |
| which the two servers have cooperated in state management and in | | which the two servers have cooperated in state management and in | |
| which they have not. Cooperation by two servers in state management | | which they have not. Cooperation by two servers in state management | |
|
| requires coordination of clientids. Before the client attempts to | | requires coordination of client IDs. Before the client attempts to | |
| use a client ID associated with one server in a request to the server | | use a client ID associated with one server in a request to the server | |
| of the other file system, it must eliminate the possibility that two | | of the other file system, it must eliminate the possibility that two | |
| non-cooperating servers have assigned the same client ID by accident. | | non-cooperating servers have assigned the same client ID by accident. | |
| The client needs to compare the eir_server_scope values returned by | | The client needs to compare the eir_server_scope values returned by | |
| each server. If the scope values do not match, then the servers have | | each server. If the scope values do not match, then the servers have | |
| not cooperated in state management. If the scope values match, then | | not cooperated in state management. If the scope values match, then | |
|
| this indicates the servers have cooperated in assigning clientids to | | this indicates the servers have cooperated in assigning client IDs to | |
| the point that they will reject clientids that refer to state they do | | the point that they will reject client IDs that refer to state they | |
| not know about. | | do not know about. | |
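
This test might be sketched as follows, with a simplified stand-in for the opaque eir_server_scope value: matching scope strings permit the client to present its existing client ID to the destination, while differing strings require establishing a new client ID (and reclaiming locks where possible).

   #include <string.h>

   enum transition_action {
       PRESENT_EXISTING_CLIENT_ID,   /* scopes match: IDs are coordinated   */
       ESTABLISH_NEW_CLIENT_ID       /* scopes differ: do not reuse old IDs */
   };

   /* Simplified stand-in for the opaque eir_server_scope value. */
   struct server_scope {
       unsigned int  len;
       unsigned char value[64];
   };

   static enum transition_action choose_action(const struct server_scope *src,
                                               const struct server_scope *dst)
   {
       if (src->len == dst->len &&
           memcmp(src->value, dst->value, src->len) == 0)
           return PRESENT_EXISTING_CLIENT_ID;
       return ESTABLISH_NEW_CLIENT_ID;
   }
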
| | | | |
| In the case of migration, the servers involved in the migration of a | | In the case of migration, the servers involved in the migration of a | |
| file system SHOULD transfer all server state from the original to the | | file system SHOULD transfer all server state from the original to the | |
|
| new server. When this done, it must be done in a way that is | | new server. When this is done, it must be done in a way that is | |
| transparent to the client. With replication, such a degree of common | | transparent to the client. With replication, such a degree of common | |
| state is typically not the case. Clients, however, should use the | | state is typically not the case. Clients, however, should use the | |
| information provided by the eir_server_scope returned by EXCHANGE_ID | | information provided by the eir_server_scope returned by EXCHANGE_ID | |
| to determine whether such sharing may be in effect, rather than | | to determine whether such sharing may be in effect, rather than | |
| making assumptions based on the reason for the transition. | | making assumptions based on the reason for the transition. | |
| | | | |
| This state transfer will reduce disruption to the client when a file | | This state transfer will reduce disruption to the client when a file | |
|
| system transition If the servers are successful in transferring all | | system transition occurs. If the servers are successful in | |
| state, the client can attempt to establish sessions associated with | | transferring all state, the client can attempt to establish sessions | |
| the client ID used for the source file system instance. If the | | associated with the client ID used for the source file system | |
| server accepts that as a valid client ID, then the client may used | | instance. If the server accepts that as a valid client ID, then the | |
| the existing stateid's associated with that client ID for the old | | client may use the existing stateids associated with that client ID | |
| file system instance in connection with the that same client ID in | | for the old file system instance when using that same client ID in | |
| connection with the file system instance. | | connection with the transitioned file system instance. | |
| | | | |
|
| When the two servers belong to the same server scope, it does | | When the two servers belong to the same server scope, it does not | |
| necessarily mean that when dealing with the transition, the client | | mean that when dealing with the transition, the client will not have | |
| will not have to reclaim state. However it does mean that the client | | to reclaim state. However it does mean that the client may proceed | |
| may proceed using his current client ID when establishing | | using his current client ID when establishing communication with the | |
| communication with the new server and the new server will either | | new server and the new server will either recognize the client ID as | |
| recognize the client ID as valid, or reject it, in which case locks | | valid, or reject it, in which case locks must be reclaimed by the | |
| must be reclaimed by the client. | | client. | |
| | | | |
| File systems co-operating in state management may actually share | | File systems co-operating in state management may actually share | |
| state or simply divide the id space so as to recognize (and reject as | | state or simply divide the id space so as to recognize (and reject as | |
|
| stale) each others state and clients id's. Servers which do share | | stale) each other's stateids and client IDs. Servers which do share | |
| state may not do so under all conditions or at all times. The | | state may not do so under all conditions or at all times. The | |
| requirement for the server is that if it cannot be sure in accepting | | requirement for the server is that if it cannot be sure in accepting | |
| a client ID that it reflects the locks the client was given, it must | | a client ID that it reflects the locks the client was given, it must | |
| treat all associated state as stale and report it as such to the | | treat all associated state as stale and report it as such to the | |
| client. | | client. | |
| | | | |
|
| When the two file systems instances are on servers that do not share | | When the two file system instances are on servers that do not share a | |
| a server scope value the client must establish a new client ID on the | | server scope value the client must establish a new client ID on the | |
| destination, if it does not have one already and reclaim if possible. | | destination, if it does not have one already, and reclaim locks if | |
| In this case, old stateids and client ID's should not be presented to | | possible. In this case, old stateids and client IDs should not be | |
| the new server since there is no assurance that they will not | | presented to the new server since there is no assurance that they | |
| conflict with IDs valid on that server. | | will not conflict with IDs valid on that server. | |
| | | | |
| In either case, when actual locks are not known to be maintained, the | | In either case, when actual locks are not known to be maintained, the | |
| destination server may establish a grace period specific to the given | | destination server may establish a grace period specific to the given | |
| file system, with non-reclaim locks being rejected for that file | | file system, with non-reclaim locks being rejected for that file | |
| system, even though normal locks are being granted for other file | | system, even though normal locks are being granted for other file | |
| systems. Clients should not infer the absence of a grace period for | | systems. Clients should not infer the absence of a grace period for | |
| file systems being transitioned to a server from responses to | | file systems being transitioned to a server from responses to | |
| requests for other file systems. | | requests for other file systems. | |
| | | | |
| In the case of lock reclamation for a given file system after a file | | In the case of lock reclamation for a given file system after a file | |
| system transition, edge conditions can arise similar to those for | | system transition, edge conditions can arise similar to those for | |
| reclaim after server reboot (although in the case of the planned | | reclaim after server reboot (although in the case of the planned | |
| state transfer associated with migration, these can be avoided by | | state transfer associated with migration, these can be avoided by | |
|
| securely recording lock state as part of state migration. Where the | | securely recording lock state as part of state migration). Unless | |
| destination server cannot guarantee that locks will not be | | the destination server can guarantee that locks will not be | |
| incorrectly granted, the destination server should not establish a | | incorrectly granted, the destination server should not allow lock | |
| file-system-specific grace period. | | reclaims and should avoid establishing a grace period. | |
| | | | |
| Once all locks have been reclaimed, or there were no locks to | | Once all locks have been reclaimed, or there were no locks to | |
| reclaim, the client indicates that there are no more reclaims to be | | reclaim, the client indicates that there are no more reclaims to be | |
| done for the filesystem in question by issuing a RECLAIM_COMPLETE | | done for the filesystem in question by issuing a RECLAIM_COMPLETE | |
|
| operation with the one_fs paraneter set to true. Once this has been | | operation with the one_fs parameter set to true. Once this has been | |
| done, non-reclaim locking operations may be done, and any subsequent | | done, non-reclaim locking operations may be done, and any subsequent | |
| request to do reclaims will be rejected with the error | | request to do reclaims will be rejected with the error | |
| NFS4ERR_NO_GRACE. | | NFS4ERR_NO_GRACE. | |
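
The per-file-system reclaim sequence can be sketched as control flow; reclaim_lock() and reclaim_complete_one_fs() below are purely hypothetical stand-ins for the compounds that reclaim individual locks and issue RECLAIM_COMPLETE with one_fs set to true.

   #include <stdbool.h>
   #include <stddef.h>

   struct lock_record;   /* client's record of a lock held before the move */

   /* Hypothetical wrappers around the reclaim-type OPEN/LOCK requests and
    * the RECLAIM_COMPLETE (rca_one_fs = true) request. */
   extern bool reclaim_lock(const struct lock_record *lr);
   extern void reclaim_complete_one_fs(void);

   static void reclaim_after_transition(const struct lock_record *locks[],
                                        size_t n)
   {
       for (size_t i = 0; i < n; i++)
           (void)reclaim_lock(locks[i]);   /* denials handled elsewhere */

       /* Tell the server reclaims for this file system are finished;
        * later reclaim attempts would get NFS4ERR_NO_GRACE. */
       reclaim_complete_one_fs();
   }
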
| | | | |
|
| Information about client identity that may be propagated between | | Information about client identity may be propagated between servers | |
| servers in the form of client_owner4 and associated verifiers, under | | in the form of client_owner4 and associated verifiers, under the | |
| the assumption that the client presents the same values to all the | | assumption that the client presents the same values to all the | |
| servers with which it deals. | | servers with which it deals. | |
| | | | |
| Servers are encouraged to provide facilities to allow locks to be | | Servers are encouraged to provide facilities to allow locks to be | |
| reclaimed on the new server after a file system transition. Often, | | reclaimed on the new server after a file system transition. Often, | |
| however, in cases in which the two servers do not share a server | | however, in cases in which the two servers do not share a server | |
| scope value, such facilities may not be available and the client should | | scope value, such facilities may not be available and the client should | |
| be prepared to re-obtain locks, even though it is possible that the | | be prepared to re-obtain locks, even though it is possible that the | |
| client may have his LOCK or OPEN request denied due to a conflicting | | client may have his LOCK or OPEN request denied due to a conflicting | |
|
| lock. In some environments, such as the transition between read-only | | lock. | |
| file systems, such denial of locks should not pose large difficulties | | | |
| in practice. When an attempt to re-establish a lock on a new server | | The consequences of having no facilities available to reclaim locks | |
| is denied, the client should treat the situation as if his original | | on the new server will depend on the type of environment. In some | |
| lock had been revoked. In all cases in which the lock is granted, | | environments, such as the transition between read-only file systems, | |
| the client cannot assume that no conflicting could have been granted | | such denial of locks should not pose large difficulties in practice. | |
| in the interim. Where change attribute continuity is present, the | | When an attempt to re-establish a lock on a new server is denied, the | |
| client may check the change attribute to check for unwanted file | | client should treat the situation as if his original lock had been | |
| | | revoked. Note that when the lock is granted, the client cannot | |
| | | assume that no conflicting lock could have been granted in the | |
| | | interim. Where change attribute continuity is present, the client | |
| | | may check the change attribute to check for unwanted file | |
| modifications. Where even this is not available, and the file system | | modifications. Where even this is not available, and the file system | |
| is not read-only, a client may reasonably treat all pending locks as | | is not read-only, a client may reasonably treat all pending locks as | |
| having been revoked. | | having been revoked. | |
| | | | |
| 11.6.7.1. Leases and File System Transitions | | 11.6.7.1. Leases and File System Transitions | |
| | | | |
| In the case of lease renewal, the client may not be submitting | | In the case of lease renewal, the client may not be submitting | |
| requests for a file system that has been transferred to another | | requests for a file system that has been transferred to another | |
| server. This can occur because of the lease renewal mechanism. The | | server. This can occur because of the lease renewal mechanism. The | |
|
| client renews leases for all file systems when submitting a request | | client renews the lease associated with all file systems when | |
| on an associated session, regardless of the specific file system | | submitting a request on an associated session, regardless of the | |
| being referenced. | | specific file system being referenced. | |
| | | | |
|
| In order for the client to schedule renewal of leases that may have | | In order for the client to schedule renewal of leases where there is | |
| been relocated to the new server, the client must find out about | | locking state that may have been relocated to the new server, the | |
| lease relocation before those leases expire. To accomplish this, the | | client must find out about lease relocation before those leases | |
| SEQUENCE operation will return the status bit | | expire. To accomplish this, the SEQUENCE operation will return the | |
| SEQ4_STATUS_LEASE_MOVED, if responsibility for any of the leases to | | status bit SEQ4_STATUS_LEASE_MOVED, if responsibility for any of the | |
| be renewed has been transferred to a new server. This condition will | | locking state being renewed has been transferred to a new server. | |
| continue until the client receives an NFS4ERR_MOVED error and the | | This will continue until the client receives an NFS4ERR_MOVED error | |
| server receives the subsequent GETATTR for the fs_locations or | | for each of the filesystems for which there has been locking state | |
| fs_locations_info attribute for an access to each file system for | | relocation. | |
| which a lease has been moved to a new server. | | | |
| | | | |
| When a client receives an SEQ4_STATUS_LEASE_MOVED indication, it | | When a client receives an SEQ4_STATUS_LEASE_MOVED indication, it | |
| should perform an operation on each file system associated with the | | should perform an operation on each file system associated with the | |
|
| server in question. When the client receives an NFS4ERR_MOVED error, | | server in question for which the current client holds locking state. | |
| the client can follow the normal process to obtain the new server | | The client may choose to reference all filesystems in the interests | |
| information (through the fs_locations and fs_locations_info | | of simplicity but what is important is that it must reference all | |
| attributes) and perform renewal of those leases on the new server, | | filesystems for which there was locking state that has moved. Once | |
| unless information in fs_locations_info attribute shows that no state | | the client receives an | |
| could have been transferred. If the server has not had state | | NFS4ERR_MOVED error for each filesystem, the SEQ4_STATUS_LEASE_MOVED | |
| transferred to it transparently, the client will receive | | indication is cleared. The client can terminate the process of | |
| NFS4ERR_STALE_CLIENTID from the new server, as described above, and | | checking filesystems once this indication is cleared, since there are | |
| the client can then reclaim locks as is done in the event of server | | no others for which locking state has moved. | |
| failure. [[Comment.8: Comment from Benny Halevy: server receives the | | | |
| subsequent GETATTR for the fs_locations or 10959 fs_locations_info | | A client may use GETATTR of the fs_status (or fs_locations_info) | |
| attribute for an access to each file system for 10960 which a lease | | attribute on all of the filesystems to get absence indications in a | |
| has been moved to a new server. This paragraph is somewhat troubling | | single (or a few) request(s), since absent filesystems will not cause | |
| as it says that the server may treat GETATTR as a state-changing | | an error in this context. However, it still must do an operation | |
| operation but the this state may last indefinitely if the client does | | which receives NFS4ERR_MOVED on each filesystem, in order to clear | |
| not query all file systems on the server. I think we need to provide | | the SEQ4_STATUS_LEASE_MOVED indication. | |
| a more precise recommendation to the client implementation that will | | | |
| deal with corner cases in this area. For example, the client knows | | Once the set of filesystems with transferred locking state has been | |
| exactly which file systems it has state on (based on state it keeps | | determined, the client can follow the normal process to obtain the | |
| in the client inode cache). When seeing SEQ4_STATUS_LEASE_MOVED it | | new server information (through the fs_locations and | |
| can do the GETATTR on each of these file systems to see where they | | fs_locations_info attributes) and perform renewal of those leases on | |
| were moved to. At this point the client and server should be back in | | the new server, unless information in fs_locations_info attribute | |
| sync and the client can resume normal operation. If it still gets | | shows that no state could have been transferred. If the server has | |
| SEQ4_STATUS_LEASE_MOVED and the state lingers (i.e. another scan of | | not had state transferred to it transparently, the client will | |
| the file systems it knows of does not yield new NFS4ERR_MOVED | | receive NFS4ERR_STALE_CLIENTID from the new server, as described | |
| indications) it can destroy the session to release all of its state | | above, and the client can then reclaim locks as is done in the event | |
| on the server and get back in sync with the server. It should be | | of server failure. | |
| said, however, that destroying the session clears the aforementioned | | | |
| lease_moved "state" (if it indeed does so).]] [[Comment.9: Comment | | | |
| from Trond: What does the error SEQ4_STATUS_LEASE_MOVED mean? A | | | |
| lease is supposed to be global to the client, whereas fs_locations | | | |
| returns information about a specific file system. What exactly is | | | |
| the client expected to do if the original server exported 2 file | | | |
| systems that are now being migrated to 2 different servers? ( [...] | | | |
| I still don't see [for example ] what particular operation is the | | | |
| client guaranteed to be able to perform on each file system?)]] | | | |
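
The probing procedure just described might look roughly like the following; lease_moved_set(), filesystems_with_state(), and probe_fs() are hypothetical stand-ins for testing the SEQ4_STATUS_LEASE_MOVED bit, enumerating file systems for which the client holds locking state, and issuing some operation against one of them.

   #include <stdbool.h>
   #include <stddef.h>

   struct fs_ref;   /* client's handle for one file system */

   /* Hypothetical helpers; probe_fs() performs an operation on the file
    * system and reports whether NFS4ERR_MOVED was returned. */
   extern bool lease_moved_set(void);
   extern size_t filesystems_with_state(struct fs_ref **out, size_t max);
   extern bool probe_fs(struct fs_ref *fs);

   static void handle_lease_moved(void)
   {
       struct fs_ref *fss[128];

       while (lease_moved_set()) {
           size_t n = filesystems_with_state(fss, 128);

           /* Touch every file system with locking state; those whose
            * state has moved return NFS4ERR_MOVED, and once each of
            * them has done so the LEASE_MOVED indication clears. */
           for (size_t i = 0; i < n; i++) {
               if (probe_fs(fss[i])) {
                   /* consult fs_locations/fs_locations_info and renew
                    * or reclaim state on the new server */
               }
           }
       }
   }
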
| | | | |
| 11.6.7.2. Transitions and the Lease_time Attribute | | 11.6.7.2. Transitions and the Lease_time Attribute | |
| | | | |
| In order that the client may appropriately manage its leases in the | | In order that the client may appropriately manage its leases in the | |
| case of a file system transition, the destination server must | | case of a file system transition, the destination server must | |
| establish proper values for the lease_time attribute. | | establish proper values for the lease_time attribute. | |
| | | | |
| When state is transferred transparently, that state should include | | When state is transferred transparently, that state should include | |
| the correct value of the lease_time attribute. The lease_time | | the correct value of the lease_time attribute. The lease_time | |
| attribute on the destination server must never be less than that on | | attribute on the destination server must never be less than that on | |
| the source since this would result in premature expiration of leases | | the source since this would result in premature expiration of leases | |
| granted by the source server. Upon transitions in which state is | | granted by the source server. Upon transitions in which state is | |
| transferred transparently, the client is under no obligation to re- | | transferred transparently, the client is under no obligation to re- | |
| fetch the lease_time attribute and may continue to use the value | | fetch the lease_time attribute and may continue to use the value | |
| previously fetched (on the source server). | | previously fetched (on the source server). | |
| | | | |
| If state has not been transferred transparently, either because the | | If state has not been transferred transparently, either because the | |
|
| associated servers are show as have different eir_server_scope | | associated servers are shown as having different eir_server_scope | |
| strings or because the client ID is rejected when presented to the | | strings or because the client ID is rejected when presented to the | |
| new server, the client should fetch the value of lease_time on the | | new server, the client should fetch the value of lease_time on the | |
| new (i.e. destination) server, and use it for subsequent locking | | new (i.e. destination) server, and use it for subsequent locking | |
| requests. However the server must respect a grace period at least as | | requests. However the server must respect a grace period at least as | |
| long as the lease_time on the source server, in order to ensure that | | long as the lease_time on the source server, in order to ensure that | |
| clients have ample time to reclaim their lock before potentially | | clients have ample time to reclaim their lock before potentially | |
| conflicting non-reclaimed locks are granted. | | conflicting non-reclaimed locks are granted. | |
| | | | |
| 11.6.8. Write Verifiers and File System Transitions | | 11.6.8. Write Verifiers and File System Transitions | |
| | | | |
| In a file system transition, the two file systems may be clustered in | | In a file system transition, the two file systems may be clustered in | |
| the handling of unstably written data. When this is the case, and | | the handling of unstably written data. When this is the case, and | |
|
| the two file systems belong to the same _verifier_ class, valid | | the two file systems belong to the same _write-verifier_ class, write | |
| verifiers from one system may be recognized by the other and | | verifiers returned from one system may be compared to those returned | |
| superfluous writes avoided. There is no requirement that all valid | | by the other and superfluous writes avoided. | |
| verifiers be recognized, but it cannot be the case that a verifier is | | | |
| recognized as valid when it is not. [NOTE: We need to resolve the | | | |
| issue of proper verifier scope]. | | | |
| | | | |
|
| When two file systems belong to different _verifier_ classes, the | | When two file systems belong to different _write-verifier_ classes, | |
| client must assume that all unstable writes in existence at the time | | any verifier generated by one must not be compared to one provided by | |
| file system transition, have been lost since there is no way the old | | the other. Instead, it should be treated as not equal even when the | |
| verifier can recognized as valid (or not) on the target server. | | values are identical. | |
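
A sketch of the resulting client check, with the eight-octet verifier size written out explicitly and parameter names chosen for illustration: the verifier recorded at WRITE time is compared with the one returned by COMMIT on the new server only when the two instances share a _write-verifier_ class; otherwise the comparison is forced to fail so that the unstable data is written again.

   #include <stdbool.h>
   #include <string.h>

   #define VERIFIER_SIZE 8   /* NFSv4 verifiers are eight octets */

   /* Must the client re-send data it wrote unstably before the move? */
   static bool must_rewrite_unstable_data(bool same_write_verifier_class,
                                          const unsigned char write_verf[VERIFIER_SIZE],
                                          const unsigned char commit_verf[VERIFIER_SIZE])
   {
       if (!same_write_verifier_class)
           return true;   /* verifiers from different classes never match */
       return memcmp(write_verf, commit_verf, VERIFIER_SIZE) != 0;
   }
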
| | | | |
| | | 11.6.9. Readdir Cookies and Verifiers and File System Transitions | |
| | | | |
| | | In a file system transition, the two file systems may be consistent | |
| | | in their handling of READDIR cookies and verifiers. When this is the | |
| | | case, and the two file systems belong to the same _readdir_ class, | |
| | | READDIR cookies and verifiers from one system may be recognized by | |
| | | the other and READDIR operations started on one server may be validly | |
| | | continued on the other, simply by presenting the cookie and verifier | |
| | | returned by a READDIR operation done on the first filesystem to the | |
| | | second. | |
| | | | |
| | | When two file systems belong to different _readdir_ classes, any | |
| | | READDIR cookie and verifier generated by one is not valid on the | |
| | | second, and must not be presented to that server by the client. The | |
| | | client should act as if the verifier was rejected. | |
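
A similar client-side decision applies to cached READDIR positions.  The
sketch below is illustrative only; the names are hypothetical and the
same_readdir_class flag is assumed to come from fs_locations_info.

   #include <stdint.h>
   #include <string.h>

   struct cached_readdir_pos {
       uint64_t      cookie;         /* cookie from the previous READDIR   */
       unsigned char cookieverf[8];  /* verifier from the previous READDIR */
   };

   /* Returns the cookie to present to the server now holding the file
    * system and fills in the matching verifier. */
   uint64_t readdir_start_cookie(const struct cached_readdir_pos *pos,
                                 int same_readdir_class,
                                 unsigned char verf_out[8])
   {
       if (!same_readdir_class) {
           /* Act as if the verifier had been rejected: restart the
            * directory scan with a zero cookie and zero verifier. */
           memset(verf_out, 0, 8);
           return 0;
       }
       memcpy(verf_out, pos->cookieverf, 8);
       return pos->cookie;
   }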
| | | | |
| | | 11.6.10. File System Data and File System Transitions | |
| | | | |
| | | When multiple replicas exist and are used simultaneously or in | |
| | | succession by a client, applications using them will normally expect | |
| | | that they contain the same data or data which is consistent with | |
| | | the normal sorts of changes that are made by other clients updating | |
| | | the data of the file system (with metadata being the same to the | |
| | | degree indicated by the fs_locations_info attribute). However, when | |
| | | multiple filesystems are presented as replicas of one another, the | |
| | | precise relationship between the data of one and the data of another | |
| | | is not, as a general matter, specified by the NFSv4.1 protocol. It | |
| | | is quite possible to present as replicas filesystems where the data | |
| | | of those filesystems is sufficiently different that some applications | |
| | | have problems dealing with the transition between replicas. The | |
| | | namespace will typically be constructed so that applications can | |
| | | choose an appropriate level of support, so that in one position in | |
| | | the namespace a varied set of replicas will be listed while in | |
| | | another only those that are up-to-date may be considered replicas. | |
| | | The protocol does define four special cases of the relationship | |
| | | among replicas to be specified by the server and relied upon by | |
| | | clients: | |
| | | | |
| | | o When multiple server addresses correspond to the same actual | |
| | | server, as shown by a common so_major_id field within the | |
| | | eir_server_owner field returned by EXCHANGE_ID, the client may | |
| | | depend on the fact that changes to data, metadata, or locks made | |
| | | on one filesystem are immediately reflected on others. | |
| | | | |
| | | o When multiple replicas exist and are used simultaneously by a | |
| | | client (see the FSLI4BX_CLSIMUL definition within | |
| | | fs_locations_info), they must designate the same data. Where file | |
| | | systems are writable, a change made on one instance must be | |
| | | visible on all instances, immediately upon the earlier of the | |
| | | return of the modifying requestor or the visibility of that change | |
| | | on any of the associated replicas. This allows a client to use | |
| | | these replicas simultaneously without any special adaptation to | |
| | | the fact that there are multiple replicas. In this case, locks, | |
| | | whether shared or byte-range, and delegations obtained on one replica | |
| | | are immediately reflected on all replicas, even though these locks | |
| | | will be managed under a set of client IDs. | |
| | | | |
| | | o When one replica is designated as the successor instance to | |
| | | another existing instance after the return of NFS4ERR_MOVED (i.e. the | |
| | | case of migration), the client may depend on the fact that all | |
| | | changes securely made to data (uncommitted writes are dealt with | |
| | | in Section 11.6.8) on the original instance are made to the | |
| | | successor image. | |
| | | | |
| | | o Where a file system is not writable but represents a read-only | |
| | | copy (possibly periodically updated) of a writable file system, | |
| | | clients have similar requirements with regard to the propagation | |
| | | of updates. They may need a guarantee that any change visible on | |
| | | the original file system instance must be immediately visible on | |
| | | any replica before the client transitions access to that replica, | |
| | | in order to avoid any possibility that a client, in effecting a | |
| | | transition to a replica, will see any reversion in file system | |
| | | state. The specific means by which this will be prevented varies | |
| | | based on fs4_status_type reported as part of the fs_status | |
| | | attribute (See Section 11.10). Since these filesystems are | |
| | | presumed not to be suitable for simultaneous use, there is no | |
| | | specification of how locking is handled and it generally will be | |
| | | the case that locks obtained on one filesystem will be separate from | |
| | | those on others. Since these are going to be read-only | |
| | | filesystems, this is not expected to pose an issue for clients or | |
| | | applications. | |
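
As an illustration of the first special case listed above, the following
hypothetical helper compares the so_major_id values returned by EXCHANGE_ID
over two different addresses to decide whether they reach the same actual
server; the C structure is a simplified stand-in for the XDR-defined
server_owner4.

   #include <stdbool.h>
   #include <stdint.h>
   #include <string.h>

   struct server_owner {
       uint64_t      so_minor_id;
       uint32_t      so_major_id_len;
       unsigned char so_major_id[1024];  /* bounded opaque value */
   };

   bool same_server_instance(const struct server_owner *a,
                             const struct server_owner *b)
   {
       /* A common so_major_id identifies the same actual server, so changes
        * to data, metadata, or locks made via one address are immediately
        * visible via the other. */
       return a->so_major_id_len == b->so_major_id_len &&
              memcmp(a->so_major_id, b->so_major_id, a->so_major_id_len) == 0;
   }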
| | | | |
| 11.7. Effecting File System Referrals | | 11.7. Effecting File System Referrals | |
| | | | |
| Referrals are effected when an absent file system is encountered, and | | Referrals are effected when an absent file system is encountered, and | |
| one or more alternate locations are made available by the | | one or more alternate locations are made available by the | |
| fs_locations or fs_locations_info attributes. The client will | | fs_locations or fs_locations_info attributes. The client will | |
| typically get an NFS4ERR_MOVED error, fetch the appropriate location | | typically get an NFS4ERR_MOVED error, fetch the appropriate location | |
|
| information and proceed to access the file system on different | | information and proceed to access the file system on a different | |
| server, even though it retains its logical position within the | | server, even though it retains its logical position within the | |
|
| original namespace. | | original namespace. Referrals differ from migration events in that | |
| | | they happen only when the client has not previously referenced the | |
| | | file system in question (so there is nothing to transition). | |
| | | Referrals can only come into effect when an absent file system is | |
| | | encountered at its root. | |
| | | | |
| The examples given in the sections below are somewhat artificial in | | The examples given in the sections below are somewhat artificial in | |
| that an actual client will not typically do a multi-component lookup, | | that an actual client will not typically do a multi-component lookup, | |
| but will have cached information regarding the upper levels of the | | but will have cached information regarding the upper levels of the | |
| name hierarchy. However, these examples are chosen to make the | | name hierarchy. However, these examples are chosen to make the | |
| required behavior clear and easy to put within the scope of a small | | required behavior clear and easy to put within the scope of a small | |
| number of requests, without getting unduly into details of how | | number of requests, without getting unduly into details of how | |
| specific clients might choose to cache things. | | specific clients might choose to cache things. | |
| | | | |
| 11.7.1. Referral Example (LOOKUP) | | 11.7.1. Referral Example (LOOKUP) | |
| | | | |
| skipping to change at page 217, line 20 | | skipping to change at page 222, line 26 | |
| o LOOKUP "this" | | o LOOKUP "this" | |
| | | | |
| o LOOKUP "is" | | o LOOKUP "is" | |
| | | | |
| o LOOKUP "the" | | o LOOKUP "the" | |
| | | | |
| o LOOKUP "path" | | o LOOKUP "path" | |
| | | | |
| o GETFH | | o GETFH | |
| | | | |
|
| o GETATTR fsid,fileid,size,ctime | | o GETATTR fsid,fileid,size,time_modify | |
| | | | |
| Under the given circumstances, the following will be the result. | | Under the given circumstances, the following will be the result. | |
| | | | |
| o PUTROOTFH --> NFS_OK. The current fh is now the root of the | | o PUTROOTFH --> NFS_OK. The current fh is now the root of the | |
| pseudo-fs. | | pseudo-fs. | |
| | | | |
| o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | | o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | | o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | | o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | |
| is within the pseudo-fs. | | is within the pseudo-fs. | |
| | | | |
| o LOOKUP "path" --> NFS_OK. The current fh is for /this/is/the/path | | o LOOKUP "path" --> NFS_OK. The current fh is for /this/is/the/path | |
|
| and is within a new, absent fs, but ... the client will never see | | and is within a new, absent filesystem, but ... the client will | |
| the value of that fh. | | never see the value of that fh. | |
| | | | |
| o GETFH --> NFS4ERR_MOVED. Fails because current fh is in an absent | | o GETFH --> NFS4ERR_MOVED. Fails because current fh is in an absent | |
|
| fs at the start of the operation and the spec makes no exception | | filesystem at the start of the operation and the spec makes no | |
| for GETFH. | | exception for GETFH. | |
| | | | |
|
| o GETATTR fsid,fileid,size,ctime. Not executed because the failure | | o GETATTR fsid,fileid,size,time_modify. Not executed because the | |
| of the GETFH stops processing of the COMPOUND. | | failure of the GETFH stops processing of the COMPOUND. | |
| | | | |
| Given the failure of the GETFH, the client has the job of determining | | Given the failure of the GETFH, the client has the job of determining | |
| the root of the absent file system and where to find that file | | the root of the absent file system and where to find that file | |
| system, i.e. the server and path relative to that server's root fh. | | system, i.e. the server and path relative to that server's root fh. | |
| Note here that in this example, the client did not obtain filehandles | | Note here that in this example, the client did not obtain filehandles | |
| and attribute information (e.g. fsid) for the intermediate | | and attribute information (e.g. fsid) for the intermediate | |
| directories, so that it would not be sure where the absent file | | directories, so that it would not be sure where the absent file | |
| system starts. It could be the case, for example, that /this/is/the | | system starts. It could be the case, for example, that /this/is/the | |
| is the root of the moved file system and that the reason that the | | is the root of the moved file system and that the reason that the | |
| lookup of "path" succeeded is that the file system was not absent on | | lookup of "path" succeeded is that the file system was not absent on | |
| that op but was moved between the last LOOKUP and the GETFH (since | | that op but was moved between the last LOOKUP and the GETFH (since | |
|
| COMPOUND is not atomic). Even if we had the fsid's for all of the | | COMPOUND is not atomic). Even if we had the fsids for all of the | |
| intermediate directories, we could have no way of knowing that /this/ | | intermediate directories, we could have no way of knowing that /this/ | |
|
| is/the/path was the root of a new fs, since we don't yet have its | | is/the/path was the root of a new filesystem, since we don't yet have | |
| fsid. | | its fsid. | |
| | | | |
| In order to get the necessary information, let us re-issue the chain | | In order to get the necessary information, let us re-issue the chain | |
|
| of lookup's with GETFH's and GETATTR's to at least get the fsid's so | | of LOOKUPs with GETFHs and GETATTRs to at least get the fsids so we | |
| we can be sure where the appropriate fs boundaries are. The client | | can be sure where the appropriate filesystem boundaries are. The | |
| could choose to get fs_locations_info at the same time but in most | | client could choose to get fs_locations_info at the same time but in | |
| cases the client will have a good guess as to where fs boundaries are | | most cases the client will have a good guess as to where fs | |
| (because of where NFS4ERR_MOVED was gotten and where not) making | | boundaries are (because of where NFS4ERR_MOVED was gotten and where | |
| fetching of fs_locations_info unnecessary. | | not) making fetching of fs_locations_info unnecessary. | |
| | | | |
| OP01: PUTROOTFH --> NFS_OK | | OP01: PUTROOTFH --> NFS_OK | |
| | | | |
| - Current fh is root of pseudo-fs. | | - Current fh is root of pseudo-fs. | |
| | | | |
| OP02: GETATTR(fsid) --> NFS_OK | | OP02: GETATTR(fsid) --> NFS_OK | |
| | | | |
| - Just for completeness. Normally, clients will know the fsid of | | - Just for completeness. Normally, clients will know the fsid of | |
| the pseudo-fs as soon as they establish communication with a | | the pseudo-fs as soon as they establish communication with a | |
| server. | | server. | |
| | | | |
| OP03: LOOKUP "this" --> NFS_OK | | OP03: LOOKUP "this" --> NFS_OK | |
| | | | |
| OP04: GETATTR(fsid) --> NFS_OK | | OP04: GETATTR(fsid) --> NFS_OK | |
| | | | |
|
| - Get current fsid to see where fs boundaries are. The fsid will be | | - Get current fsid to see where filesystem boundaries are. The fsid | |
| that for the pseudo-fs in this example, so no boundary. | | will be that for the pseudo-fs in this example, so no boundary. | |
| | | | |
| OP05: GETFH --> NFS_OK | | OP05: GETFH --> NFS_OK | |
| | | | |
| - Current fh is for /this and is within pseudo-fs. | | - Current fh is for /this and is within pseudo-fs. | |
| | | | |
| OP06: LOOKUP "is" --> NFS_OK | | OP06: LOOKUP "is" --> NFS_OK | |
|
| | | | |
| - Current fh is for /this/is and is within pseudo-fs. | | - Current fh is for /this/is and is within pseudo-fs. | |
| | | | |
| OP07: GETATTR(fsid) --> NFS_OK | | OP07: GETATTR(fsid) --> NFS_OK | |
|
| - Get current fsid to see where fs boundaries are. The fsid will be | | | |
| that for the pseudo-fs in this example, so no boundary. | | - Get current fsid to see where filesystem boundaries are. The fsid | |
| | | will be that for the pseudo-fs in this example, so no boundary. | |
| | | | |
| OP08: GETFH --> NFS_OK | | OP08: GETFH --> NFS_OK | |
| | | | |
| - Current fh is for /this/is and is within pseudo-fs. | | - Current fh is for /this/is and is within pseudo-fs. | |
| | | | |
| OP09: LOOKUP "the" --> NFS_OK | | OP09: LOOKUP "the" --> NFS_OK | |
| | | | |
| - Current fh is for /this/is/the and is within pseudo-fs. | | - Current fh is for /this/is/the and is within pseudo-fs. | |
| | | | |
| OP10: GETATTR(fsid) --> NFS_OK | | OP10: GETATTR(fsid) --> NFS_OK | |
| | | | |
|
| - Get current fsid to see where fs boundaries are. The fsid will be | | - Get current fsid to see where filesystem boundaries are. The fsid | |
| that for the pseudo-fs in this example, so no boundary. | | will be that for the pseudo-fs in this example, so no boundary. | |
| | | | |
| OP11: GETFH --> NFS_OK | | OP11: GETFH --> NFS_OK | |
| | | | |
| - Current fh is for /this/is/the and is within pseudo-fs. | | - Current fh is for /this/is/the and is within pseudo-fs. | |
| | | | |
| OP12: LOOKUP "path" --> NFS_OK | | OP12: LOOKUP "path" --> NFS_OK | |
| | | | |
| - Current fh is for /this/is/the/path and is within a new, absent | | - Current fh is for /this/is/the/path and is within a new, absent | |
|
| fs, but ... | | filesystem, but ... | |
| | | | |
| - The client will never see the value of that fh | | - The client will never see the value of that fh | |
| | | | |
| OP13: GETATTR(fsid, fs_locations_info) --> NFS_OK | | OP13: GETATTR(fsid, fs_locations_info) --> NFS_OK | |
| | | | |
|
| - We are getting the fsid to know where the fs boundaries are. Note | | - We are getting the fsid to know where the filesystem boundaries | |
| that the fsid we are given will not necessarily be preserved at | | are. Note that the fsid we are given will not necessarily be | |
| the new location. That fsid might be different and in fact the | | preserved at the new location. That fsid might be different and | |
| fsid we have for this fs might a valid fsid of a different fs on | | in fact the fsid we have for this filesystem might be a valid fsid | |
| that new server. | | of a different filesystem on that new server. | |
| | | | |
| - In this particular case, we are pretty sure anyway that what has | | - In this particular case, we are pretty sure anyway that what has | |
| moved is /this/is/the/path rather than /this/is/the since we have | | moved is /this/is/the/path rather than /this/is/the since we have | |
| the fsid of the latter and it is that of the pseudo-fs, which | | the fsid of the latter and it is that of the pseudo-fs, which | |
| presumably cannot move. However, in other examples, we might not | | presumably cannot move. However, in other examples, we might not | |
| have this kind of information to rely on (e.g. /this/is/the might | | have this kind of information to rely on (e.g. /this/is/the might | |
| be a non-pseudo file system separate from /this/is/the/path), so | | be a non-pseudo file system separate from /this/is/the/path), so | |
| we need to have another reliable source of information on the | | we need to have another reliable source of information on the | |
| boundary of the fs which is moved. If, for example, the file | | boundary of the fs which is moved. If, for example, the file | |
| system "/this/is" had moved we would have a case of migration | | system "/this/is" had moved we would have a case of migration | |
| | | | |
| skipping to change at page 220, line 13 | | skipping to change at page 225, line 15 | |
| system was clear we could fetch fs_locations_info. | | system was clear we could fetch fs_locations_info. | |
| | | | |
| - We are fetching fs_locations_info because the fact that we got an | | - We are fetching fs_locations_info because the fact that we got an | |
| NFS4ERR_MOVED at this point means that it is most likely that this is | | NFS4ERR_MOVED at this point means that it is most likely that this is | |
| a referral and we need the destination. Even if it is the case | | a referral and we need the destination. Even if it is the case | |
| that "/this/is/the" is a file system which has migrated, we will | | that "/this/is/the" is a file system which has migrated, we will | |
| still need the location information for that file system. | | still need the location information for that file system. | |
| | | | |
| OP14: GETFH --> NFS4ERR_MOVED | | OP14: GETFH --> NFS4ERR_MOVED | |
| | | | |
|
| - Fails because current fh is in an absent fs at the start of the | | - Fails because current fh is in an absent filesystem at the start | |
| operation and the spec makes no exception for GETFH. Note that | | of the operation and the spec makes no exception for GETFH. Note | |
| this has the happy consequence that we don't have to worry about | | that this means the server will never send the client a filehandle | |
| the volatility or lack thereof of the fh. If the root of the fs | | from within an absent filesystem. | |
| on the new location is a persistent fh, then we can assume that | | | |
| this fh, which we never saw is a persistent fh, which, if we could | | | |
| see it, would exactly match the new fh. At least, there is no | | | |
| evidence to disprove that. On the other hand, if we find a | | | |
| volatile root at the new location, then the filehandle which we | | | |
| never saw must have been volatile or at least nobody can prove | | | |
| otherwise. | | | |
| | | | |
| Given the above, the client knows where the root of the absent file | | Given the above, the client knows where the root of the absent file | |
| system is, by noting where the change of fsid occurred. The | | system is, by noting where the change of fsid occurred. The | |
| fs_locations_info attribute also gives the client the actual location | | fs_locations_info attribute also gives the client the actual location | |
| of the absent file system, so that the referral can proceed. The | | of the absent file system, so that the referral can proceed. The | |
| server gives the client the bare minimum of information about the | | server gives the client the bare minimum of information about the | |
| absent file system so that there will be very little scope for | | absent file system so that there will be very little scope for | |
| problems of conflict between information sent by the referring server | | problems of conflict between information sent by the referring server | |
| and information of the file system's home. No filehandles and very | | and information of the file system's home. No filehandles and very | |
| few attributes are present on the referring server and the client can | | few attributes are present on the referring server and the client can | |
| | | | |
| skipping to change at page 221, line 4 | | skipping to change at page 225, line 47 | |
| | | | |
| Suppose such a directory is read as follows: | | Suppose such a directory is read as follows: | |
| | | | |
| o PUTROOTFH | | o PUTROOTFH | |
| | | | |
| o LOOKUP "this" | | o LOOKUP "this" | |
| | | | |
| o LOOKUP "is" | | o LOOKUP "is" | |
| | | | |
| o LOOKUP "the" | | o LOOKUP "the" | |
|
| o READDIR (fsid, size, ctime, mounted_on_fileid) | | | |
| | | o READDIR (fsid, size, time_modify, mounted_on_fileid) | |
| | | | |
| In this case, because rdattr_error is not requested, | | In this case, because rdattr_error is not requested, | |
| fs_locations_info is not requested, and some of the attributes cannot be | | fs_locations_info is not requested, and some of the attributes cannot be | |
|
| provided the result will be an NFS4ERR_MOVED error on the READDIR, | | provided, the result will be an NFS4ERR_MOVED error on the READDIR, | |
| with the detailed results as follows: | | with the detailed results as follows: | |
| | | | |
| o PUTROOTFH --> NFS_OK. The current fh is at the root of the | | o PUTROOTFH --> NFS_OK. The current fh is at the root of the | |
| pseudo-fs. | | pseudo-fs. | |
| | | | |
| o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | | o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | | o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | | o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | |
| is within the pseudo-fs. | | is within the pseudo-fs. | |
| | | | |
|
| o READDIR (fsid, size, ctime, mounted_on_fileid) --> NFS4ERR_MOVED. | | o READDIR (fsid, size, time_modify, mounted_on_fileid) --> | |
| Note that the same error would have been returned if /this/is/the | | NFS4ERR_MOVED. Note that the same error would have been returned | |
| had migrated, when in fact it is because the directory contains | | if /this/is/the had migrated, when in fact it is because the | |
| the root of an absent fs. | | directory contains the root of an absent filesystem. | |
| | | | |
| So now suppose that we reissue with rdattr_error: | | So now suppose that we reissue with rdattr_error: | |
| | | | |
| o PUTROOTFH | | o PUTROOTFH | |
| | | | |
| o LOOKUP "this" | | o LOOKUP "this" | |
| | | | |
| o LOOKUP "is" | | o LOOKUP "is" | |
| | | | |
| o LOOKUP "the" | | o LOOKUP "the" | |
| | | | |
|
| o READDIR (rdattr_error, fsid, size, ctime, mounted_on_fileid) | | o READDIR (rdattr_error, fsid, size, time_modify, mounted_on_fileid) | |
| | | | |
| The results will be: | | The results will be: | |
| | | | |
| o PUTROOTFH --> NFS_OK. The current fh is at the root of the | | o PUTROOTFH --> NFS_OK. The current fh is at the root of the | |
| pseudo-fs. | | pseudo-fs. | |
| | | | |
| o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | | o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | | o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | | o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | |
| is within the pseudo-fs. | | is within the pseudo-fs. | |
| | | | |
|
| o READDIR (rdattr_error, fsid, size, ctime, mounted_on_fileid) --> | | o READDIR (rdattr_error, fsid, size, time_modify, mounted_on_fileid) | |
| NFS_OK. The attributes for "path" will only contain rdattr_error | | --> NFS_OK. The attributes for "path" will only contain | |
| with the value will be NFS4ERR_MOVED, together with an fsid value | | rdattr_error with the value NFS4ERR_MOVED, together with an fsid | |
| and an a value for mounted_on_fileid. | | value and a value for mounted_on_fileid. | |
| | | | |
| So suppose we do another READDIR to get fs_locations_info, although | | So suppose we do another READDIR to get fs_locations_info, although | |
| we could have used a GETATTR directly, as in the previous section. | | we could have used a GETATTR directly, as in the previous section. | |
| | | | |
| o PUTROOTFH | | o PUTROOTFH | |
| | | | |
| o LOOKUP "this" | | o LOOKUP "this" | |
| | | | |
| o LOOKUP "is" | | o LOOKUP "is" | |
| | | | |
| o LOOKUP "the" | | o LOOKUP "the" | |
| | | | |
| o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid, | | o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid, | |
|
| size, ctime) | | size, time_modify) | |
| | | | |
| The results would be: | | The results would be: | |
| | | | |
| o PUTROOTFH --> NFS_OK. The current fh is at the root of the | | o PUTROOTFH --> NFS_OK. The current fh is at the root of the | |
| pseudo-fs. | | pseudo-fs. | |
| | | | |
| o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | | o LOOKUP "this" --> NFS_OK. The current fh is for /this and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | | o LOOKUP "is" --> NFS_OK. The current fh is for /this/is and is | |
| within the pseudo-fs. | | within the pseudo-fs. | |
| | | | |
| o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | | o LOOKUP "the" --> NFS_OK. The current fh is for /this/is/the and | |
| is within the pseudo-fs. | | is within the pseudo-fs. | |
| | | | |
| o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid, | | o READDIR (rdattr_error, fs_locations_info, mounted_on_fileid, fsid, | |
|
| size, ctime) --> NFS_OK. The attributes will be as shown below. | | size, time_modify) --> NFS_OK. The attributes will be as shown | |
| | | below. | |
| | | | |
| The attributes for "path" will only contain | | The attributes for "path" will only contain | |
| | | | |
|
| o rdattr_error (value: NFS4ERR_MOVED) | | o rdattr_error (value: NFS_OK) | |
| | | | |
|
| o fs_locations_info ) | | o fs_locations_info | |
| | | | |
| o mounted_on_fileid (value: unique fileid within referring fs) | | o mounted_on_fileid (value: unique fileid within referring fs) | |
|
| o fsid (value: unique value within referring server) | | | |
| | | | |
| The attribute entry for "latest" will not contain size or ctime. | | | |
| | | | |
| 11.8. The Attribute fs_absent | | | |
| | | | |
|
| In order to provide the client information about whether the current | | o fsid (value: unique value within referring server) | |
| file system is present or absent, the fs_absent attribute may be | | | |
| interrogated. | | | |
| | | | |
|
| As noted above, this attribute, when supported, may be requested of | | The attribute entry for "path" will not contain size or time_modify | |
| absent file systems without causing NFS4ERR_MOVED to be returned and | | because these attributes are not available within an absent | |
| it should always be available. Servers are strongly urged to support | | filesystem. | |
| this attribute on all file systems if they support it on any file | | | |
| system. | | | |
| | | | |
|
| 11.9. The Attribute fs_locations | | 11.8. The Attribute fs_locations | |
| | | | |
| The fs_locations attribute is structured in the following way: | | The fs_locations attribute is structured in the following way: | |
| | | | |
| struct fs_location { | | struct fs_location { | |
| utf8str_cis server<>; | | utf8str_cis server<>; | |
| pathname4 rootpath; | | pathname4 rootpath; | |
| }; | | }; | |
| | | | |
| struct fs_locations { | | struct fs_locations { | |
| pathname4 fs_root; | | pathname4 fs_root; | |
| fs_location locations<>; | | fs_location locations<>; | |
| }; | | }; | |
| | | | |
| The fs_location struct is used to represent the location of a file | | The fs_location struct is used to represent the location of a file | |
| system by providing a server name and the path to the root of the | | system by providing a server name and the path to the root of the | |
| file system within that server's namespace. When a set of servers | | file system within that server's namespace. When a set of servers | |
| have corresponding file systems at the same path within their | | have corresponding file systems at the same path within their | |
| namespaces, an array of server names may be provided. An entry in | | namespaces, an array of server names may be provided. An entry in | |
|
| the server array is an UTF8 string and represents one of a | | the server array is a UTF8 string and represents one of a traditional | |
| traditional DNS host name, IPv4 address, or IPv6 address. It is not | | DNS host name, IPv4 address, or IPv6 address, or a zero-length | |
| a requirement that all servers that share the same rootpath be listed | | string. A null string SHOULD be used to indicate the current address | |
| in one fs_location struct. The array of server names is provided for | | being used for the RPC call. It is not a requirement that all | |
| convenience. Servers that share the same rootpath may also be listed | | servers that share the same rootpath be listed in one fs_location | |
| in separate fs_location entries in the fs_locations attribute. | | struct. The array of server names is provided for convenience. | |
| | | Servers that share the same rootpath may also be listed in separate | |
| | | fs_location entries in the fs_locations attribute. | |
| | | | |
| The fs_locations struct and attribute contains an array of such | | The fs_locations struct and attribute contains an array of such | |
| locations. Since the namespace of each server may be constructed | | locations. Since the namespace of each server may be constructed | |
| differently, the "fs_root" field is provided. The path represented | | differently, the "fs_root" field is provided. The path represented | |
| by fs_root represents the location of the file system in the current | | by fs_root represents the location of the file system in the current | |
| server's namespace, i.e. that of the server from which the | | server's namespace, i.e. that of the server from which the | |
| fs_locations attribute was obtained. The fs_root path is meant to | | fs_locations attribute was obtained. The fs_root path is meant to | |
| aid the client by clearly referencing the root of the file system | | aid the client by clearly referencing the root of the file system | |
| whose locations are being reported, no matter what object within the | | whose locations are being reported, no matter what object within the | |
|
| current file system, the current filehandle designates. | | current file system the current filehandle designates. When the | |
| | | fs_locations attribute is interrogated and there are no alternate | |
| | | file system locations, the server SHOULD return a zero-length array | |
| | | of fs_location structures, together with a valid fs_root. | |
| | | | |
| As an example, suppose there is a replicated file system located at | | As an example, suppose there is a replicated file system located at | |
| two servers (servA and servB). At servA, the file system is located | | two servers (servA and servB). At servA, the file system is located | |
| at path "/a/b/c". At, servB the file system is located at path | | at path "/a/b/c". At, servB the file system is located at path | |
| "/x/y/z". If the client were to obtain the fs_locations value for | | "/x/y/z". If the client were to obtain the fs_locations value for | |
| the directory at "/a/b/c/d", it might not necessarily know that the | | the directory at "/a/b/c/d", it might not necessarily know that the | |
| file system's root is located in servA's namespace at "/a/b/c". When | | file system's root is located in servA's namespace at "/a/b/c". When | |
| the client switches to servB, it will need to determine that the | | the client switches to servB, it will need to determine that the | |
| directory it first referenced at servA is now represented by the path | | directory it first referenced at servA is now represented by the path | |
| "/x/y/z/d" on servB. To facilitate this, the fs_locations attribute | | "/x/y/z/d" on servB. To facilitate this, the fs_locations attribute | |
| provided by servA would have a fs_root value of "/a/b/c" and two | | provided by servA would have a fs_root value of "/a/b/c" and two | |
| entries in fs_locations. One entry in fs_locations will be for | | entries in fs_locations. One entry in fs_locations will be for | |
| itself (servA) and the other will be for servB with a path of | | itself (servA) and the other will be for servB with a path of | |
| "/x/y/z". With this information, the client is able to substitute | | "/x/y/z". With this information, the client is able to substitute | |
| "/x/y/z" for the "/a/b/c" at the beginning of its access path and | | "/x/y/z" for the "/a/b/c" at the beginning of its access path and | |
| construct "/x/y/z/d" to use for the new server. | | construct "/x/y/z/d" to use for the new server. | |
| | | | |
| Since the fs_locations attribute lacks information defining various | | Since the fs_locations attribute lacks information defining various | |
|
| attributes of the various file system choices presented, it should | | attributes of the various file system choices presented, it SHOULD | |
| only be interrogated and used when fs_locations_info is not | | only be interrogated and used when fs_locations_info is not | |
| available. When fs_locations is used, information about the specific | | available. When fs_locations is used, information about the specific | |
| locations should be assumed based on the following rules. | | locations should be assumed based on the following rules. | |
| | | | |
| The following rules are general and apply irrespective of the | | The following rules are general and apply irrespective of the | |
| context. | | context. | |
| | | | |
| o All listed file system instances should be considered as of the | | o All listed file system instances should be considered as of the | |
| same _handle_ class, if and only if, the current fh_expire_type | | same _handle_ class, if and only if, the current fh_expire_type | |
| attribute does not include the FH4_VOL_MIGRATION bit. Note that | | attribute does not include the FH4_VOL_MIGRATION bit. Note that | |
| | | | |
| skipping to change at page 225, line 5 | | skipping to change at page 229, line 41 | |
| same _fileid_ class, if and only if, the fh_expire_type attribute | | same _fileid_ class, if and only if, the fh_expire_type attribute | |
| indicates persistent filehandles and does not include the | | indicates persistent filehandles and does not include the | |
| FH4_VOL_MIGRATION bit. Note that in the case of referral, fileid | | FH4_VOL_MIGRATION bit. Note that in the case of referral, fileid | |
| issues do not apply since there can be no fileids known within the | | issues do not apply since there can be no fileids known within the | |
| referring (absent) file system nor is there any access to the | | referring (absent) file system nor is there any access to the | |
| fh_expire_type attribute. | | fh_expire_type attribute. | |
| | | | |
| o All file system instances should be considered as of | | o All file system instances should be considered as of | |
| different _change_ classes. | | different _change_ classes. | |
| | | | |
|
| For other class assignments, handling depends of file system | | For other class assignments, handling of file system transitions | |
| transitions depends on the reasons for the transition: | | depends on the reasons for the transition: | |
| | | | |
|
| o When the transition is due to migration, the target should be | | o When the transition is due to migration, that is, the client was | |
| treated as being of the same _verifier_ class as the source. | | directed to a new filesystem after receiving an NFS4ERR_MOVED error, | |
| | | the target should be treated as being of the same _verifier_ class | |
| | | as the source. | |
| | | | |
|
| o When the transition is due to failover to another replica, the | | o When the transition is due to failover to another replica, that | |
| target should be treated as being of a different _verifier_ class | | is, the client selected another replica without receiving an | |
| from the source. | | NFS4ERR_MOVED error, the target should be treated as being of a | |
| | | different _verifier_ class from the source. | |
| | | | |
| The specific choices reflect typical implementation patterns for | | The specific choices reflect typical implementation patterns for | |
| failover and controlled migration respectively. Since other choices | | failover and controlled migration respectively. Since other choices | |
| are possible and useful, this information is better obtained by using | | are possible and useful, this information is better obtained by using | |
|
| fs_locations_info. | | fs_locations_info. When a server implementation needs to communicate | |
| | | other choices, it MUST support the fs_locations_info attribute. | |
| | | | |
| See the section "Security Considerations" for a discussion on the | | See the section "Security Considerations" for a discussion on the | |
| recommendations for the security flavor to be used by any GETATTR | | recommendations for the security flavor to be used by any GETATTR | |
| operation that requests the "fs_locations" attribute. | | operation that requests the "fs_locations" attribute. | |
| | | | |
|
| 11.10. The Attribute fs_locations_info | | 11.9. The Attribute fs_locations_info | |
| | | | |
| The fs_locations_info attribute is intended as a more functional | | The fs_locations_info attribute is intended as a more functional | |
| replacement for fs_locations which will continue to exist and be | | replacement for fs_locations which will continue to exist and be | |
|
| supported. Clients can use it get a more complete set of information | | supported. Clients can use it to get a more complete set of | |
| about alternative file system locations. When the server does not | | information about alternative file system locations. When the server | |
| support fs_locations_info, fs_locations can be used to get a subset | | does not support fs_locations_info, fs_locations can be used to get a | |
| of the information. A server which supports fs_locations_info MUST | | subset of the information. A server which supports fs_locations_info | |
| support fs_locations as well. | | MUST support fs_locations as well. | |
| | | | |
| There is additional information present in fs_locations_info that is | | There is additional information present in fs_locations_info that is | |
| not available in fs_locations: | | not available in fs_locations: | |
| | | | |
| o Attribute continuity information to allow a client to select a | | o Attribute continuity information to allow a client to select a | |
| location which meets the transparency requirements of the | | location which meets the transparency requirements of the | |
| applications accessing the data and to take advantage of | | applications accessing the data and to take advantage of | |
| optimizations that server guarantees as to attribute continuity | | optimizations that server guarantees as to attribute continuity | |
| may provide (e.g. change attribute). | | may provide (e.g. change attribute). | |
| | | | |
| o File System identity information which indicates when multiple | | o File System identity information which indicates when multiple | |
|
| replicas, from the clients point of view, correspond to the same | | replicas, from the client's point of view, correspond to the same | |
| target file system, allowing them to be used interchangeably, | | target file system, allowing them to be used interchangeably, | |
| without disruption, as multiple paths to the same thing. | | without disruption, as multiple paths to the same thing. | |
| | | | |
| o Information which will bear on the suitability of various | | o Information which will bear on the suitability of various | |
| replicas, depending on the use that the client intends. For | | replicas, depending on the use that the client intends. For | |
| example, many applications need an absolutely up-to-date copy | | example, many applications need an absolutely up-to-date copy | |
| (e.g. those that write), while others may only need access to the | | (e.g. those that write), while others may only need access to the | |
| most up-to-date copy reasonably available. | | most up-to-date copy reasonably available. | |
| | | | |
| o Server-derived preference information for replicas, which can be | | o Server-derived preference information for replicas, which can be | |
| used to implement load-balancing while giving the client the | | used to implement load-balancing while giving the client the | |
|
| entire fs list to be used in case the primary fails. | | entire filesystem list to be used in case the primary fails. | |
| | | | |
| | | The fs_locations_info attribute is structured similarly to the | |
| | | fs_locations attribute. A top-level structure (fs_locations_info4) | |
| | | contains the entire attribute including the root pathname of the | |
| | | filesystem and an array of lower-level structures that define | |
| | | replicas that share a common root path on their respective servers. | |
| | | The lower-level structure in turn (fs_locations_item4) contains a | |
| | | specific pathname and information on one or more individual server | |
| | | replicas. For that last, lowest level, fs_locations_info has an | |
| | | fs_locations_server4 structure that contains per-server-replica | |
| | | information in addition to the server name. This per-server-replica | |
| | | information includes a nominally opaque array, fls_info, in which | |
| | | specific pieces of information are located at the specific indices | |
| | | listed below. | |
| | | | |
| | | The attribute will always contain at least a single | |
| | | fs_locations_server entry. Typically, this will be an entry with the | |
| | | FSLI4GF_CUR_REQ flag set, although in the case of a referral there | |
| | | will be no entry with that flag set. | |
| | | | |
| | | It should be noted that fs_locations_info attributes returned by | |
| | | servers for various replicas may differ for various reasons. One | |
| | | server may know about a set of replicas that are not known to other | |
| | | servers. Further, compatibility attributes may differ. Filehandles | |
| | | may be of the same class going from replica A to replica B but not | |
| | | going in the reverse direction. This may happen because the | |
| | | filehandles are the same but the server implementation for the server | |
| | | on which replica B resides may not have provision to note and report | |
| | | that equivalence. | |
| | | | |
| The fs_locations_info attribute consists of a root pathname (just | | The fs_locations_info attribute consists of a root pathname (just | |
| like fs_locations), together with an array of fs_locations_item4 | | like fs_locations), together with an array of fs_locations_item4 | |
|
| structures. | | structures. The fs_locations_item4 structures in turn consist of a | |
| | | root pathname together with an array of fs_locations_server4 entries. | |
| | | | |
|
| | | /* | |
| | | * Defines an individual server replica | |
| | | */ | |
| struct fs_locations_server4 { | | struct fs_locations_server4 { | |
| int32_t fls_currency; | | int32_t fls_currency; | |
| opaque fls_info<>; | | opaque fls_info<>; | |
| utf8str_cis fls_server; | | utf8str_cis fls_server; | |
| }; | | }; | |
| | | | |
|
| | | /* | |
| | | * Byte indices of items within fls_info: flag fields, class numbers, | |
| | | * bytes indicating ranks and orders. | |
| | | */ | |
| const FSLI4BX_GFLAGS = 0; | | const FSLI4BX_GFLAGS = 0; | |
| const FSLI4BX_TFLAGS = 1; | | const FSLI4BX_TFLAGS = 1; | |
|
| | | | |
| const FSLI4BX_CLSIMUL = 2; | | const FSLI4BX_CLSIMUL = 2; | |
| const FSLI4BX_CLHANDLE = 3; | | const FSLI4BX_CLHANDLE = 3; | |
| const FSLI4BX_CLFILEID = 4; | | const FSLI4BX_CLFILEID = 4; | |
|
| const FSLI4BX_CLVERIFIER = 5; | | const FSLI4BX_CLWRITEVER = 5; | |
| const FSLI4BX_CHANGE = 6; | | const FSLI4BX_CLCHANGE = 6; | |
| | | const FSLI4BX_CLREADDIR = 7; | |
| | | | |
|
| const FSLI4BX_READRANK = 7; | | const FSLI4BX_READRANK = 8; | |
| const FSLI4BX_WRITERANK = 8; | | const FSLI4BX_WRITERANK = 9; | |
| const FSLI4BX_READORDER = 9; | | const FSLI4BX_READORDER = 10; | |
| const FSLI4BX_WRITEORDER = 10; | | const FSLI4BX_WRITEORDER = 11; | |
| | | | |
|
| | | /* | |
| | | * Bits defined within the general flag byte. | |
| | | */ | |
| const FSLI4GF_WRITABLE = 0x01; | | const FSLI4GF_WRITABLE = 0x01; | |
| const FSLI4GF_CUR_REQ = 0x02; | | const FSLI4GF_CUR_REQ = 0x02; | |
| const FSLI4GF_ABSENT = 0x04; | | const FSLI4GF_ABSENT = 0x04; | |
| const FSLI4GF_GOING = 0x08; | | const FSLI4GF_GOING = 0x08; | |
| const FSLI4GF_SPLIT = 0x10; | | const FSLI4GF_SPLIT = 0x10; | |
| | | | |
|
| | | /* | |
| | | * Bits defined within the transport flag byte. | |
| | | */ | |
| const FSLI4TF_RDMA = 0x01; | | const FSLI4TF_RDMA = 0x01; | |
| | | | |
|
| | | /* | |
| | | * Defines a set of replicas sharing a common value of the root | |
| | | * path within the corresponding single-server namespaces. | |
| | | */ | |
| struct fs_locations_item4 { | | struct fs_locations_item4 { | |
| fs_locations_server4 fli_entries<>; | | fs_locations_server4 fli_entries<>; | |
| pathname4 fli_rootpath; | | pathname4 fli_rootpath; | |
| }; | | }; | |
| | | | |
|
| | | /* | |
| | | * Defines the overall structure of the fs_locations_info attribute. | |
| | | */ | |
| struct fs_locations_info4 { | | struct fs_locations_info4 { | |
| uint32_t fli_flags; | | uint32_t fli_flags; | |
|
| | | int32_t fli_valid_for; | |
| pathname4 fli_fs_root; | | pathname4 fli_fs_root; | |
| fs_locations_item4 fli_items<>; | | fs_locations_item4 fli_items<>; | |
| }; | | }; | |
| | | | |
|
| | | /* | |
| | | * Flag bits in fli_flags. | |
| | | */ | |
| const FSLI4IF_VAR_SUB = 0x00000001; | | const FSLI4IF_VAR_SUB = 0x00000001; | |
| | | | |
| typedef fs_locations_info4 fattr4_fs_locations_info; | | typedef fs_locations_info4 fattr4_fs_locations_info; | |
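
Given the byte-index constants above, a client might extract individual
items from the counted fls_info array as sketched below.  The helper names
are hypothetical, and treating an array that is too short as "item not
provided" is an assumption about extensibility rather than a statement of
this document.

   #include <stdbool.h>
   #include <stdint.h>

   #define FSLI4BX_GFLAGS    0
   #define FSLI4BX_READRANK  8
   #define FSLI4GF_WRITABLE  0x01
   #define FSLI4GF_CUR_REQ   0x02

   struct fls_info {
       uint32_t       len;
       const uint8_t *data;
   };

   /* Byte at the given index, or dflt if the array is too short. */
   static uint8_t fls_byte(const struct fls_info *fi, uint32_t idx, uint8_t dflt)
   {
       return idx < fi->len ? fi->data[idx] : dflt;
   }

   bool replica_is_writable(const struct fls_info *fi)
   {
       return (fls_byte(fi, FSLI4BX_GFLAGS, 0) & FSLI4GF_WRITABLE) != 0;
   }

   bool replica_is_current_one(const struct fls_info *fi)
   {
       return (fls_byte(fi, FSLI4BX_GFLAGS, 0) & FSLI4GF_CUR_REQ) != 0;
   }

   /* Read rank byte, used to order replicas by server preference. */
   uint8_t replica_read_rank(const struct fls_info *fi)
   {
       return fls_byte(fi, FSLI4BX_READRANK, 255);
   }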
| | | | |
|
| The fs_locations_info attribute is structured similarly to the | | | |
| fs_locations attribute. A top-level structure (fs_locations_info4) | | | |
| contains the entire attribute including the root pathname of the fs | | | |
| and an array of lower-level structures that define replicas that | | | |
| share a common root path on their respective servers. The lower- | | | |
| level structure in turn ( fs_locations_item4) contain a specific | | | |
| pathname and information on one or more individual server replicas. | | | |
| For that last lowest-level fs_locations_info has a | | | |
| fs_locations_server4 structure that contains per-server-replica | | | |
| information in addition to the server name. | | | |
| | | | |
| As noted above, the fs_locations_info attribute, when supported, may | | As noted above, the fs_locations_info attribute, when supported, may | |
| be requested of absent file systems without causing NFS4ERR_MOVED to | | be requested of absent file systems without causing NFS4ERR_MOVED to | |
| be returned and it is generally expected that it will be available | | be returned and it is generally expected that it will be available | |
| for both present and absent file systems even if only a single | | for both present and absent file systems even if only a single | |
| fs_locations_server4 entry is present, designating the current | | fs_locations_server4 entry is present, designating the current | |
| (present) file system, or two fs_locations_server4 entries | | (present) file system, or two fs_locations_server4 entries | |
|
| designating the current (and now previous) location of an absent file | | designating the previous location of an absent file system (the one | |
| system and its successor location. Servers are strongly urged to | | just referenced) and its successor location. Servers are strongly | |
| support this attribute on all file systems if they support it on any | | urged to support this attribute on all file systems if they support | |
| file system. | | it on any file system. | |
| | | | |
|
| 11.10.1. The fs_locations_server4 Structure | | The data presented in the fs_locations_info attribute may be obtained | |
| | | by the server in any number of ways, including specification by the | |
| | | administrator or by current protocols for transferring data among | |
| | | replicas and protocols not yet developed. NFS version 4.1 only | |
| | | defines how this information is presented by the server to the | |
| | | client. | |
| | | | |
| | | 11.9.1. The fs_locations_server4 Structure | |
| | | | |
| The fs_locations_server4 structure consists of the following items: | | The fs_locations_server4 structure consists of the following items: | |
| | | | |
| o An indication of file system up-to-date-ness (fls_currency) in | | o An indication of file system up-to-date-ness (fls_currency) in | |
|
| terms of approximate seconds before the present. A negative value | | terms of approximate seconds before the present. This value is | |
| indicates that the server is unable to give any reasonably useful | | relative to the master copy. A negative value indicates that the | |
| value here. A zero indicates that file system is the actual | | server is unable to give any reasonably useful value here. A zero | |
| writable data or a reliably coherent and fully up-to-date copy. | | indicates that the file system is the actual writable data or a | |
| Positive values indicate how out- of-date this copy can normally | | reliably coherent and fully up-to-date copy. Positive values | |
| be before it is considered for update. Such a value is not a | | indicate how out-of-date this copy can normally be before it is | |
| guarantee that such updates will always be performed on the | | considered for update. Such a value is not a guarantee that such | |
| required schedule but instead serve as a hint about how far behind | | updates will always be performed on the required schedule but | |
| the most up-to-date copy of the data, this copy would normally be | | instead serves as a hint about how far the copy of the data would | |
| expected to be. | | be expected to be behind the most up-to-date copy. | |
| | | | |
| o A counted array of one-octet values (fls_info) containing | | o A counted array of one-octet values (fls_info) containing | |
| information about the particular file system instance. This data | | information about the particular file system instance. This data | |
| includes general flags, transport capability flags, file system | | includes general flags, transport capability flags, file system | |
| equivalence class information, and selection priority information. | | equivalence class information, and selection priority information. | |
| The encoding will be discussed below. | | The encoding will be discussed below. | |
| | | | |
| o The server string (fls_server). For the case of the replica | | o The server string (fls_server). For the case of the replica | |
|
| currently being accessed (via GETATTR), a null string may be used | | currently being accessed (via GETATTR), a null string MAY be used | |
| to indicate the current address being used for the RPC call. | | to indicate the current address being used for the RPC call. | |
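
As an illustration of the fls_currency values described in the first item
above, a client-side policy check might look like the following
hypothetical sketch.

   #include <stdbool.h>
   #include <stdint.h>

   /* True when the replica is the writable data or a reliably coherent,
    * fully up-to-date copy. */
   bool replica_fully_current(int32_t fls_currency)
   {
       return fls_currency == 0;
   }

   /* True when the replica is expected to be no more than max_staleness
    * seconds behind the most up-to-date copy.  A negative fls_currency
    * means the server could not provide a useful value, so nothing is
    * assumed. */
   bool replica_fresh_enough(int32_t fls_currency, int32_t max_staleness)
   {
       return fls_currency >= 0 && fls_currency <= max_staleness;
   }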
| | | | |
|
| Data within the fls_info array, is in the form of 8-bit data items | | Data within the fls_info array is in the form of 8-bit data items | |
| with constants giving the offsets within the array of various values | | with constants giving the offsets within the array of various values | |
| describing this particular file system instance. This style of | | describing this particular file system instance. This style of | |
| definition was chosen, in preference to explicit XDR structure | | definition was chosen, in preference to explicit XDR structure | |
|
| definitions |