draft-ietf-nfsv4-migration-issues-11.txt   draft-ietf-nfsv4-migration-issues-12.txt 
NFSv4 D. Noveck, Ed. NFSv4 D. Noveck, Ed.
Internet-Draft Internet-Draft NetApp
Intended status: Informational P. Shivam Intended status: Informational P. Shivam
Expires: August 12, 2017 C. Lever Expires: September 30, 2017 C. Lever
B. Baker B. Baker
ORACLE ORACLE
February 8, 2017 March 29, 2017
NFSv4 migration: Implementation Experience and Specification Issues NFSv4 migration: Implementation Experience and Specification Issues
draft-ietf-nfsv4-migration-issues-11 draft-ietf-nfsv4-migration-issues-12
Abstract Abstract
The migration feature of NFSv4 provides for moving responsibility for The migration feature of NFSv4 provides for moving responsibility for
a single filesystem from one server to another, without disruption to a single filesystem from one server to another, without disruption to
clients. Implementation experience has shown problems in clients. A number of problems in the specification of this feature
specification for this feature in RFC7530. This document explains in NFSv4.0 were resolved by the publication of RFC 7931. In
the choices made to address these issues by updating the NFSv4.0 addition, there are specification issues to be resolved with regard
specification in RFC7931 and those to be made with regard to the to the NFSv4.1 version of this feature which are discussed in this
NFSv4.1 specification, in order to properly address migration. document.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 12, 2017. This Internet-Draft will expire on September 30, 2017.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. NFSv4.0 Implementation Experience . . . . . . . . . . . . . . 3 3. NFSv4.0 Issues and Their Resolution . . . . . . . . . . . . . 3
3.1. Implementation Issues . . . . . . . . . . . . . . . . . . 3 3.1. NFSv4.0 Issues . . . . . . . . . . . . . . . . . . . . . 3
3.1.1. Failure to Free Migrated State on Client Reboot . . . 4 3.2. Resolution of NFSv4.0 Protocol Difficulties . . . . . . . 4
3.1.2. Server Reboots Resulting in a Confused Lease 4. Issues for NFSv4.1 . . . . . . . . . . . . . . . . . . . . . 5
Situation . . . . . . . . . . . . . . . . . . . . . . 4 4.1. Issues to Address for NFSv4.1 . . . . . . . . . . . . . . 5
3.1.3. Client Complexity Issues . . . . . . . . . . . . . . 5 4.1.1. Addressing state merger in NFSv4.1 . . . . . . . . . 6
3.2. Sources of Protocol Difficulties . . . . . . . . . . . . 7 4.1.2. Addressing pNFS relationship with migration . . . . . 7
3.2.1. Issues with nfs_client_id4 Generation and Use . . . . 7 4.1.3. Addressing server_owner changes in NFSv4.1 . . . . . 7
3.2.2. Issues with Lease Proliferation . . . . . . . . . . . 9 4.1.4. Addressing Confirmation Status of Migrated
4. Resolution of NFSv4.0 Protocol Difficulties . . . . . . . . . 9 Client IDs in NFSv4.1 . . . . . . . . . . . . . . . . 8
4.1. Changes Regarding nfs_client_id4 Client-string . . . . . 9 4.1.5. Addressing Session Migration in NFSv4.1 . . . . . . . 9
4.2. Changes Regarding Merged (vs. Synchronized) Leases . . . 10 4.2. Possible Resolutions for NFSv4.1 Issues . . . . . . . . . 9
4.3. Other Changes to Migration-state Sections . . . . . . . . 11 4.2.1. Server Responsibilities in Effecting Transparent
4.3.1. Changes Regarding Client ID Migration . . . . . . . . 12 State Migration . . . . . . . . . . . . . . . . . . . 10
4.3.2. Changes Regarding Callback Re-establishment . . . . . 12 4.2.2. Determining Initial Migration Status in NFSv4.1 . . . 11
4.3.3. NFS4ERR_LEASE_MOVED Rework . . . . . . . . . . . . . 13 4.2.3. Client Response to Migration in NFSv4.1 . . . . . . . 13
4.4. Changes to Other Sections . . . . . . . . . . . . . . . . 13 4.2.4. Dealing with Multiple Location Entries . . . . . . . 13
4.4.1. Need for Additional Changes . . . . . . . . . . . . . 13 4.2.5. Client Recovery from Migration Events . . . . . . . . 15
4.4.2. Callback Update . . . . . . . . . . . . . . . . . . . 14 4.2.6. The Migration Discovery Process . . . . . . . . . . . 18
4.4.3. clientid4 Handling . . . . . . . . . . . . . . . . . 14 4.2.7. Synchronizing Session Transfer . . . . . . . . . . . . 19
4.4.4. Handling of NFS4ERR_CLID_INUSE . . . . . . . . . . . 16 4.2.8. Migration and pNFS . . . . . . . . . . . . . . . . . 22
5. Issues for NFSv4.1 . . . . . . . . . . . . . . . . . . . . . 17 5. Security Considerations . . . . . . . . . . . . . . . . . . . 23
5.1. Addressing state merger in NFSv4.1 . . . . . . . . . . . 17 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23
5.2. Addressing pNFS relationship with migration . . . . . . . 18 7. Normative References . . . . . . . . . . . . . . . . . . . . 23
5.3. Addressing server owner changes in NFSv4.1 . . . . . . . 18 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 23
6. Security Considerations . . . . . . . . . . . . . . . . . . . 19 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 24
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 20
9. Normative References . . . . . . . . . . . . . . . . . . . . 20
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20
1. Introduction 1. Introduction
This document is in the informational category, and while the facts This document, which deals with existing issues/problems in
it reports may have normative implications, any such normative standards-track documents, is in the informational category, and
significance reflects the readers' preferences. For example, we may while the facts it reports may have normative implications, any such
report that the reboot of a client with migrated state results in normative significance reflects the readers' preferences. For
state not being promptly cleared and that this will prevent granting example, we may report that the existing definition of migration for
of conflicting lock requests at least for the lease time, which is a NFSv4.1 does not properly describe how migrating state is to be
fact. While it is to be expected that client and server implementers merged with existing state for the destination server. While it is
will judge this to be a situation that is best avoided, the judgment to be expected that client and server implementers will judge this to
as to how pressing this issue should be considered is a judgment for be a situation that is best avoided, the judgment as to how pressing
the reader, and eventually the nfsv4 working group to make. this issue should be considered is a judgment for the reader, and
eventually the nfsv4 working group to make.
We do explore possible ways in which such issues can be avoided, with We do explore possible ways in which such issues can be avoided, with
minimal negative effects, given that the working group has decided to minimal negative effects, given that the working group has decided to
address these issues, but the choice of exactly how to address these address these issues, but the choice of exactly how to address these
is best given effect in one or more standards-track documents and/or is best given effect in one or more standards-track documents and/or
errata. errata.
This document focuses on NFSv4.0, since that is where the majority of This document focuses on NFSv4.1, since the analogous issues for
implementation experience has been. Nevertheless, there is NFSv4.0 have already been addressed by the publication of [RFC7931].
discussion of the implications of the NFSv4.0 experience for Nevertheless, the history of these issues in NFSv4.0 is presented,
migration in NFSv4.1, as well as discussion of other issues with since understanding the similarities and differences between these
regard to the treatment of migration in NFSv4.1. protocols may be helpful in deciding how best to address remaining
issues.
2. Conventions 2. Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
In the context of this informational document, these normative In the context of this informational document, these normative
keywords will always occur in the context of a quotation, most often keywords will always occur in the context of a quotation, most often
direct but sometimes indirect. The context will make it clear direct but sometimes indirect. The context will make it clear
skipping to change at page 3, line 44 skipping to change at page 3, line 42
protocol [RFC7530]. protocol [RFC7530].
o The current definitive definition of the NFSv4.1 protocol o The current definitive definition of the NFSv4.1 protocol
[RFC5661]. [RFC5661].
o A proposed or possible text to serve as a replacement for the o A proposed or possible text to serve as a replacement for the
current or previous definitive document text. Sometimes, a number current or previous definitive document text. Sometimes, a number
of possible alternative texts may be listed and benefits and of possible alternative texts may be listed and benefits and
detriments of each examined in turn. detriments of each examined in turn.
3. NFSv4.0 Implementation Experience 3. NFSv4.0 Issues and Their Resolution
3.1. Implementation Issues 3.1. NFSv4.0 Issues
Note that the examples below reflect current experience which arises Many of the problems seen with Transparent State Migration derived
from clients implementing the recommendation to use different from the inability of servers to determine whether two client IDs,
nfs_client_id4 id strings for different server addresses, i.e. using issued on different servers, corresponded to the same client. This
what is later referred to herein as the "non-uniform client-string difficulty derived in turn from the common practice, recommended by
approach." [RFC7530], in which each client presented different client
identification strings to different servers, rather than presenting
the same identification string to all servers.
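The difference between the two client string practices can be illustrated with a minimal sketch. The helper names (non_uniform_id, uniform_id, same_client) and the "C-ABC" naming pattern are assumptions made for illustration only; neither [RFC7530] nor [RFC7931] prescribes a particular string layout. Because the id string is opaque to servers, equality is the only comparison available to them.

```python
def non_uniform_id(client_name: str, server_addr: str) -> bytes:
    """Hypothetical non-uniform approach: a distinct id string per server address."""
    return f"{client_name}-{server_addr}".encode()


def uniform_id(client_name: str, server_addr: str) -> bytes:
    """Hypothetical uniform approach: the same id string for every server address."""
    return client_name.encode()


def same_client(id_a: bytes, id_b: bytes) -> bool:
    """A server treats the id string as opaque, so it can only test for equality."""
    return id_a == id_b


if __name__ == "__main__":
    # Non-uniform: the destination server cannot tell these belong to one client.
    print(same_client(non_uniform_id("C", "ABC"), non_uniform_id("C", "XYZ")))
    # Uniform: state arriving from another server can be matched to the client.
    print(same_client(uniform_id("C", "ABC"), uniform_id("C", "XYZ")))
```

With non-uniform strings, state that migrates from one server to another arrives under a string the destination has never associated with the client.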
This is simply because that is the experience implementers have had. This practice, later referred to as the "non-uniform" client string
The reader should not assume that in all cases, this practice is the approach, derived from concern that, since NFSv4.0 provided no means
source of the difficulty. It may be so in some cases but clearly it to determine whether two IP addresses correspond to the same server, a
is not in all cases. single client connected to both might be confused by the fact that
state changes made via one IP address might unexpectedly affect the
state maintained with respect to the second IP address, thought of as
a separate server.
3.1.1. Failure to Free Migrated State on Client Reboot To avoid this unexpected behavior, clients used the non-uniform
client id string approach. By doing so, a client connected to two
different servers (or to two IP addresses connected to the same
server) appeared to be two different clients. Since the server is
under the impression that two different clients are involved, state
changes made on each distinct IP address cannot be reflected on
another.
The following sort of situation has proved troublesome: However, by doing things this way, state migrated from server to
server cannot be referred to the actual client which generated it,
leading to confusion.
o A client C establishes a clientid4 C1 with server ABC specifying In addition to this core problem, the following issues with regard to
an nfs_client_id4 with id string value "C-ABC" and boot verifier Transparent State Migration needed to be addressed:
0x111.
o The client begins to access files in filesystem F on server ABC, o Clarification regarding the ability to merge state from different
resulting in generating stateids S1, S2, etc. under the lease for leases even though their expiration times might not be precisely
clientid C1. It may also access files on other filesystems on the synchronized.
same server.
o The filesystem is migrated from server ABC to server XYZ. When o Clarifying the treatment of client IDs since it is not always
transparent state migration is in effect, stateids S1 and S2 and clear when clientid4 and when nfs_client_id4 was intended.
clientid4 C1 are now available for use by client C at server XYZ.
o Client C reboots and attempts to access data on server XYZ, o Clarifying the logic of returning NFS4ERR_LEASE_MOVED.
whether in filesystem F or another. It does a SETCLIENTID with an
nfs_client_id4 with id string value "C-XYZ" and boot verifier
0x112. There is thus no occasion to free stateids S1 and S2 since
they are associated with a different client name and so lease
expiration is the only way that they can be gotten rid of.
Note here that while it seems clear to us in this example that C-XYZ o Clarifying the handling of NFS4ERR_CLID_INUSE.
and C-ABC are from the same client, the server has no way to
determine the structure of the "opaque" id string. In the protocol,
it really is treated as opaque. Only the client knows which
nfs_client_id4 values designate the same client on a different
server.
3.1.2. Server Reboots Resulting in a Confused Lease Situation 3.2. Resolution of NFSv4.0 Protocol Difficulties
Further problems arise from scenarios like the following. The client string identification issue was addressed in [RFC7931] as
follows:
o Client C talks to server ABC using an nfs_client_id4 id string o Defining both the uniform and non-uniform client id string
such as "C-ABC" and a boot verifier v1. As a result, a lease with approaches as valid choices but indicating that the latter posed
clientid4 c.i is established: {v1, "C-ABC", c.i}. difficulties for Transparent State Migration.
o fs_a1 migrates from server ABC to server XYZ along with its state. o Providing a way that clients could use to determine whether two IP
Now server XYZ also has a lease: {v1, "C-ABC", c.i}. addresses are connected to the same server.
o Server ABC reboots. o Allowing clients using the uniform approach to avoid negative
consequences due to otherwise unexpected behavior since behavior
that is a consequence of known trunking relationships is not
unexpected.
o Client C talks to server ABC using an nfs_client_id4 id string o As a result, servers migrating state are aware of the fact that
such as "C-ABC" and a boot verifier v1. As a result, a lease with the same client is associated with two different items of state
clientid4 c.j is established: {v1, "C-ABC", c.j}. even when that state was originally created on two different
servers.
o fs_a2 migrates from server ABC to server XYZ. Now server XYZ also Since all of the other issues noted in Section 3.1 were also
has a lease: {v1, "C-ABC", c.j}. addressed, publication of [RFC7931] updating [RFC7530] addressed all
known issues with Transparent State Migration in NFSv4.0.
o Now server XYZ has two leases that match {v1, "C-ABC", *}, when 4. Issues for NFSv4.1
the protocol clearly assumes there can be only one.
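The confused lease situation in the scenario with servers ABC and XYZ can be reproduced with a small sketch. The Lease tuple and the function names are hypothetical; the values v1, "C-ABC", c.i and c.j are taken from the scenario. This models only the lookup by {verifier, id string}, not any real server implementation.

```python
from typing import List, NamedTuple


class Lease(NamedTuple):
    verifier: str    # boot verifier presented by the client
    id_string: str   # nfs_client_id4 id string
    clientid4: str   # shorthand clientid4 assigned by the source server


def matching_leases(table: List[Lease], verifier: str, id_string: str) -> List[Lease]:
    """Return every lease matching {verifier, id_string, *}."""
    return [l for l in table if l.verifier == verifier and l.id_string == id_string]


def simulate_xyz() -> List[Lease]:
    """Build the lease table that server XYZ ends up with in the scenario."""
    xyz: List[Lease] = []
    xyz.append(Lease("v1", "C-ABC", "c.i"))  # arrives with fs_a1
    # Server ABC reboots and issues clientid4 c.j to the same client.
    xyz.append(Lease("v1", "C-ABC", "c.j"))  # arrives with fs_a2
    return xyz


if __name__ == "__main__":
    found = matching_leases(simulate_xyz(), "v1", "C-ABC")
    # More than one match: a request keyed only by nfs_client_id4 is ambiguous.
    print([l.clientid4 for l in found])
```

A request that identifies the client only by its nfs_client_id4, such as a SETCLIENTID for callback update, has no way to select one of the two matching records.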
Note that if the client used "C" (rather than "C-ABC") as the 4.1. Issues to Address for NFSv4.1
nfs_client_id4 id string, the exact same situation would arise.
One of the first cases in which this sort of situation has resulted Because NFSv4.1 embraces the uniform client-string approach, as
in difficulties is in connection with doing a SETCLIENTID for advised by section 2.4 of [RFC5661], addressing migration issues is
callback update. simpler, in that a shift in client id string models is not required.
Instead, NFSv4.1 returns information in the EXCHANGE_ID response to
enable trunking relationships to be determined by the client.
The SETCLIENTID for callback update only includes the nfs_client_id4, The other necessary part of addressing migration issues, providing
assuming there can only be one such with a given nfs_client_id4 for the server's merger of leases that relate to the same client, is
value. If there were multiple, confirmed client records with not currently addressed by [RFC5661] and changes need to be made to
identical nfs_client_id4 id string values, there would be no way to make it clear that state needs to be appropriately merged as part of
map the callback update request to the correct client record. Apart migration, to avoid multiple client IDs between a client-server pair.
from the migration handling specified in [RFC7530], such a situation
cannot arise.
3.1.3. Client Complexity Issues In addition, there are a number of new features within NFSv4.1 whose
relationship with migration needs to be clarified. Some examples:
Consider the following situation: o The interaction of trunking with migration and other aspects of
multi-server namespace needs to be clarified.
o There are a set of clients C1 through Cn accessing servers S1 o There needs to be some clarification of how migration, and
through Sm. Each server manages some significant number of particularly Transparent State Migration, should interact with
filesystems with the filesystem count L being significantly pNFS layouts.
greater than m.
o Each client Cx will access a subset of the servers and so will o The current discussion (in [RFC5661]), of the possibility of
have up to m clientids, which we will call Cxy for server Sy. server_owner changes is incomplete and confusing.
o Now assume that for load-balancing or other operational reasons, o The expected confirmation status of client IDs transferred by
numbers of filesystems are migrated among the servers. As a Transparent State Migration needs to be clarified.
result, each client-server pair will have up to m clientids and
each client will have up to m**2 clientids. If we add the
possibility of server reboot, the only bound on a client's
clientid count is L.
Now, instead of a clientid4 identifying a client-server pair, we have o There are a number of issues related to the migration of sessions
many more entities for the client to deal with. In addition, it that need to be addressed.
isn't clear how new state is to be incorporated in this structure.
The limitations of the migrated state (inability to be freed on Discussion of how to resolve these issues will appear in the sections
reboot) would argue against adding more such state but trying to below.
avoid that would run into its own difficulties. For example, a
single lockowner string presented under two different clientids would
appear as two different entities.
Thus we have to choose between: 4.1.1. Addressing state merger in NFSv4.1
o indefinite prolongation of foreign clientids even after all The existing treatment of state transfer in [RFC5661], has similar
transferred state is gone. problems to that in [RFC7530] in that it assumes that the state for
multiple filesystems formerly on different servers will not be merged
so that it appears under a single common client ID. We've already
seen the reasons that this is a problem with regard to NFSv4.0.
o having multiple requests for the same lockowner-string-named Although we don't have the problems stemming from the non-uniform
entity carried on in parallel by separate identically named client-string approach, there are a number of complexities in the
lockowners under different clientid4's existing treatment of state management in the section entitled "Lock
State and File System Transitions" in [RFC5661] that make this non-
trivial to address:
o Adding serialization at the lock-owner string level, in addition o Migration is currently treated together with other sorts of
to that at the lockowner level. filesystem transitions including transitioning between replicas
without any NFS4ERR_MOVED errors.
In any case, we have gone (in adding migration as it was described) o There is separate handling and discussion of the cases of matching
from a situation in which and non-matching server scopes.
o Each client has a single clientid4/lease for each server it talks o In the case of matching server scopes, the text calls for an
to. unrealistic degree of transparency, suggesting that the source and
destination servers need to cooperate in stateid assignment.
o Each client has a single nfs_client_id4 for each server it talks o In the case of non-matching server scopes, the text does not
to. mention the possibility of the transparent migration of state at
all, resulting in a functional regression from NFSv4.0.
o Every state id can be mapped to an associated lease based on the o The potential interaction between migration and trunking has not
server it was obtained from. been addressed.
To one in which o There is insufficient attention to the question of how clients can
deal with the complexities of recovering from migration. As part
of this, the implications of the shift of lease migration
notification from an error (NFS4ERR_LEASE_MOVED in NFSv4.0) to a
status bit (SEQ4_STATUS_LEASE_MOVED in NFSv4.1) need to be
explored.
o Each client may have multiple clientid4's for a single server. To summarize, there is a need for an NFSv4.1 treatment of Transparent
State Migration that is an extension of that in [RFC7931] and that
includes appropriate handling for NFSv4.1 features such as trunking.
o For each stateid, the client must separately record the clientid4 4.1.2. Addressing pNFS relationship with migration
that it is assigned to, or it must manage separate "state blobs"
for each fsid and map those to clientid4's.
o Before doing an operation that can result in a stateid, the client This is made difficult because, within the pNFS framework, migration
must either find a "state blob" based on fsid or create a new one, might mean any of several things:
possibly with a new clientid4.
o There may be multiple clientid4's all connected to the same server o Transfer of the MDS, leaving DS's as they are.
and using the same nfs_client_id4.
This sort of additional client complexity is troublesome and needs to This would be minimally disruptive to those using layouts but
be eliminated. would require the pNFS control protocol being used to support the
DS being directed to a new MDS.
3.2. Sources of Protocol Difficulties o Transfer of a DS, leaving everything else in place.
3.2.1. Issues with nfs_client_id4 Generation and Use Such a transfer can be handled without using migration at all.
The server can recall/revoke layouts, and issue new ones, as
appropriate.
In [RFC7530], the section entitled "Client ID" says: o Transfer of the filesystem to a new filesystem with both MDS and
DS's moving.
The second field, id is a variable length string that uniquely In such a transfer, an entirely different set of DS's will be at
defines the client. the target location. There may even be no pNFS support on the
destination filesystem at all.
There are two possible interpretations of the phrase "uniquely Migration needs to support both the first and last of these models.
defines" in the above:
o The relation between strings and clients is a function from such 4.1.3. Addressing server_owner changes in NFSv4.1
strings to clients so that each string designates a single client.
o The relation between strings and clients is a bijection between Section 2.10.5 of [RFC5661] states the following.
such strings and clients so that each string designates a single
client and each client is named by a single string.
The first interpretation would make these client-strings like phone The client should be prepared for the possibility that
numbers (a single person can have several) while the second would eir_server_owner values may be different on subsequent EXCHANGE_ID
make them like social security numbers. requests made to the same network address, as a result of various
sorts of reconfiguration events. When this happens and the
changes result in the invalidation of previously valid forms of
trunking, the client should cease to use those forms, either by
dropping connections or by adding sessions. For a discussion of
lock reclaim as it relates to such reconfiguration events, see
Section 8.4.2.1.
Debate about the possible meanings of "uniquely defines" in this While this paragraph is literally true in that such reconfiguration
context is quite possible but not very helpful. The following points events can happen and clients have to deal with them, it is confusing
should be noted though: in that it can be read as suggesting that clients have to deal with
them without disruption, which in general is impossible.
o The second interpretation is more consistent with the way A clearer alternative would be:
"uniquely defines" is used elsewhere in the spec.
o The spec as now written intends the first interpretation (or is It is always possible that, as a result of various sorts of
internally inconsistent). In fact, it recommends, although non- reconfiguration events, eir_server_scope and eir_server_owner
normatively, that a single client have at least as many client- values may be different on subsequent EXCHANGE_ID requests made to
strings as server addresses that it interacts with. It says, in the same network address.
the third bullet point regarding construction of the string (which
we shall henceforth refer to as client-string-BP3):
The string should be different for each server network address In most cases such reconfiguration events will be disruptive and
that the client accesses, rather than common to all server indicate that an IP address formerly connected to one server is
network addresses. now connected to an entirely different one.
o If internode interactions are limited to those between a client Some guidelines on client handling of such situations follow:
and its servers, there is no occasion for servers to be concerned
with the question of whether two client-strings designate the same
client, so that there is no occasion for the difference in
interpretation to matter.
o When transparent migration of client state occurs between two o When eir_server_scope changes, the client has no assurance that
servers, it becomes important to determine when state on two any id's it obtained previously (e.g. file handles) can be
different servers is for the same client or not, and this validly used on the new server, and, even if the new server
distinction becomes very important. accepts them, there is no assurance that this is not due to
accident. Thus it is best to treat all such state as lost/
stale although a client may assume that the probability of
inadvertent acceptance is low and treat this situation as
within the next case.
Given the need for the server to be aware of client identity with o When eir_server_scope remains the same and
regard to migrated state, either client-string construction rules eir_server_owner.so_major_id changes, the client can use
will have to change or there will be a need to get around current filehandles it has and attempt reclaims. It may find that
issues, or perhaps a combination of these two will be required. these are now stale but if NFS4ERR_STALE is not received, it
Later sections will examine the options and propose a solution. can proceed to reclaim its opens.
One consideration that may indicate that this cannot remain exactly o When eir_server_scope and eir_server_owner.so_major_id remain
as it has been derives from the fact that the current explanation for the same, the client has to use the now-current values of
this behavior is not correct. In [RFC7530], the section entitled eir_server_owner.so_minor_id in deciding on appropriate forms
"Client ID" says: of trunking.
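The three guidelines for handling changed EXCHANGE_ID results can be sketched as a simple classification. This is an illustrative sketch only: the ServerIdentity record, the Action names, and the classify function are assumptions, not part of [RFC5661]. The fields mirror eir_server_scope and the so_major_id and so_minor_id members of eir_server_owner.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    TREAT_STATE_AS_LOST = 1   # eir_server_scope changed
    ATTEMPT_RECLAIM = 2       # same scope, so_major_id changed
    REEVALUATE_TRUNKING = 3   # same scope and so_major_id


@dataclass(frozen=True)
class ServerIdentity:
    server_scope: bytes   # eir_server_scope
    so_major_id: bytes    # eir_server_owner.so_major_id
    so_minor_id: int      # eir_server_owner.so_minor_id


def classify(old: ServerIdentity, new: ServerIdentity) -> Action:
    """Map a pair of EXCHANGE_ID results to the guideline that applies."""
    if new.server_scope != old.server_scope:
        # Previously obtained ids (e.g. file handles) carry no assurance.
        return Action.TREAT_STATE_AS_LOST
    if new.so_major_id != old.so_major_id:
        # Filehandles may still be usable; reclaim unless NFS4ERR_STALE is seen.
        return Action.ATTEMPT_RECLAIM
    # Use the now-current so_minor_id when deciding on forms of trunking.
    return Action.REEVALUATE_TRUNKING


if __name__ == "__main__":
    before = ServerIdentity(b"scope-a", b"major-1", 1)
    after = ServerIdentity(b"scope-a", b"major-2", 1)
    print(classify(before, after).name)
```

The sketch takes the conservative reading of the first case; a client that assumes inadvertent acceptance is unlikely could instead treat a scope change as the second case.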
The reason is that it may not be possible for the client to tell 4.1.4. Addressing Confirmation Status of Migrated Client IDs in NFSv4.1
if the same server is listening on multiple network addresses. If
the client issues SETCLIENTID with the same id string to each
network address of such a server, the server will think it is the
same client, and each successive SETCLIENTID will cause the server
to begin the process of removing the client's previous leased
state.
In point of fact, a "SETCLIENTID with the same id string" sent to When a client ID is transferred between systems as a part of
multiple network addresses will be treated as all from the same migration, it is not always clear whether it should be considered
client but will not "cause the server to begin the process of confirmed or unconfirmed on the target server. In the case in which
removing the client's previous leased state" unless the server an associated session is transferred together with the client ID, it
believes it is a different instance of the same client, i.e. if the is clear that the transferred client ID needs to be considered
id string is the same and there is a different boot verifier. If the confirmed, as the existence of an associated session is incompatible
client does not reboot, the verifier should not change. If it does with an unconfirmed client ID.
reboot, the verifier will change, and it is appropriate that the
server "begin the process of removing the client's previous leased
state."
The situation of multiple SETCLIENTID requests received by a server The case in which a client ID is transferred without an associated
on multiple network addresses is exactly the same, from the protocol session is less clear-cut and there needs to be a choice between two
design point of view, as when multiple (i.e. duplicate) SETCLIENTID possibilities:
requests are received by the server on a single network address. The
same protocol mechanisms that prevent erroneous state deletion in the
latter case prevent it in the former case. There is no reason for
special handling of the multiple-network-appearance case, in this
regard.
3.2.2. Issues with Lease Proliferation o Consider it unconfirmed, because of the lack of an associated
session. This makes it simpler for the client to determine
whether there is an associated session transferred at the same
time. However, it is inconsistent with the fact that there are
stateids which have been transferred with the client ID.
It is often felt that this is a consequence of the client-string o Consider it confirmed, because it was confirmed on the source
construction issues, and it is certainly the case that the two are server and the transfer is not considered to have affected that.
closely connected in that non-uniform client-strings make it Although this makes it simpler for the client to determine whether
impossible for the server to appropriately combine leases from the there is an associated session transferred at the same time, an
same client. alternative is discussed in Section 4.1.5.
However, even where the server could combine leases from the same A related issue concerns the potential use of the SEQ4_STATUS flags to
client, it needs to be clear how and when it will do so, so that the determine whether all or some of the state present on the source has
client will be prepared. These issues will have to be addressed at been transferred to the destination server. This could be done using
various places in the protocol specification. either of the alternatives above but it is more in the spirit of the
second alternative. One potential use of these flags is discussed in
more detail in Section 4.2.2.
This could be enough only if we are prepared to do away with the 4.1.5. Addressing Session Migration in NFSv4.1
"should" recommending non-uniform client-strings and replace it with
a "should not" or even a "SHOULD NOT". Current client implementation
patterns make this an unpalatable choice for use as a general
solution, but it is reasonable to "RECOMMEND" this choice for a well-
defined subset of clients. One alternative would be to create a way
for the server to infer from client behavior which leases are held by
the same client and use this information to do appropriate lease
mergers. Prototyping and detailed specification work has shown that
this could be done but the resulting complexity is such that a better
choice is to "RECOMMEND" use of the uniform client-string approach
for clients supporting the migration feature.
Because of the discussion of client-string construction in [RFC7530], Some issues that need to be addressed regard the migration of
most existing clients implement the non-uniform client-string sessions, in addition to client IDs and stateids:
approach. As a result, existing servers may not have been tested
with clients implementing uniform client-strings. As a consequence,
care must be taken to preserve interoperability between UCS-capable
clients and servers that don't tolerate uniform client strings for
one reason or another.
4. Resolution of NFSv4.0 Protocol Difficulties o It needs to be made clearer how the client can deal with the
possibility that sessions might or might not be transferred as
part of Transparent State Migration.
This section lists the changes that were necessary to resolve the o Rules need to be clarified regarding possible transfer of sessions
difficulties mentioned above. Such changes, along with other when either the source session is being used to access other file
clarifications found to be desirable during drafting and review are systems on source server or there is already a session connecting
contained in [RFC7931]. the client to the destination server.
4.1. Changes Regarding nfs_client_id4 Client-string o There needs to be more detail regarding how the protocol avoids
situations in which the same session is subject to concurrent
changes on two different servers at the same time.
It was decided to replace client-string-BP3 with the following text: 4.2. Possible Resolutions for NFSv4.1 Issues
The string MAY be different for each server network address that The subsections below explore some ways of dealing with the issues
the client accesses, rather than common to all server network discussed in Section 4.1.
addresses.
In addition, given the importance of the issue of client identity and First we introduce some terminology we will be using in these
the fact that both client string-approaches are to be considered sections:
valid, a greatly expanded treatment of client identity was desirable.
It had the following major elements.
o Fully describing the consequences of making the string different o Location attributes include the fs_locations and fs_locations_info
for each network address (the non-uniform client-string approach) attributes.
and of making it the same for all network addresses (the uniform
client string approach).
o Giving helpful guidance about the factors that might affect client o Location entries are the individual file system locations in the
implementation choice between these approaches. location attributes.
o Describing the compatibility issues that might cause servers to be o Location elements are derived from location entries. If a
incompatible with the uniform approach and give guidance about location entry specifies an IP address there is only a single
dealing with these. corresponding location element. Location entries that contain a
host name are resolved using DNS and may result in one or more
location elements. All location elements consist of a location
address which is the IP address of an interface to a server and an
fs name which is the location of the file system within the
server's pseudo-fs. The fs name is empty if the server has no
pseudo-fs and only a single exported file system at the root
filehandle.
o Describing how a client using the uniform approach might use o Two location elements are trunkable if they specify the same fs
server behavior to determine server address trunking patterns. name and the location addresses are such that trunking of the
location addresses can be used as shown by the server_owner values
returned.
o Presenting a clearer and more complete set of recommendations to 4.2.1. Server Responsibilities in Effecting Transparent State Migration
guide client string construction.
4.2. Changes Regarding Merged (vs. Synchronized) Leases The basic responsibility of the source server in effecting
Transparent State Migration is to make available to the destination
server a description of each piece of locking state associated with
the file system being migrated. In addition to client id string and
verifier, the source server needs to provide, for each stateid:
In [RFC7530], the section entitled "Migration and State" says: o The stateid including the current sequence value.
As part of the transfer of information between servers, leases o The associated client ID.
would be transferred as well. The leases being transferred to the
new server will typically have a different expiration time from
those for the same client, previously on the old server. To
maintain the property that all leases on a given server for a
given client expire at the same time, the server should advance
the expiration time to the later of the leases being transferred
or the leases already present. This allows the client to maintain
lease renewal of both classes without special effort:
There are a number of problems with this and any resolution of our o The handle of the associated file.
difficulties must address them somehow.
o [RFC7530] recommends that the client make it essentially o The type of the lock, such as open, byte-range lock, delegation,
impossible to determine when two leases are from "the same layout.
client".
o It is not appropriate to speak of "maintain[ing] the property that o For locks such as opens and byte-range locks, there will be
all leases on a given server for a given client expire at the same information about the owner(s) of the lock.
time", since this is not a property that holds even in the absence
of migration. A server listening on multiple network addresses
may have the same client appear as multiple clients with no way to
recognize the client as the same.
o Even if the client identity issue could be resolved, advancing the o For recallable/revocable lock types, the current recall status
lease time at the point of migration would not maintain the needs to be included.
desired synchronization property. The leases would be
synchronized until one of them was renewed, after which they would
be unsynchronized again.
To avoid client complexity, we need to have no more than one lease o For each lock type there will be type-specific information, such
between a single client and a single server. This requires merger of as share and deny modes for opens and type and byte ranges for
leases since there is no real help from synchronizing them at a byte-range locks and layouts.
single instant.
For the uniform approach, the destination server would simply merge A further server responsibility concerns locks that are revoked or
leases as part of state transfer, since two leases with the same otherwise lost during the process of file system migration. Because
nfs_client_id4 values must be for the same client. locks that appear to be lost during the process of migration will be
reclaimed by the client, the servers have to take steps to ensure
that locks revoked soon before or soon after migration are not
inadvertently allowed to be reclaimed in situations in which the
continuity of lock possession cannot be assured.
We have made the following decisions as far as proposed normative o For locks lost on the source but whose loss has not yet been
statements regarding state merger. They reflect the facts that acknowledged by the client (by using FREE_STATEID), the
we want to allow full migration support in the simplest way possible destination must be aware of this loss so that it can deny a
and that we can't say MUST since we have older clients and servers to request to reclaim them.
deal with.
o Clients MAY use the uniform client-string approach and are well- o For locks lost on the destination after the state transfer but
advised to do so if they are concerned about getting good before the client's RECLAIM_COMPLETE is done, the destination
migration support. server should note these and not allow them to be reclaimed.
o Servers SHOULD provide automatic lease merger during state A further responsibility of the servers concerns situations in which
migration so that clients using the uniform id approach get the stateid cannot be transferred transparently because it conflicts with
support automatically. an existing stateid held by the client and associated with a
different file system. In this case there are two valid choices:
If servers obey the SHOULD and clients choose to adopt the uniform id o Treat the transfer, as in NFSv4.0, as one without Transparent
approach, having more than a single lease for a given client-server State Migration. In this case, conflicting locks cannot be
pair will be a transient situation, cleaned up as part of adapting to granted until the client does a RECLAIM_COMPLETE, after reclaiming
use of migrated state. the lock it had, with the exception of reclaims denied because
they were attempts to reclaim locks that had been lost.
Since clients and servers will be a mixture of old and new and o Implement Transparent State Migration, except for the lock with
because nothing is a MUST we have to ensure that no combination will the conflicting stateid. In this case, the client will be aware
show worse behavior than is exhibited by current (i.e. old) clients of a lost lock (through the SEQ4_STATUS flags) and be allowed to
and servers. reclaim it.
4.3. Other Changes to Migration-state Sections 4.2.2. Determining Initial Migration Status in NFSv4.1
4.3.1. Changes Regarding Client ID Migration
In [RFC7530], the section entitled "Migration and State" says: This section proposes a way in which a client which receives
NFS4ERR_MOVED can determine:
In the case of migration, the servers involved in the migration of o Whether the NFS4ERR_MOVED indicates migration has occurred, or
a filesystem SHOULD transfer all server state from the original to whether it indicates another sort of file system transition as
the new server. This must be done in a way that is transparent to discussed in Section 4.2.4.
the client. This state transfer will ease the client's transition
when a filesystem migration occurs. If the servers are successful
in transferring all state, the client will continue to use
stateids assigned by the original server. Therefore the new
server must recognize these stateids as valid. This holds true
for the client ID as well. Since responsibility for an entire
filesystem is transferred with a migration event, there is no
possibility that conflicts will arise on the new server as a
result of the transfer of locks.
This poses some difficulties, mostly because the part about "client o In the case of migration, whether Transparent State Migration has
ID" is not clear: occurred.
o It isn't clear what part of the paragraph the "this" in the o Whether any state has been lost during the process of Transparent
statement "this holds true ..." is meant to signify. State Migration.
o The phrase "the client ID" is ambiguous, possibly indicating the o Whether sessions have been transferred as part of Transparent
clientid4 and possibly indicating the nfs_client_id4. State Migration.
o If the text means to suggest that the same clientid4 must be used, This is written assuming that the second option regarding client ID
the logic is not clear since the issue is not the same as for confirmation status after migration (as discussed in Section 4.1.4)
stateids of which there might be many. Adapting to the change of is adopted. However that choice is not essential to the procedure
a single clientid, as might happen as a part of lease migration, and could be changed.
is relatively easy for the client.
We have decided that it is best to address this issue as follows: The process begins by the client examining the location entries using
either of the location attributes. For those whose fs name matches
that currently being used, an EXCHANGE_ID is directed at the location
address and the server_owner and scope used to determine if the entry
is trunkable with that previously being used to access the file
system (i.e. that it represents another path to the same file system
and can share locking state with it). If it is, then this should be
treated as a transition from one set of paths to another, as
described in Section 4.2.4, rather than a migration event.
o Make it clear that both clientid4 and nfs_client_id4 (including Otherwise, if one or more of the EXCHANGE_ID operations above has
both id string and boot verifier) are to be transferred. encountered a distinct server, then migration has occurred and the
procedure continues. If there were no location entries with a
matching fs name, then one with another fs name is selected, an
EXCHANGE_ID is done, and the procedure continues using the result of
that operation.
o Indicate that the initial transfer will result in the same The determination of whether Transparent State Migration has occurred
clientid4 after transfer but this is not guaranteed since there is driven by the client ID returned and its confirmation status.
may conflict with an existing clientid4 on the destination server
and because lease merger can result in a change of the clientid4.
4.3.2. Changes Regarding Callback Re-establishment o If the client ID is an unconfirmed client ID not previously known
to the client, then Transparent State Migration has not occurred.
In [RFC7530], the section entitled "Migration and State" says: o If the client ID is a confirmed client ID previously known to the
client, then any transferred state would have been merged with an
existing client ID representing the client to the destination
server. In this state merger case, Transparent State Migration
might or might not have occurred.
A client SHOULD re-establish new callback information with the new o If the client ID is a confirmed client ID not previously known to
server as soon as possible, according to sequences described in the client, then the client can conclude that the client ID was
sections "Operation 35: SETCLIENTID - Negotiate Client ID" and transferred as part of Transparent State Migration. In this
"Operation 36: SETCLIENTID_CONFIRM - Confirm Client ID". This transferred client ID case, Transparent State Migration has
ensures that server operations are not blocked by the inability to occurred although some state may have been lost.
recall delegations.
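The three-way determination above can be sketched as a small decision function. This is only an illustration, not part of any specification: the names MigrationStatus and classify_client_id are hypothetical, and the two boolean inputs are assumed to have been derived by the client from the EXCHANGE_ID result and its own records. The combination "unconfirmed but previously known" is not covered by the list above, so the sketch reports it as unspecified rather than guessing.

```python
from enum import Enum


class MigrationStatus(Enum):
    """Hypothetical labels for the cases listed in the text."""
    NO_TSM = "Transparent State Migration has not occurred"
    STATE_MERGER = "state merger; TSM might or might not have occurred"
    TRANSFERRED_CLIENT_ID = "client ID transferred; some state may be lost"
    UNSPECIFIED = "combination not covered by the text"


def classify_client_id(confirmed: bool, previously_known: bool) -> MigrationStatus:
    """Map the returned client ID's status to one of the listed cases.

    Both arguments are assumptions about what the client has already
    worked out; how they are obtained is outside this sketch.
    """
    if not confirmed and not previously_known:
        return MigrationStatus.NO_TSM
    if confirmed and previously_known:
        return MigrationStatus.STATE_MERGER
    if confirmed and not previously_known:
        return MigrationStatus.TRANSFERRED_CLIENT_ID
    # Unconfirmed but previously known: the text gives no rule for this.
    return MigrationStatus.UNSPECIFIED
```

A client following this sketch would continue with stateid checking only in the STATE_MERGER case, since that is the one case in which it remains unknown whether Transparent State Migration was attempted.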
The above will need to be fixed to reflect the possibility of merging In the state merger case, it is possible that the server has not
of leases, attempted Transparent State Migration, in which case state may have
been lost without it being reflected in the SEQ4_STATUS bits. To
determine whether this has happened, the client can use TEST_STATEID
to check whether the stateids created on the source server are still
accessible on the destination server. Once a single stateid is found
to have been successfully transferred, the client can conclude that
Transparent State Migration was begun and any failure to transport
all of the stateids will be reflected in the SEQ4_STATUS bits.
4.3.3. NFS4ERR_LEASE_MOVED Rework In any of the cases in which Transparent State Migration has
occurred, it is possible that a session was transferred as well. To
deal with that possibility, clients can, after doing the EXCHANGE_ID,
issue a BIND_CONN_TO_SESSION to connect the transferred session to a
connection to the new server. If that fails, it is an indication
that the session was not transferred and that a new session needs to
be created to take its place.
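The follow-up checks described above (stateid testing in the state merger case, and the attempt to rebind a possibly transferred session) might be sketched as follows. All names here are hypothetical, and the inputs are simplified assumptions: the TEST_STATEID results are modelled as a mapping from a stateid label to a boolean meaning "still valid on the destination", and the BIND_CONN_TO_SESSION outcome as a single boolean.

```python
from typing import Mapping


def tsm_was_attempted(stateid_still_valid: Mapping[str, bool]) -> bool:
    """State merger case: one surviving stateid is enough to conclude
    that Transparent State Migration was begun."""
    return any(stateid_still_valid.values())


def session_action(bind_succeeded: bool) -> str:
    """Decide what to do after trying BIND_CONN_TO_SESSION on the
    destination server with the session used on the source."""
    if bind_succeeded:
        # The session was transferred along with the client ID.
        return "continue using transferred session"
    # Failure indicates the session was not transferred.
    return "create new session"
```

Note that tsm_was_attempted returns False for an empty mapping; a client holding no stateids for the file system has nothing to test and nothing to reclaim.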
In [RFC7530], the section entitled "Notification of Migrated Lease" 4.2.3. Client Response to Migration in NFSv4.1
says:
Upon receiving the NFS4ERR_LEASE_MOVED error, a client that Once the client has determined the initial migration status, it needs
supports filesystem migration MUST probe all filesystems from that to re-establish its lock state, if possible. To enable this to
server on which it holds open state. Once the client has happen without loss of the guarantees normally provided by locking,
successfully probed all those filesystems which are migrated, the the destination server needs to implement a per-fs grace period in
server MUST resume normal handling of stateful requests from that all cases in which lock state was lost, including those in which
client. Transparent State Migration was not implemented.
There is a lack of clarity that is prompted by ambiguity about what The following cases need to be dealt with:
exactly probing is and what the interlock between client and server
must be. This has led to some worry about the scalability of the
probing process, and although the time required does scale linearly
with the number of filesystems that the client may have state for
with respect to a given server, the actual process can be done
efficiently.
To address these issues, the text above had to be rewritten to be o In a case in which Transparent State Migration has not occurred,
more clear and to give suggestions about how to do the required the client can use the per-fs grace period provided by the
scanning efficiently. destination server to reclaim locks that were held on the source
server.
4.4. Changes to Other Sections o In a case in which Transparent State Migration has occurred, and
no lock state was lost (as shown by SEQ4_STATUS flags), no lock
reclaim is necessary.
4.4.1. Need for Additional Changes o In a case in which Transparent State Migration has occurred, and
some lock state was lost (as shown by SEQ4_STATUS flags), existing
stateids need to be checked for validity using TEST_STATEID, and
reclaim used to re-establish any that were not transferred.
There are a number of cases in which certain sections, not For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value
specifically related to migration, require additional clarification. of true should be done before normal use of the file system including
This is generally because text that is clear in a context in which obtaining new locks for the file system. This applies even if no
leases and clientids are created in one place and live there forever locks were lost and needed to be reclaimed.
may need further refinement in the more dynamic environment that
arises as part of migration.
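The client response cases for NFSv4.1 can be summarized in a short sketch. The function name recovery_steps and the step strings are hypothetical and serve only to show the ordering; the two inputs are assumed to be the result of the initial migration status determination. In every case the last step is RECLAIM_COMPLETE with rca_one_fs set to true.

```python
from typing import List


def recovery_steps(tsm_occurred: bool, state_lost: bool) -> List[str]:
    """Return the ordered client actions for one migrated file system.

    state_lost is only meaningful when tsm_occurred is True; it stands
    for the indication given by the SEQ4_STATUS flags.
    """
    steps: List[str] = []
    if not tsm_occurred:
        # No state was transferred: reclaim within the per-fs grace period.
        steps.append("reclaim locks held on source server")
    elif state_lost:
        # Some state was transferred: find out which stateids survived.
        steps.append("TEST_STATEID on existing stateids")
        steps.append("reclaim stateids that were not transferred")
    # Needed in all cases, even when nothing had to be reclaimed.
    steps.append("RECLAIM_COMPLETE(rca_one_fs=true)")
    return steps
```

For example, recovery_steps(True, False) yields only the RECLAIM_COMPLETE step, matching the case in which no lock reclaim is necessary.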
Some examples: 4.2.4. Dealing with Multiple Location Entries
o Some people are under the impression that updating callback The possibility that more than one server address may be present in
endpoint information for an existing client, as used during location attributes requires further clarification. This is
migration, may cause the destination server to free existing particularly the case, given the potential role of trunking for
state. There need to be additions to clarify the situation. NFSv4.1, whose connection to migration needs to be clarified.
o The handling of the sets of clientid4's maintained by each server The description of the location attributes in [RFC5661], while it
needs to be clarified. In particular, the issue of how the client indicates that multiple address entries in these attributes may be
adapts to the presumably independent and uncoordinated clientid4 used to indicate alternate paths to the file system, does so mainly
sets needs to be clearly addressed in the context of replication and does so without mentioning
trunking. The discussion of migration does not discuss the
possibility of multiple location entries or trunking, which we will
explore here.
o Statements regarding handling of invalid clientid4's need to be We will cover cases in which multiple addresses appear directly in
clarified and/or refined in light of the possibilities that arise the attributes as well as those in which the multiple addresses
due to lease motion and merger. result because a single location entry is expanded into multiple
location elements using addresses provided by DNS.
o Confusion and lack of clarity about NFS4ERR_CLID_INUSE. When the set of valid location elements by which a file system may be
accessed changes, migration need not be involved. Some cases to
consider:
4.4.2. Callback Update o When the set of location elements expands, migration is not
involved. In the case in which the additional elements are not
trunkable with ones previously being used, the new elements serve
as additional access locations, available in case of the failure
of server addresses being used. When additional elements are
trunkable with those currently being used the client may use the
additional addresses just as they might have if they had been
available when use of the file system began.
Some changes are necessary to reduce confusion about the process of There is no current mechanism by which the client can be notified
callback information update and in particular to make it clear that of a change in the set of available locations for an fs. Given the
no state is freed as a result: client has at least one IP address available to access the
filesystem in question, periodic polling is an adequate mechanism
for the client to find additional server addresses to use to
access the file system.
o Make it clear that after migration there are confirmed entries for o When the set of location elements contracts but none of the
transferred clientid4/nfs_client_id4 pairs. elements no longer usable were in fact being used by the client,
then no migration is involved. Only if the client were to start
using one of the unavailable elements will the client be notified
(via NFS4ERR_MOVED) of the need to not use those elements and to
use others provided by a location attribute.
o Be explicit in the sections headed "otherwise," in the When a specific server address being used becomes unavailable to
descriptions of SETCLIENTID and SETCLIENTID_CONFIRM, that these service a particular file system, NFS4ERR_MOVED will be returned, and
don't apply in the cases we are concerned about. the client will respond based on the available locations. Whether
continuity of locking state will be available depends on a number of
factors:
4.4.3. clientid4 Handling o If there are still elements in use trunkable with the element that
has become unavailable, there will still be a continuity of
locking state, even though Transparent State Migration per se has
not occurred. If the in-use addresses are session-trunkable with
the address becoming unavailable, only one connection is lost and
all existing sessions will remain available. If, on the other
hand, the in-use addresses are only clientid-trunkable with the
address becoming unavailable, a session can be lost. However,
that session can be made available on those other nodes, just as
it would have been if Transparent State Migration were in
effect, even though no migration has occurred.
To address both of the clientid4-related issues mentioned in o Otherwise, if there are available addresses trunkable with the one
Section 4.4.1, it was necessary to replace the last three paragraphs that has become unavailable, the client has access to existing
of the section entitled "Client ID" with the following: locking state once it establishes a connection with the new
addresses, using a new or existing session depending on the type
of trunking in effect. This is also similar to the case in which
Transparent State Migration has occurred, even though there is no
migration, with the state remaining on the existing server.
Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has Note that this case, as well as the previous one, can be expected
successfully completed, the client uses the shorthand client in the case in which the server seeks to direct traffic with
identifier, of type clientid4, instead of the longer and less regard to particular file systems to chosen addresses, in the
compact nfs_client_id4 structure. This shorthand client interest of load balancing, to adjust to hardware availability
identifier (a client ID) is assigned by the server and should be constraints, or for other reasons.
chosen so that it will not conflict with a client ID previously
assigned by same server. This applies across server restarts or
reboots.
Distinct servers MAY assign clientid4's independently, and will o In other cases, migration has occurred and the client can use the
generally do so. Therefore, a client has to be prepared to deal procedure described in Section 4.2.2 to determine whether
with multiple instances of the same clientid4 value received on Transparent State Migration occurred and whether any locking state
distinct IP addresses, denoting separate entities. When trunking was lost during the transfer.
of server IP addresses is not a consideration, a client should
keep track of (IP-address, clientid4) pairs, so that each pair is
distinct. In the face of possible trunking of server IP
addresses, the client will use the receipt of the same clientid4
from multiple IP-addresses, as an indication that the two IP-
addresses may be trunked and proceed to determine, from the
observed server behavior whether the two addresses are in fact
trunked.
When a clientid4 is presented to a server and that clientid4 is One should note the following differences between migration with
not recognized, the server will reject the request with the error Transparent State Migration and the similar cases in which there is a
NFS4ERR_STALE_CLIENTID. This can occur for a number of reasons: continuity of locking state with no change in the server.
* A server reboot causing loss of the server's knowledge of the o When locks are lost (as indicated when using them or via the
client SEQ4_STATUS flags) and migration has not been done, they are not to
be reclaimed. Instead such losses are treated as lock revocations
and acknowledged using FREE_STATEID.
* Client error sending an incorrect clientid4 or a valid o When migration has not been done, there is no need for a
clientid4 to the wrong server. RECLAIM_COMPLETE (with rca_one_fs set to true).
* Loss of lease state due to lease expiration. 4.2.5. Client Recovery from Migration Events
* Client or server error causing the server to believe that the When a file system is migrated, there are a number of migration-related
client has rebooted (i.e. receiving a SETCLIENTID with an status indications with which clients need to deal:
nfs_client_id4 which has a matching id string and a non-
matching boot verifier).
* Migration of all state under the associated lease causes its o If an attempt is made to use or return a filehandle within a file
non-existence to be recognized on the source server. system that has been migrated away from the server on which it was
previously available, the error NFS4ERR_MOVED is returned.
* Merger of state under the associated lease with another lease This condition continues on subsequent attempts to access the file
under a different clientid causes the clientid4 serving as the system in question. The only way the client can avoid the error
source of the merge to cease being recognized on its server. is to cease accessing the filesystem in question at its old server
location and access it instead on the server to which it has been
migrated.
In the event of a server reboot, or loss of lease state due to o Whenever a SEQUENCE operation is sent by a client to a server
lease expiration, the client must obtain a new clientid4 by use of which generated state held on that client which is associated with
the SETCLIENTID operation and then proceed to any other necessary a file system that has been migrated away from the server on which
recovery for the server reboot case (See the section entitled it was previously available, the status bit
"Server Failure and Recovery"). In cases of server or client SEQ4_STATUS_LEASE_MOVED is set in the response.
error resulting in this error, use of SETCLIENTID to establish a
new lease is desirable as well.
In the last two cases, different recovery procedures are required. This condition continues until the client acknowledges the
Note that in cases in which there is any uncertainty about which notification by fetching a location attribute for the migrated
sort of handling is applicable, the distinguishing characteristic file system. When there are multiple migrated file systems, a
is that in reboot-like cases, the clientid4 and all associated location attribute for each such migrated file system needs to be
stateids cease to exist while in migration-related cases, the fetched, in order to clear the condition. Even after the
clientid4 ceases to exist while the stateids are still valid. condition is cleared, the client needs to respond by using the
location information to access the destination server to ensure
that leases are not needlessly expired.
The client must also employ the SETCLIENTID operation when it Unlike the case of NFSv4.0 in which the corresponding conditions are
receives a NFS4ERR_STALE_STATEID error using a stateid derived both errors, in NFSv4.1 the client can, and often will, receive both
from its current clientid4, since this indicates a situation, such indications on the same request. As a result, the question of how to
as server reboot which has invalidated the existing clientid4 and co-ordinate the necessary recovery actions when both indications
associated stateids (see the section entitled "lock-owner" for arrive simultaneously must be resolved. It should be noted that when
details). the server decides whether SEQ4_STATUS_LEASE_MOVED is to be set, it
has no way of knowing which file system will be referenced or whether
NFS4ERR_MOVED will be returned.
See the detailed descriptions of SETCLIENTID and While it is true that, when only a single migrated file system is
SETCLIENTID_CONFIRM for a complete specification of the involved, a single set of actions will clear both indications, the
operations. possibility of multiple migrated file systems calls for an approach
in which there are separate recovery actions for each indication. In
general, the response to neither indication can be subsumed within
the other since:
4.4.4. Handling of NFS4ERR_CLID_INUSE o If the client were to respond only to the MOVED indication, there
would be no effective client response to a situation in which a
file system was not being actively accessed at the time migration
occurred. As a result, leases on the destination server might be
needlessly expired.
It appears to be the intention that only a single principal be used o If the client were to respond only to the LEASE_MOVED indication,
for client establishment between any client-server pair. However: recovery for migrated file systems in active use could be deferred
in order to accomplish recovery for others not being actively
accessed. The consequences of this choice can pose particular
problems when there are a large number of file systems supported
by a particular server, or when it happens that some servers,
after receiving migrated file systems, have periods of
unavailability, such as occur as a result of server reboot. This
can result in recovery for actively accessed migrated file systems
being unnecessarily delayed for long periods of time.
o There is no explicit statement to this effect. Similar considerations apply to other arrangements in which one of
the indications, while not ignored per se, is subsumed within a
single recovery process focused on recovery for the other indication.
o The error that indicates a principal conflict has a name which Generally speaking, client recovery for these indications should have
does not clarify this issue: NFS4ERR_CLID_INUSE. the following characteristics:
o The definition of the error is also not very helpful: "The o All instances of the MOVED indication should be dealt with
SETCLIENTID operation has found that a client id is already in use promptly, either by doing the necessary recovery directly,
by another client". providing that it be done asynchronously, or ensuring that it is
already under way.
As a result, servers exist which reject a SETCLIENTID simply because o All instances of the LEASE_MOVED indication should be dealt with
there already exists a clientid for the same client, established asynchronously, in a migration discovery thread whose job is to
using a different IP address. Although this is generally understood clear that indication by fetching the appropriate location
to be erroneous, such servers still exist and the spec should make attribute. Because this thread will only be fetching a location
the correct behavior clear. attribute and the fs_status attribute for the file systems
referenced by the client, it cannot receive MOVED indications.
Some useful guidance regarding possible implementation of the
migration discovery thread can be found in Section 4.2.6.
Although the error name cannot be changed, the following changes o When a migration discovery thread happens upon a migrated file
should be made to avoid confusion: system (i.e. not present and not a referral), the thread is likely
to have cleared one (out of an unknown number) of file systems
whose migration needs to be responded to. The discovery thread
needs to schedule the appropriate migration recovery (as described
in Section 4.2.3). This is necessary to ensure that migrated file
systems will be referenced on the destination server in order to
avoid lease expiration.
o The definition of the error should be changed to read as follows: For many of the migrated file systems discovered in this way, the
client has not received any MOVED indication. In such cases,
lease recovery needs to be scheduled but it should not interfere
with continuation of the migration discovery function.
The SETCLIENTID operation has found that the specified o When a migration discovery thread receives a LEASE_MOVED
nfs_client_id4 was previously presented with a different indication, it takes no special action but continues its normal
principal and that client instance currently holds an active operation. On the other hand, if a LEASE_MOVED indication is not
lease. A server MAY return this error if the same principal is received, it indicates that the thread has completed its work
used but a change in authentication flavor gives good reason to successfully.
reject the new SETCLIENTID operation as not bona fide.
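The recovery characteristics listed for the MOVED and LEASE_MOVED indications can be illustrated with a small client-side dispatcher. This is only a sketch under stated assumptions: the class `RecoveryCoordinator`, its fields, and the `on_reply` method are hypothetical names invented for illustration and do not come from the draft or from any implementation. It shows MOVED being queued promptly (unless recovery is already under way) and LEASE_MOVED merely ensuring that a discovery thread is running.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryCoordinator:
    """Hypothetical per-server bookkeeping kept by a client."""
    recovering: set = field(default_factory=set)        # file systems with recovery under way
    recovery_queue: list = field(default_factory=list)  # asynchronous recovery work
    discovery_running: bool = False                     # migration discovery thread active?

    def on_reply(self, fsid, got_moved, lease_moved):
        # MOVED: deal with it promptly, unless recovery for this
        # file system has already been scheduled.
        if got_moved and fsid not in self.recovering:
            self.recovering.add(fsid)
            self.recovery_queue.append(fsid)
        # LEASE_MOVED: handled separately and asynchronously; all that is
        # needed here is to make sure a discovery thread is running.
        if lease_moved and not self.discovery_running:
            self.discovery_running = True
```

Keeping the two indications in separate code paths reflects the point made above that neither response can be subsumed within the other.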
o In the description of SETCLIENTID, the phrase "then the server 4.2.6. The Migration Discovery Process
returns a NFS4ERR_CLID_INUSE error" should be expanded to read
"then the server returns a NFS4ERR_CLID_INUSE error, since use of
a single client with multiple principals is not allowed."
5. Issues for NFSv4.1 As noted above, LEASE_MOVED indications are best dealt with in a
migration discovery thread. Because of this structure,
Because NFSv4.1 embraces the uniform client-string approach, as o No action needs to be taken for such indications received by the
advised by section 2.4 of [RFC5661], addressing migration issues is migration discovery threads, since continuation of that thread's
simpler. work will address the issue.
Nevertheless, there are some issues that will have to be addressed. o For such indications received in other contexts, the generally
Some examples: appropriate response is to initiate or otherwise provide for the
execution of a migration discovery thread for file systems
associated with the server IP address returning the indication.
o The other necessary part of addressing migration issues, providing o In all cases in which the appropriate migration discovery thread
for the server's merger of leases that relate to the same client, is running, nothing further need be done to respond to LEASE_MOVED
is not currently addressed by NFSv4.1 and changes need to be made indications.
to make it clear that state needs to be appropriately merged as
part of migration, to avoid multiple clientids between a client-
server pair.
o There needs to be some clarification of how migration, and This leaves a potential difficulty in situations in which the
particularly transparent state migration, should interact with migration discovery thread is near to completion but is still
pNFS layouts. operating. One should not ignore a LEASE_MOVED indication if the
discovery thread is not able to respond to a migrated file system
without additional aid. A further difficulty in addressing such a
situation is that a LEASE_MOVED indication may reflect the server's
state at the time the SEQUENCE operation was processed, which may be
different from that in effect at the time the response is received.
o The current discussion (in [RFC5661]), of the possibility of A useful approach to this issue involves the use of separate
server_owner changes is incomplete and confusing. externally-visible discovery thread states representing non-
operation, normal operation, and completion/verification of migration
discovery processing.
Discussion of how to resolve these issues will appear in the sections Within that framework, discovery thread processing would proceed as
below. follows.
5.1. Addressing state merger in NFSv4.1 o While in the normal-operation state, the thread would fetch, for
successive file systems known to the client on the server being
worked on, a location attribute plus the fs_status attribute.
The existing treatment of state transfer in [RFC5661], has similar o If the fs_status attribute indicates that the file system is a
problems to that in [RFC7530] in that it assumes that the state for migrated one (i.e. fss_absent is true and fss_type !=
multiple filesystems on different servers will not be merged to so STATUS4_REFERRAL) and thus that it is likely that the fetch of the
that it appears under a single common clientid. We've already seen location attribute has cleared one of the file systems contributing
the reasons that this is a problem, with regard to NFSv4.0. to the LEASE_MOVED indication.
Although we don't have the problems stemming from the non-uniform o In cases in which that happened, the thread cannot know whether
client-string approach, there are a number of complexities in the the LEASE_MOVED indication has been cleared and so it enters the
existing treatment of state management in the section entitled "Lock completion/verification state and proceeds to issue a COMPOUND to
State and File System Transitions" in [RFC5661] that make this non- see if the LEASE_MOVED indication has been cleared.
trivial to address:
o Migration is currently treated together with other sorts of o When the discovery thread is in the completion/verification state,
filesystem transitions including transitioning between replicas if others get a LEASE_MOVED indication they note this fact and it
without any NFS4ERR_MOVED errors. is used when the request completes, as described below.
o There is separate handling and discussion of the cases of matching When the request used in the completion/verification state completes:
and non-matching server scopes.
o In the case of matching server scopes, the text calls for an o If a LEASE_MOVED indication is returned, the discovery thread
impossible degree of transparency. resumes its normal work.
o In the case of non-matching server scopes, the text does not o Otherwise, if there is any record that other requests saw a
mention transparent state migration at all, resulting in a LEASE_MOVED indication, that record is cleared and the
functional regression from NFSV4.0 verification request retried. The discovery thread remains in
completion/verification state.
5.2. Addressing pNFS relationship with migration o If there has been no LEASE_MOVED indication, the work of the
discovery thread is considered completed and it enters the non-
operating state.
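The three discovery thread states and the transitions described in Section 4.2.6 can be sketched as a small state machine. This is a minimal illustration, not a definitive implementation: the names `State`, `DiscoveryThread`, `on_fs_checked`, `note_lease_moved_elsewhere`, and `on_verify_reply` are hypothetical, `STATUS4_REFERRAL` is a placeholder constant rather than the protocol's numeric value, and clearing the record of other requests' LEASE_MOVED indications on entry to the verification state is an assumption not stated in the text.

```python
from enum import Enum, auto

STATUS4_REFERRAL = "referral"  # placeholder, not the on-the-wire value


class State(Enum):
    NON_OPERATING = auto()
    NORMAL = auto()
    VERIFYING = auto()   # completion/verification


def is_migrated(fss_absent, fss_type):
    # A migrated file system is absent and is not a referral.
    return fss_absent and fss_type != STATUS4_REFERRAL


class DiscoveryThread:
    def __init__(self):
        self.state = State.NON_OPERATING
        self.others_saw_lease_moved = False

    def start(self):
        if self.state is State.NON_OPERATING:
            self.state = State.NORMAL

    def on_fs_checked(self, fss_absent, fss_type):
        """Called after fetching a location attribute plus fs_status.

        Returns True when the caller should schedule migration recovery
        and issue the verification COMPOUND.
        """
        if self.state is State.NORMAL and is_migrated(fss_absent, fss_type):
            self.state = State.VERIFYING
            self.others_saw_lease_moved = False  # assumption: start clean
            return True
        return False

    def note_lease_moved_elsewhere(self):
        # Other requests record LEASE_MOVED while verification is pending.
        if self.state is State.VERIFYING:
            self.others_saw_lease_moved = True

    def on_verify_reply(self, lease_moved):
        """Returns 'resume', 'retry', or 'done'."""
        if lease_moved:
            self.state = State.NORMAL
            return "resume"
        if self.others_saw_lease_moved:
            self.others_saw_lease_moved = False
            return "retry"           # remains in VERIFYING
        self.state = State.NON_OPERATING
        return "done"
```

The retry path exists because a LEASE_MOVED indication seen by another request may reflect server state that is newer or older than the verification reply, so a single clean reply is not treated as conclusive while such a record exists.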
This is made difficult because, within the PNFS framework, migration 4.2.7. Synchronizing Session Transfer
might mean any of several things:
o Transfer of the MDS, leaving DS's alone. When transferring state between the source and destination, the
issues discussed in Section 7.2 of [RFC7931] must still be attended
to. In this case, the use of NFS4ERR_DELAY is still necessary in
NFSv4.1, as it was in NFSv4.0, to prevent locking state changing
while it is being transferred.
This would be minimally disruptive to those using layouts but There are a number of important differences in the NFSv4.1 context:
would require the pNFS control protocol to support the DS being
directed to a new MDS.
o Transfer of a DS, leaving everything else in place. o The absence of RELEASE_LOCKOWNER means that the one case in which
an operation could not be deferred by use of NFS4ERR_DELAY no
longer exists.
Such a transfer can be handled without using migration at all. o Sequencing of operations is no longer done using owner-based
The server can recall/revoke layouts, as appropriate. operation sequence numbers. Instead, sequencing is
session-based.
o Transfer of the filesystem to a new filesystem with both MDS and As a result, when sessions are not transferred, the techniques
DS's moving. discussed in [RFC7931] are adequate and will not be further
discussed.
In such a transfer, an entirely different set of DS's will be at When sessions are transferred, there are a number of issues that pose
the target location. There may even be no pNFS support on the challenges since,
destination filesystem at all.
Migration needs to support both the first and last of these models. o A single session may be used to access multiple file systems, not
all of which are being transferred.
5.3. Addressing server owner changes in NFSv4.1 o Requests made on a session, even if rejected, may affect the state
of the session by advancing the sequence number associated with
the slot used.
Section 2.10.5 of [RFC5661] states the following. As a result, when the filesystem state might otherwise be considered
unmodifiable, the client might have any number of in-flight requests,
each of which is capable of changing session state, which may be of a
number of types:
The client should be prepared for the possibility that 1. Those requests that were processed on the migrating file system,
eir_server_owner values may be different on subsequent EXCHANGE_ID before migration began.
requests made to the same network address, as a result of various
sorts of reconfiguration events. When this happens and the
changes result in the invalidation of previously valid forms of
trunking, the client should cease to use those forms, either by
dropping connections or by adding sessions. For a discussion of
lock reclaim as it relates to such reconfiguration events, see
Section 8.4.2.1.
While this paragraph is literally true in that such reconfiguration 2. Those requests which got the error NFS4ERR_DELAY because the file
events can happen and clients have to deal with them, it is confusing system being accessed was in the process of being migrated.
in that it can be read as suggesting that clients have to deal with
them without disruption, which in general is impossible.
A clearer alternative would be: 3. Those requests which got the error NFS4ERR_MOVED because the file
system being accessed had been migrated.
It is always possible that, as a result of various sorts of 4. Those requests that accessed the migrating file system, in order
reconfiguration events, eir_server_scope and eir_server_owner to obtain location or status information.
values may be different on subsequent EXCHANGE_ID requests made to
the same network address.
In most cases such reconfiguration events will be disruptive and 5. Those requests that did not reference the migrating file system.
indicate that an IP address formerly connected to one server is
now connected to an entirely different one.
Some guidelines on client handling of such situations follow: It should be noted that the history of any particular slot is likely
to include a number of these request classes. In the case in which a
session which is migrated is used by filesystems other than the one
migrated, requests of class 5 may be common and be the last request
processed, for many slots.
* When eir_server_scope changes, the client has no assurance that Since session state can change even after the locking state has been
any id's it obtained previously (e.g. file handles) can be fixed as part of the migration process, the session state known to
validly used on the new server, and, even if the new server the client could be different from that on the destination server,
accepts them, there is no assurance that this is not due to which necessarily reflects the session state on the source server, at
accident. Thus it is best to treat all such state as lost/ an earlier time. In deciding how to deal with this situation, it is
stale although a client may assume that the probability of helpful to distinguish between two sorts of behavioral consequences
inadvertent acceptance is low and treat this situation as of the choice of initial sequence ID values.
within the next case.
* When eir_server_scope remains the same and o The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID
eir_server_owner.so_major_id changes, the client can use in a request is neither equal to the last one seen for the current
filehandles it has and attempt reclaims. It may find that slot nor the next greater one.
these are now stale but if NFS4ERR_STALE is not received, he
can proceed to reclaim his opens.
* When eir_server_scope and eir_server_owner.so_major_id remain In view of the difficulty of arriving at a mutually acceptable
the same, the client has to use the now-current values of value for the correct last sequence at the point of migration, it
eir_server-owner.so_minor_id in deciding on appropriate forms may be necessary for the server to show some degree of
of trunking. forbearance, when the sequence ID is one that would be considered
unacceptable if session migration were not involved.
6. Security Considerations o Returning the cached reply for a previously executed request when
the sequence ID in the request matches the last value recorded for
the slot.
In the cases in which an error is returned and there is no
possibility of any non-idempotent operation having been executed,
it may not be necessary to adhere to this as strictly as might be
proper if session migration were not involved. For example, the
fact that the error NFS4ERR_DELAY was returned may not assist the
client in any material way, while the fact that NFS4ERR_MOVED was
returned by the source server may not be relevant when the request
was reissued, directed to the destination server.
One part of adapting to these sorts of issues would be to defer
enforcement of normal slot sequence semantics until the client
itself, by issuing a request using a particular slot on the
destination server, has established the new starting sequence for
that slot on the migrated session.
An important issue is that the specification needs to take note of
all potential COMPOUNDs, even if they might be unlikely in practice.
For example, a COMPOUND is allowed to access multiple file systems and
might perform non-idempotent operations in some of them before
accessing a file system being migrated. Also, a COMPOUND may return
considerable data in the response, before being rejected with
NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as
sa_cachethis.
Some possibilities that need to be considered to address the issues:
o Do not enforce any sequencing semantics for a particular slot
until the client has established the starting sequence for that
slot on the destination server.
o For each slot, do not return a cached reply returning
NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established
the starting sequence for that slot on the destination server.
o Until the client has established the starting sequence for a
particular slot on the destination server, do not report
NFS4ERR_SEQ_MISORDERED or return a cached reply returning
NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of
a series of operations where the response is NFS4_OK until the
final error.
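The last of these possibilities can be sketched from the destination server's point of view. This is a simplified illustration under several assumptions: `MigratedSlot`, `cached_status`, and `check` are hypothetical names; the cached reply is reduced to its final status only (the condition that all earlier operations returned NFS4_OK is not modeled); and sequence ID wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Optional

RELAXED_ERRORS = ("NFS4ERR_DELAY", "NFS4ERR_MOVED")


@dataclass
class MigratedSlot:
    last_seqid: int                # value transferred from the source server
    cached_status: Optional[str]   # final status of the transferred cached reply
    established: bool = False      # has the client set the starting sequence here?

    def check(self, seqid):
        """Return 'execute', 'replay', or 'misordered' for a new request."""
        if self.established:
            # Normal slot semantics apply once the client has established
            # the starting sequence on the destination server.
            if seqid == self.last_seqid + 1:
                self.last_seqid = seqid
                self.cached_status = None  # caller records the new reply
                return "execute"
            if seqid == self.last_seqid:
                return "replay"
            return "misordered"
        # Not yet established: replay only replies that are worth replaying.
        if seqid == self.last_seqid and self.cached_status not in RELAXED_ERRORS:
            return "replay"
        # Otherwise show forbearance: accept the client's sequence ID as the
        # new starting point instead of reporting NFS4ERR_SEQ_MISORDERED.
        self.last_seqid = seqid
        self.cached_status = None
        self.established = True
        return "execute"
```

With this approach a request retried against the destination with the same sequence ID that earlier drew NFS4ERR_DELAY or NFS4ERR_MOVED from the source is executed rather than answered from the cache.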
4.2.8. Migration and pNFS
When pNFS is involved, migration is capable of supporting:
o Migration of the MDS, leaving DS's in place.
o Migration of the file system as a whole, including the MDS and
associated DS's.
o Replacement of one DS by another.
o Migration of a pNFS file system to one in which pNFS is not used.
o Migration of a file system not using pNFS to one in which layouts
are available.
Migration of the MDS function is directly supported by Transparent
State Migration. Layout state will normally be transparently
transferred, just as other state is. As a result, Transparent State
Migration provides a framework in which, given appropriate inter-MDS
data transfer, one MDS can be substituted for another.
Migration of the file system function can be accomplished by
recalling all layouts as part of the initial phase of the migration
process. As a result, IO will be done through the MDS during the
migration process, and new layouts can be granted once the client is
interacting with the new MDS. An MDS can also effect this sort of
transition by revoking all layouts as part of Transparent State
Migration, as long as the client is notified about the loss of state.
In order to allow migration to a file system on which pNFS is not
supported, clients need to be prepared for a situation in which
layouts are not available or supported on the destination file system
and be prepared to direct IO requests to the destination server,
rather than depending on layouts being available.
Replacement of one DS by another is not addressed by migration as
such but can be effected by an MDS recalling layouts for the DS to be
replaced and issuing new ones to be served by the successor DS.
Migration may transfer a file system from a server which does not
support pNFS to one which does. In order to properly adapt to this
situation, clients which support pNFS, but function adequately in its
absence, should check for pNFS support when a file system is migrated
and be prepared to use pNFS when support is available.
5. Security Considerations
With regard to NFSv4.0, the Security Considerations section of With regard to NFSv4.0, the Security Considerations section of
[RFC7530] encourages clients to protect the integrity of the SECINFO [RFC7530] encourages clients to protect the integrity of the SECINFO
operation, any GETATTR operation for the fs_locations attribute. A operation, any GETATTR operation for the fs_locations attribute. A
needed change is to include the operations SETCLIENTID/ needed change is to include the operations SETCLIENTID/
SETCLIENTID_CONFIRM as among those for which integrity protection is SETCLIENTID_CONFIRM as among those for which integrity protection is
recommended. A migration recovery event can use any or all of these recommended. A migration recovery event can use any or all of these
operations. operations.
With regard to NFSv4.1, the Security Considerations section of With regard to NFSv4.1, the Security Considerations section of
[RFC5661] takes proper care of migration-related issues. No change [RFC5661] takes proper care of migration-related issues. No change
is needed. is needed.
7. IANA Considerations 6. IANA Considerations
This document does not require actions by IANA. This document does not require actions by IANA.
8. Acknowledgements 7. Normative References
The editor and authors of this document gratefully acknowledge the
contributions of Trond Myklebust of NetApp and Robert Thurlow of
Oracle. We also thank Tom Haynes of NetApp and Spencer Shepler of
Microsoft for their guidance and suggestions.
Special thanks go to members of the Oracle Solaris NFS team,
especially Rick Mesta and James Wahlig, for their work implementing
an NFSv4.0 migration prototype and identifying many of the issues
documented here.
9. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>. <http://www.rfc-editor.org/info/rfc2119>.
[RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
<http://www.rfc-editor.org/info/rfc5661>. <http://www.rfc-editor.org/info/rfc5661>.
[RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System [RFC7530] Haynes, T., Ed. and D. Noveck, Ed., "Network File System
(NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
March 2015, <http://www.rfc-editor.org/info/rfc7530>. March 2015, <http://www.rfc-editor.org/info/rfc7530>.
[RFC7931] Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker, [RFC7931] Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
"NFSv4.0 Migration: Specification Update", RFC 7931, "NFSv4.0 Migration: Specification Update", RFC 7931,
DOI 10.17487/RFC7931, July 2016, DOI 10.17487/RFC7931, July 2016,
<http://www.rfc-editor.org/info/rfc7931>. <http://www.rfc-editor.org/info/rfc7931>.
Appendix A. Acknowledgements
The editor and authors of this document gratefully acknowledge the
contributions of Trond Myklebust of NetApp and Robert Thurlow of
Oracle. We also thank Tom Haynes of Primary Data and Spencer Shepler
of Microsoft for their guidance and suggestions.
Special thanks go to members of the Oracle Solaris NFS team,
especially Rick Mesta and James Wahlig, for their work implementing
an NFSv4.0 migration prototype and identifying many of the issues
documented here.
Authors' Addresses Authors' Addresses
David Noveck (editor) David Noveck (editor)
NetApp
26 Locust Avenue 26 Locust Avenue
Lexington, MA 02421 Lexington, MA 02421
US US
Phone: +1 781 572 8038 Phone: +1 781 572 8038
Email: davenoveck@gmail.com Email: davenoveck@gmail.com
Piyush Shivam Piyush Shivam
Oracle Corporation Oracle Corporation
5300 Riata Park Ct. 5300 Riata Park Ct.
 End of changes. 197 change blocks. 
642 lines changed or deleted 800 lines changed or added
