draft-ietf-nfsv4-rfc3530-migration-update-00.txt   draft-ietf-nfsv4-rfc3530-migration-update-01.txt 
NFSv4 D. Noveck, Ed. NFSv4 D. Noveck, Ed.
Internet-Draft EMC Internet-Draft EMC
Updates: 3530 (if approved) P. Shivam Updates: 3530 (if approved) P. Shivam
Intended status: Standards Track C. Lever Intended status: Standards Track C. Lever
Expires: May 18, 2013 B. Baker Expires: August 19, 2013 B. Baker
ORACLE ORACLE
November 14, 2012 February 15, 2013
NFSv4.0 migration: Specification Update NFSv4.0 migration: Specification Update
draft-ietf-nfsv4-rfc3530-migration-update-00 draft-ietf-nfsv4-rfc3530-migration-update-01
Abstract Abstract
The migration feature of NFSv4 allows for responsibility for a single The migration feature of NFSv4 allows for responsibility for a single
filesystem to move from one server to another, without disruption to filesystem to move from one server to another, without disruption to
clients. Recent implementation experience has shown problems in the clients. Recent implementation experience has shown problems in the
existing specification for this feature in NFSv4.0. This document existing specification for this feature in NFSv4.0. This document
clarifies and corrects the NFSv4.0 specification (RFC3530 and clarifies and corrects the NFSv4.0 specification (RFC3530 and
possible successors) to address these problems. possible successors) to address these problems.
skipping to change at page 1, line 38 skipping to change at page 1, line 38
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 18, 2013. This Internet-Draft will expire on August 19, 2013.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 27 skipping to change at page 2, line 27
4.3. Server Release of Client ID . . . . . . . . . . . . . . . 9 4.3. Server Release of Client ID . . . . . . . . . . . . . . . 9
4.4. client id string Approaches . . . . . . . . . . . . . . . 10 4.4. client id string Approaches . . . . . . . . . . . . . . . 10
4.5. Non-Uniform client id string Approach . . . . . . . . . . 12 4.5. Non-Uniform client id string Approach . . . . . . . . . . 12
4.6. Uniform client id string Approach . . . . . . . . . . . . 13 4.6. Uniform client id string Approach . . . . . . . . . . . . 13
4.7. Mixing client id string Approaches . . . . . . . . . . . . 14 4.7. Mixing client id string Approaches . . . . . . . . . . . . 14
4.8. Trunking Determination when using Uniform client id 4.8. Trunking Determination when using Uniform client id
strings . . . . . . . . . . . . . . . . . . . . . . . . . 16 strings . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.9. Client id string construction details . . . . . . . . . . 21 4.9. Client id string construction details . . . . . . . . . . 21
5. Locking and Multi-Server Namespace . . . . . . . . . . . . . . 22 5. Locking and Multi-Server Namespace . . . . . . . . . . . . . . 22
5.1. Changes from Replaced Sections . . . . . . . . . . . . . . 22 5.1. Changes from Replaced Sections . . . . . . . . . . . . . . 22
5.2. Lock State and File System Transitions . . . . . . . . . . 22 5.2. Lock State and Filesystem Transitions . . . . . . . . . . 22
5.3. Migration and State . . . . . . . . . . . . . . . . . . . 23 5.3. Migration and State . . . . . . . . . . . . . . . . . . . 23
5.4. Replication and State . . . . . . . . . . . . . . . . . . 25 5.3.1. Migration and clientid's . . . . . . . . . . . . . . . 24
5.5. Notification of Migrated Lease . . . . . . . . . . . . . . 25 5.3.2. Migration and state owner information . . . . . . . . 25
5.6. Migration and the Lease_time Attribute . . . . . . . . . . 28 5.4. Replication and State . . . . . . . . . . . . . . . . . . 28
6. Additional Changes . . . . . . . . . . . . . . . . . . . . . . 28 5.5. Notification of Migrated Lease . . . . . . . . . . . . . . 29
6.1. Summary of Additional Changes from Previous Documents . . 28 5.6. Migration and the Lease_time Attribute . . . . . . . . . . 31
6.2. NFS4ERR_CLID_INUSE definition . . . . . . . . . . . . . . 29 6. Additional Changes . . . . . . . . . . . . . . . . . . . . . . 32
6.3. Operation 35: SETCLIENTID - Negotiate Client ID . . . . . 29 6.1. Summary of Additional Changes from Previous Documents . . 32
6.4. Security Considerations revision . . . . . . . . . . . . . 33 6.2. NFS4ERR_CLID_INUSE definition . . . . . . . . . . . . . . 32
7. Security Considerations . . . . . . . . . . . . . . . . . . . 33 6.3. Operation 35: SETCLIENTID - Negotiate Client ID . . . . . 33
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33 6.4. Security Considerations revision . . . . . . . . . . . . . 37
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 33 7. Security Considerations . . . . . . . . . . . . . . . . . . . 37
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 34 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37
10.1. Normative References . . . . . . . . . . . . . . . . . . . 34 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 37
10.2. Informative References . . . . . . . . . . . . . . . . . . 34 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 34 10.1. Normative References . . . . . . . . . . . . . . . . . . . 37
10.2. Informative References . . . . . . . . . . . . . . . . . . 38
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 38
1. Introduction 1. Introduction
This document is a standards track document which corrects the This document is a standards track document which corrects the
existing definitive specification of the NFSv4.0 protocol, in existing definitive specification of the NFSv4.0 protocol, in
[RFC3530] and the one expected to become definitive (now in [RFC3530] and the one expected to become definitive (now in
[cur-rfc3530-bis]). Given this fact, one should take the current [cur-rfc3530-bis]). Given this fact, one should take the current
document into account when learning about NFSv4.0, particularly if document into account when learning about NFSv4.0, particularly if
one is concerned with issues that relate to: one is concerned with issues that relate to:
skipping to change at page 3, line 34 skipping to change at page 3, line 34
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
3. Background 3. Background
Implementation experience with transparent state migration has Implementation experience with transparent state migration has
exposed a number of problems with the existing specification of this exposed a number of problems with the existing specification of this
feature, in [RFC3530] and in RFC3530bis (see the draft at feature, in [RFC3530] and in RFC3530bis (see the draft at
[cur-rfc3530-bis]). The symptoms were: [cur-rfc3530-bis]). The symptoms were:
o After migration of a file system, a reboot of the associated o After migration of a filesystem, a reboot of the associated client
client was not appropriately dealt with, in that the state was not appropriately dealt with, in that the state associated
associated with the rebooting client was not promptly freed. with the rebooting client was not promptly freed.
o Situations can arise whereby a given server has multiple leases o Situations can arise whereby a given server has multiple leases
with the same nfs_client_id4 (id and verifier), when the protocol with the same nfs_client_id4 (id and verifier), when the protocol
clearly assumes there can be only one. clearly assumes there can be only one.
o Excessive client implementation complexity since clients have to o Excessive client implementation complexity since clients have to
deal with situations in which a single client can wind up with its deal with situations in which a single client can wind up with its
locking state with a given server divided among multiple leases locking state with a given server divided among multiple leases
each with its own clientid4. each with its own clientid4.
skipping to change at page 9, line 46 skipping to change at page 9, line 46
stateids are still valid. stateids are still valid.
The client must also employ the SETCLIENTID operation when it The client must also employ the SETCLIENTID operation when it
receives a NFS4ERR_STALE_STATEID error using a stateid derived from receives a NFS4ERR_STALE_STATEID error using a stateid derived from
its current clientid4, since this indicates a situation, such as its current clientid4, since this indicates a situation, such as
server reboot which has invalidated the existing clientid4 and server reboot which has invalidated the existing clientid4 and
associated stateids (see the section entitled "lock-owner" for associated stateids (see the section entitled "lock-owner" for
details). details).
See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
for a complete specification of the operations. for a complete specification of these operations.
4.3. Server Release of Client ID 4.3. Server Release of Client ID
If the server determines that the client holds no associated state If the server determines that the client holds no associated state
for its clientid4, the server may choose to release that clientid4. for its clientid4, the server may choose to release that clientid4.
The server may make this choice for an inactive client so that The server may make this choice for an inactive client so that
resources are not consumed by those intermittently active clients. resources are not consumed by those intermittently active clients.
If the client contacts the server after this release, the server must If the client contacts the server after this release, the server must
ensure the client receives the appropriate error so that it will use ensure the client receives the appropriate error so that it will use
the SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new the SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new
skipping to change at page 13, line 50 skipping to change at page 13, line 50
o In [RFC3530], the client is told to change the boot verifier when o In [RFC3530], the client is told to change the boot verifier when
reboot occurs, but there is no explicit statement as to the reboot occurs, but there is no explicit statement as to the
converse, so that any requirement to keep the verifier constant converse, so that any requirement to keep the verifier constant
unless rebooting is only present by implication. unless rebooting is only present by implication.
o Many existing clients change the boot verifier every time they o Many existing clients change the boot verifier every time they
destroy and recreate the data structure that tracks an <IP- destroy and recreate the data structure that tracks an <IP-
address, clientid4> pair. This might happen if the last mount of address, clientid4> pair. This might happen if the last mount of
a particular server is removed, and then a fresh mount is created. a particular server is removed, and then a fresh mount is created.
And, note that this might result in each <IP-address, clientid4> Also, note that this might result in each <IP-address, clientid4>
pair having its own boot verifier that is independent of the pair having its own boot verifier that is independent of the
others. others.
o Within the uniform client id string approach, an nfs_client_id4 o Within the uniform client id string approach, an nfs_client_id4
designates a globally known client instance, so that the boot designates a globally known client instance, so that the boot
verifier should change if and only if a new client instance is verifier should change if and only if a new client instance is
created, typically as a result of a reboot. created, typically as a result of a reboot.
The following are advantages for the implementation of using the The following are advantages for the implementation of using the
uniform client id string approach: uniform client id string approach:
skipping to change at page 15, line 49 skipping to change at page 15, line 49
uniform client id string approach uniform client id string approach
One effective way for clients to handle this is to support the One effective way for clients to handle this is to support the
uniform client id string approach as the default, but allow a mount uniform client id string approach as the default, but allow a mount
option to specify use of the non-uniform client id string approach option to specify use of the non-uniform client id string approach
for particular mount points, as long as such mount points are not for particular mount points, as long as such mount points are not
used when migration is to be supported. used when migration is to be supported.
In the case in which the same server has multiple mounts, and both In the case in which the same server has multiple mounts, and both
approaches are specified for the same server, the client could have approaches are specified for the same server, the client could have
multiple clientids corresponding to the same server, one for each multiple clientid's corresponding to the same server, one for each
approach and would then have to keep these separate. approach and would then have to keep these separate.
4.8. Trunking Determination when using Uniform client id strings 4.8. Trunking Determination when using Uniform client id strings
This section provides an example of how trunking determination could This section provides an example of how trunking determination could
be done by a client following the uniform client id string approach be done by a client following the uniform client id string approach
(whether this is used for all mounts or not). Clients need not (whether this is used for all mounts or not). Clients need not
follow this procedure but implementers should make sure that the follow this procedure but implementers should make sure that the
issues dealt with by this procedure are all properly addressed. issues dealt with by this procedure are all properly addressed.
skipping to change at page 18, line 46 skipping to change at page 18, line 46
server as a new IP address to be added to an existing set of IP server as a new IP address to be added to an existing set of IP
addresses for that server. Otherwise, it will be recognized as a addresses for that server. Otherwise, it will be recognized as a
new server. At the point at which this determination is made, the new server. At the point at which this determination is made, the
unresolved indication is cleared and any suspended SETCLIENTID unresolved indication is cleared and any suspended SETCLIENTID
processing is restarted. processing is restarted.
So for each lead IP address IPn with a clientid4 matching XC, the So for each lead IP address IPn with a clientid4 matching XC, the
following steps are done. following steps are done.
o If the principal for IPn does not match that for X, the IP address o If the principal for IPn does not match that for X, the IP address
is skipped, since it is impossible or IPn and X to be trunked in is skipped, since it is impossible for IPn and X to be trunked in
these circumstances. If the principal does match but the these circumstances. If the principal does match but the
authentication flavor does not, the authentication flavor already authentication flavor does not, the authentication flavor already
used should be used for address X as well. This will avoid any used should be used for address X as well. This will avoid any
possibility that NFS4ERR_CLID_INUSE will be returned for the possibility that NFS4ERR_CLID_INUSE will be returned for the
SETCLIENTID and SETCLIENTID_CONFIRM to be done below, as long as SETCLIENTID and SETCLIENTID_CONFIRM to be done below, as long as
the server(s) at IP addresses IPn and X are correctly implemented. the server(s) at IP addresses IPn and X are correctly implemented.
o A SETCLIENTID is done to update the callback parameters to reflect o A SETCLIENTID is done to update the callback parameters to reflect
the possibility that X will be marked as associated with the the possibility that X will be marked as associated with the
server whose lead IP address is IPn. The specific callback server whose lead IP address is IPn. The specific callback
skipping to change at page 22, line 33 skipping to change at page 22, line 33
o Adding text to address the case of stateid conflict on migration. o Adding text to address the case of stateid conflict on migration.
o Specifying that when leases are moved, as a result of filesystem o Specifying that when leases are moved, as a result of filesystem
migration, they are to be merged with leases on the destination migration, they are to be merged with leases on the destination
server that are connected to the same client. server that are connected to the same client.
o Adding text that deals with the case of a clientid4 being changed o Adding text that deals with the case of a clientid4 being changed
on state transfer as a result of conflict with an existing on state transfer as a result of conflict with an existing
clientid4. clientid4.
o Adding a section describing how information associated with
openowners and lockowners is to be managed with regard to
migration.
o The description of handling of the NFS4ERR_LEASE_MOVED has been o The description of handling of the NFS4ERR_LEASE_MOVED has been
rewritten for greater clarity. rewritten for greater clarity.
5.2. Lock State and File System Transitions 5.2. Lock State and Filesystem Transitions
When responsibility for handling a given filesystem is transferred to When responsibility for handling a given filesystem is transferred to
a new server (migration) or the client chooses to use an alternate a new server (migration) or the client chooses to use an alternate
server (e.g., in response to server unresponsiveness) in the context server (e.g., in response to server unresponsiveness) in the context
of filesystem replication, the appropriate handling of state shared of filesystem replication, the appropriate handling of state shared
between the client and server (i.e., locks, leases, stateids, and between the client and server (i.e., locks, leases, stateids, and
client IDs) is as described below. The handling differs between client IDs) is as described below. The handling differs between
migration and replication. migration and replication.
If a server replica or a server immigrating a filesystem agrees to, If a server replica or a server immigrating a filesystem agrees to,
skipping to change at page 23, line 30 skipping to change at page 23, line 34
If transferring stateids from server to server would result in a If transferring stateids from server to server would result in a
conflict for an existing stateid for the destination server with the conflict for an existing stateid for the destination server with the
existing client, transparent state migration MUST NOT happen for that existing client, transparent state migration MUST NOT happen for that
client. Servers participating in using transparent state migration client. Servers participating in using transparent state migration
should co-ordinate their stateid assignment policies to make this should co-ordinate their stateid assignment policies to make this
situation unlikely or impossible. The means by which this might be situation unlikely or impossible. The means by which this might be
done, like all of the inter-server interactions for migration, are done, like all of the inter-server interactions for migration, are
not specified by the NFS version 4.0 protocol. not specified by the NFS version 4.0 protocol.
Handling of clientid values is similar but not identical. The A client may determine the disposition of migrated state by using a
clientid4 and nfs_client_id4 information (id string and boot stateid associated with the migrated state on the new server.
o If the stateid is not valid and an error NFS4ERR_BAD_STATEID is
received, either transparent state migration has not occurred or
the state was purged due to boot verifier mismatch.
o If the stateid is valid, transparent state migration has occurred.
Since responsibility for an entire filesystem is transferred with a
migration event, there is no possibility that conflicts will arise on
the destination server as a result of the transfer of locks.
The servers may choose not to transfer the state information upon
migration. However, this choice is discouraged, except where
specific issues such as stateid conflicts make it necessary. In the
case of migration without state transfer, when the client presents
state information from the original server (e.g. in a RENEW op or a
READ op of zero length), the client must be prepared to receive
either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new
server. The client should then recover its state information as it
normally would in response to a server failure. The new server must
take care to allow for the recovery of state information as it would
in the event of server restart.
In those situations in which state has not been transferred, as shown
by a return of NFS4ERR_BAD_STATEID, the client may attempt to reclaim
locks in order to take advantage of cases in which the destination
server has set up a file-system-specific grace period in support of
the migration.
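
The client-side decision described above can be sketched as follows. This is only an illustrative sketch: the status names are those used in the text, but the Disposition type and the classify_migrated_state function are hypothetical names introduced here and are not part of the protocol.

```python
from enum import Enum


class Disposition(Enum):
    # Hypothetical labels for the outcomes discussed in the text.
    TRANSFERRED = "transparent state migration has occurred"
    NOT_TRANSFERRED = "state not transferred or purged; client may try reclaim"
    RECOVER = "recover state as after a server failure"


def classify_migrated_state(status: str) -> Disposition:
    """Map the status returned when a migrated stateid is first used
    on the new server to the disposition the text describes."""
    if status == "NFS4_OK":
        # The stateid is valid, so state was transferred.
        return Disposition.TRANSFERRED
    if status == "NFS4ERR_BAD_STATEID":
        # Either no transparent migration took place, or the state was
        # purged because of a boot verifier mismatch.
        return Disposition.NOT_TRANSFERRED
    if status in ("NFS4ERR_STALE_CLIENTID", "NFS4ERR_STALE_STATEID"):
        # Migration without state transfer: recover as for server failure.
        return Disposition.RECOVER
    raise ValueError("status not covered by this sketch: " + status)
```

A client might call such a function on the result of a RENEW or a zero-length READ sent to the new server, and then choose between continuing normally, attempting to reclaim locks, and running its usual server-failure recovery.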
5.3.1. Migration and clientid's
Handling of clientid values is similar to that for stateids.
However, there are some differences that derive from the fact that a
clientid is an object which spans multiple filesystems while a
stateid is inherently limited to a single filesystem.
The clientid4 and nfs_client_id4 information (id string and boot
verifier) will be transferred with the rest of the state information verifier) will be transferred with the rest of the state information
and the destination server should use that information to determine and the destination server should use that information to determine
appropriate clientid4 handling. Although the destination server may appropriate clientid4 handling. Although the destination server may
make state stored under an existing lease available under the make state stored under an existing lease available under the
clientid4 used on the source server, the client should not assume clientid4 used on the source server, the client should not assume
that this is always so. In particular, that this is always so. In particular,
o If there is an existing lease with an nfs_client_id4 that matches o If there is an existing lease with an nfs_client_id4 that matches
a migrated lease (same id string and boot verifier), the server a migrated lease (same id string and boot verifier), the server
SHOULD merge the two, making the union of the sets of stateids SHOULD merge the two, making the union of the sets of stateids
skipping to change at page 24, line 13 skipping to change at page 25, line 6
verifiers are not ordered, the later lease renewal time will verifiers are not ordered, the later lease renewal time will
prevail. prevail.
o If the destination server already has the transferred clientid4 in o If the destination server already has the transferred clientid4 in
use for another purpose, it is free to substitute a different use for another purpose, it is free to substitute a different
clientid4 and associate that with the transferred nfs_client_id4. clientid4 and associate that with the transferred nfs_client_id4.
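
As a rough illustration of the two cases visible in the list above (merging on a full nfs_client_id4 match, and substituting a clientid4 that is already in use), a destination server might proceed along the following lines. The Lease type, the place_migrated_lease function, and the new_clientid4 callback are hypothetical names used only for this sketch. Cases in which the id string matches but the boot verifier differs are deliberately not handled here.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    id_string: bytes   # id portion of nfs_client_id4
    verifier: bytes    # boot verifier portion of nfs_client_id4
    stateids: set = field(default_factory=set)


def place_migrated_lease(leases, clientid4, migrated, new_clientid4):
    """Install a migrated lease on the destination server.

    leases        -- dict mapping clientid4 -> Lease (destination state)
    clientid4     -- the clientid4 the lease had on the source server
    migrated      -- the Lease being transferred
    new_clientid4 -- assumed to return a clientid4 not currently in use
    Returns the clientid4 under which the state is now available.
    """
    for cid, lease in leases.items():
        if lease.id_string == migrated.id_string:
            if lease.verifier == migrated.verifier:
                # Same id string and boot verifier: merge the leases,
                # making the union of the stateid sets available.
                lease.stateids |= migrated.stateids
                return cid
            raise NotImplementedError("verifier mismatch is not covered here")
    if clientid4 in leases:
        # The transferred clientid4 is in use for another purpose, so
        # a different one is substituted.
        clientid4 = new_clientid4()
    leases[clientid4] = migrated
    return clientid4
```

Because the returned clientid4 may differ from the one used on the source server, the sketch is consistent with the statement that the client should not assume the clientid4 is preserved.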
When leases are not merged, the transfer of state should result in When leases are not merged, the transfer of state should result in
creation of a confirmed client record with empty callback information creation of a confirmed client record with empty callback information
but matching the {v, x, c} with v and x derived from the transferred but matching the {v, x, c} with v and x derived from the transferred
client information and c chosen by the destination server. This client information and c chosen by the destination server.
should enable establishment of new callback information using
SETCLIENTID and SETCLIENTID_CONFIRM. The client can determine the In such cases, the client SHOULD re-establish new callback
information with the new server as soon as possible, according to
sequences described in sections "Operation 35: SETCLIENTID -
Negotiate Client ID" and "Operation 36: SETCLIENTID_CONFIRM - Confirm
Client ID". This ensures that server operations are not delayed due
to an inability to recall delegations. The client can determine the
new clientid (the value c) from the response to SETCLIENTID. new clientid (the value c) from the response to SETCLIENTID.
A client may determine the disposition of migrated state by using a The client can use its own information about leases with the
stateid associated with the migrated state on the new server. destination server to see if lease merger should have happened. When
there is any ambiguity, the client MAY use the above procedure to set
the proper callback information and find out, as part of the process,
the correct value of its clientid with respect to the server in
question.
o If the stateid is not valid and an error NFS4ERR_BAD_STATEID is 5.3.2. Migration and state owner information
received, either transparent state migration has not occurred or
the state was purged due to boot verifier mismatch.
o If the stateid is valid, transparent state migration has occurred. In addition to stateids, the locks they represent, and clientid
information, servers also need to transfer information related to the
current status of openowners and lockowners.
Since responsibility for an entire filesystem is transferred with a This information includes:
migration event, there is no possibility that conflicts will arise on
the destination server as a result of the transfer of locks.
The servers may choose not to transfer the state information upon o The sequence number of the last operation associated with the
migration. However, this choice is discouraged, except where particular owner.
specific issues such as stateid conflicts make it necessary. In the
case of migration without state transfer, when the client presents
state information from the original server (e.g. in a RENEW op or a
READ op of zero length), the client must be prepared to receive
either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new
server. The client should then recover its state information as it
normally would in response to a server failure. The new server must
take care to allow for the recovery of state information as it would
in the event of server restart.
When a lease is transferred to a new server (as opposed to being o Information regarding the results of the last operation,
merged with a lease already on the new server), a client SHOULD re- sufficient to allow reissued operations to be correctly responded
establish new callback information with the new server as soon as to.
possible, according to sequences described in sections "Operation 35:
SETCLIENTID - Negotiate Client ID" and "Operation 36:
SETCLIENTID_CONFIRM - Confirm Client ID". This ensures that server
operations are not delayed due to an inability to recall delegations.
In those situations in which state has not been transferred, as shown When clients are implemented to isolate each openowner and lockowner
by a return of NFS4ERR_BAD_STATEID, the client may attempt to reclaim to a particular filesystem, the server may transfer this information
locks in order to take advantage of cases in which the destination together with the lock state. The owner ceases to exist on the
server has set up a file-system-specific grace period in support of source server and is reconstituted on the destination server.
the migration.
Note that when servers take this approach for all owners whose state
is limited to the particular filesystem being migrated, doing so will
not cause difficulties for clients not adhering to an approach in
which owners are isolated to particular filesystems. As long as the
client recognizes the loss of transferred state, the protocol allows
the owner in question to disappear and the client may have to deal
with an owner confirmation request.
When migration occurs and the source server discovers an owner whose
state spans both the migrated filesystem and other filesystems,
it MAY decline to transfer the associated state. In this case, use
of the associated stateids on the destination server will result in
NFS4ERR_BAD_STATEID while their use on the source server will result
in NFS4ERR_STALE_STATEID. Also, the owner will remain on the source
server and that server needs to ensure that a request reissue
associated with the migrated filesystem is not inappropriately acted
on. For example, a reissued OPEN for a file on the migrated
filesystem should not return a stateid that has been made invalid on
the server returning it, as a consequence of the migration event.
The source server MAY choose to migrate state associated with owners
that span multiple filesystems. In such cases, it needs to propagate
the owner sequence value to the destination server, while retaining
it on the source server, as long as there exists state associated
with the owner. When owner information is propagated in this way,
source and destination servers start with the same owner sequence
value which is then updated independently, as the client makes owner-
related requests to the servers. Note that each server will have
some period in which the associated sequence value for an owner is
identical to the one transferred as part of migration. At those
times, when a server receives a request with a matching owner
sequence value, it MUST NOT respond with the associated stored
response if the associated filesystem is not, when the reissued
request is received, part of the set of filesystems handled by that
server.
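The replay restriction above can be illustrated with a minimal sketch.
The names CachedReply, replay_lookup, and hosted_fsids are hypothetical
and are not taken from the specification. The sketch assumes a server
that keeps one cached reply per owner and knows the set of filesystems
it currently handles.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class CachedReply:
    seqid: int   # owner sequence value of the last request processed
    fsid: str    # filesystem that request applied to (hypothetical id)
    reply: str   # stored response, kept for retransmissions


def replay_lookup(cached: CachedReply, request_seqid: int,
                  hosted_fsids: Set[str]) -> Optional[str]:
    """Return the stored reply for a reissued request, or None when the
    stored reply must not be used."""
    if request_seqid != cached.seqid:
        return None   # not a reissue of the cached request
    if cached.fsid not in hosted_fsids:
        return None   # filesystem no longer handled here: MUST NOT replay
    return cached.reply
```

How the server answers when None is returned (for example, which error
it reports) is outside the scope of this sketch.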
One sort of case may require more complex handling. When multiple
filesystems are migrated, in sequence, to a specific destination
server, an owner may be migrated to a destination server, on which it
was already present, leading to the issue of how the resident owner
information and that being newly migrated are to be reconciled.
If filesystem migration encounters a situation where owner
information needs to be merged, it MAY decline to transfer such
state, even if it chooses to handle other cases in which locks for a
given owner are spread among multiple filesystems.
As a way of understanding the situations which need to be addressed
when owner information needs to be merged, consider the following
scenario:
o There is client C and two servers X and Y. There are two
clientids designating C, which we refer to as CX and CY.
o Initially server X supports filesystems F1, F2, F3, and F4. These
will be migrated, one-at-a-time, to server Y.
o While these migrations are proceeding, the client makes locking
requests for filesystems F1 through F4 on behalf of owner O (either
a lockowner or an openowner), with each request going to X or Y
depending on where the relevant filesystem is being supported at
the time the request is made.
o Once the first migration event occurs, client C will maintain two
instances for owner O, one for each server.
o It is always possible that C may make a request of server X
relating to owner O, and before receiving a response, find the
target filesystem has moved to Y, and need to re-issue the request
to server Y.
o At the same time, C may make a request of server Y relating to
owner O, and this too may encounter a lost-response situation.
As a result of such situations, the server needs to provide support
for dealing with retransmission of owner-sequenced requests that
diverges from the typical model in which there is support for
retransmission of replies only for a request whose sequence value
exactly matches the last one sent. Such support only needs to be
provided for requests issued before the migration event whose status
as the last by sequence is invalidated by the migration event.
When servers do support such merger of owner information on the
destination server, the following rules are to be adhered to:
o When an owner sequence value is propagated to a destination server
where it already exists, the resulting sequence value is to be the
greater of the one present on the destination server and the one
being propagated as part of migration.
o In the event that an owner sequence value on a server represents a
request applying to a filesystem currently present on the server,
it is not to be rendered invalid simply because that sequence
value is changed as a result of owner information propagation as
part of filesystem migration. Instead, it is retained until it
can be deduced that the client in question has received the reply.
As a result of the operation of these rules, there are three ways in
which we can have more reply data than what is typically present,
i.e. data for a single request per owner whose sequence is the last
one received, where the next sequence to be used is one beyond that.
o When the owner sequence value for a migrating filesystem is
greater than the corresponding value on the destination server,
the last request for the owner in effect at the destination server
needs to be retained, even though it is no longer one less than the
next sequence to be received.
o When the owner sequence value for a migrating filesystem is less
than the corresponding value on the destination server, the last
request for the owner in effect on the migrating filesystem needs
to be retained, even though it is no longer one less than the next
sequence to be received.
o When the owner sequence value for a migrating filesystem is equal
to the corresponding value on the destination server, one has two
different "last" requests which both must be retained. The next
sequence value to be used is one beyond the sequence value shared
by these two requests.
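The merge rules and the three resulting cases can be sketched as
follows. This is an illustration under stated assumptions, not a
definitive implementation: OwnerState, merge_owner, and next_seqid are
hypothetical names, reply data is modeled as (seqid, fsid, reply)
tuples, and wraparound of the sequence value is ignored.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OwnerState:
    seqid: int   # last owner sequence value received
    # Retained replies as (seqid, fsid, reply) tuples; normally just one.
    replies: List[Tuple[int, str, str]] = field(default_factory=list)


def merge_owner(resident: OwnerState, incoming: OwnerState) -> OwnerState:
    """Merge migrated owner state into state already on the destination."""
    # The resulting sequence value is the greater of the two.
    merged_seqid = max(resident.seqid, incoming.seqid)
    # Keep the "last" reply from both sides, whichever sequence value wins.
    return OwnerState(merged_seqid, resident.replies + incoming.replies)


def next_seqid(state: OwnerState) -> int:
    """Next sequence value to be used (wraparound ignored in this sketch)."""
    return state.seqid + 1
```

Because both reply lists are concatenated, the same code covers the
greater, lesser, and equal cases; only the winning sequence value
differs.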
Here are some guidelines as to when servers can drop such additional
reply data which is created as part of owner information migration.
o The server SHOULD NOT drop this information simply because it
receives a new sequence value for the owner in question, since
that request may have been issued before the client was aware of
the migration event.
o The server SHOULD drop this information if it receives a new
sequence value for the owner in question and the request relates
to the same filesystem.
o The server SHOULD drop the part of this information that relates
to non-migrated filesystems, if it receives a new sequence value
for the owner in question and the request relates to a non-
migrated filesystem.
o The server MAY drop this information when it receives a new
sequence value for the owner in question a considerable period of
time (more than one or two lease periods) after the migration
occurs.
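One possible reading of these guidelines is sketched below. The names
RetainedReply and replies_to_keep are hypothetical, migrated_fsids is
assumed to be the set of filesystems that arrived by migration, and the
cutoff of two lease periods is an assumed choice within the "one or two
lease periods" range given above.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class RetainedReply:
    seqid: int
    fsid: str    # filesystem the retained request applied to
    reply: str


def replies_to_keep(retained: List[RetainedReply], request_fsid: str,
                    migrated_fsids: Set[str], elapsed: float,
                    lease_period: float) -> List[RetainedReply]:
    """Return the retained replies that survive a request carrying a new
    sequence value for the owner."""
    if elapsed > 2 * lease_period:
        return []   # MAY drop everything once enough time has passed
    request_migrated = request_fsid in migrated_fsids
    keep = []
    for r in retained:
        same_fs = r.fsid == request_fsid
        both_resident = (not request_migrated) and r.fsid not in migrated_fsids
        if not (same_fs or both_resident):
            keep.append(r)   # a new sequence value alone is not enough
    return keep
```

A server is free to be more conservative than this sketch, since the
guidelines are stated with SHOULD and MAY rather than MUST.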
5.4. Replication and State 5.4. Replication and State
Since client switch-over in the case of replication is not under Since client switch-over in the case of replication is not under
server control, the handling of state is different. In this case, server control, the handling of state is different. In this case,
leases, stateids and client IDs do not have validity across a leases, stateids and client IDs do not have validity across a
transition from one server to another. The client must re-establish transition from one server to another. The client must re-establish
its locks on the new server. This can be compared to the re- its locks on the new server. This can be compared to the re-
establishment of locks by means of reclaim-type requests after a establishment of locks by means of reclaim-type requests after a
server reboot. The difference is that the server has no provision to server reboot. The difference is that the server has no provision to
skipping to change at page 25, line 29 skipping to change at page 29, line 13
or to defer the latter. Thus, a client re-establishing a lock on the or to defer the latter. Thus, a client re-establishing a lock on the
new server (by means of a LOCK or OPEN request), may have the new server (by means of a LOCK or OPEN request), may have the
requests denied due to a conflicting lock. Since replication is requests denied due to a conflicting lock. Since replication is
intended for read-only use of filesystems, such denial of locks intended for read-only use of filesystems, such denial of locks
should not pose large difficulties in practice. When an attempt to should not pose large difficulties in practice. When an attempt to
re-establish a lock on a new server is denied, the client should re-establish a lock on a new server is denied, the client should
treat the situation as if its original lock had been revoked. treat the situation as if its original lock had been revoked.
5.5. Notification of Migrated Lease 5.5. Notification of Migrated Lease
A file system can be migrated to another server while a client that A filesystem can be migrated to another server while a client that
has state related to that filesystem is not actively submitting has state related to that filesystem is not actively submitting
requests to it. In this case, the migration is reported to the requests to it. In this case, the migration is reported to the
client during lease renewal. Lease renewal can occur either client during lease renewal. Lease renewal can occur either
explicitly via a RENEW operation, or implicitly when the client explicitly via a RENEW operation, or implicitly when the client
performs a lease-renewing operation on another file system on that performs a lease-renewing operation on another filesystem on that
server. server.
In order for the client to schedule renewal of leases that may have In order for the client to schedule renewal of leases that may have
been relocated to the new server, the client must find out about been relocated to the new server, the client must find out about
lease relocation before those leases expire. Similarly, when lease relocation before those leases expire. Similarly, when
migration occurs but there has not been transparent state migration, migration occurs but there has not been transparent state migration,
the client needs to find out about the change soon enough to be able the client needs to find out about the change soon enough to be able
to reclaim the lock within the destination server's grace period. To to reclaim the lock within the destination server's grace period. To
accomplish this, all operations which implicitly renew leases for a accomplish this, all operations which implicitly renew leases for a
client (such as OPEN, CLOSE, READ, WRITE, RENEW, LOCK, and others), client (such as OPEN, CLOSE, READ, WRITE, RENEW, LOCK, and others),
skipping to change at page 26, line 36 skipping to change at page 30, line 20
large numbers of filesystems is described below. This approach large numbers of filesystems is described below. This approach
divides the process into two phases, one devoted to finding the divides the process into two phases, one devoted to finding the
migrated filesystems and the second devoted to doing the necessary migrated filesystems and the second devoted to doing the necessary
GETATTRs. GETATTRs.
The client can find the migrated filesystems by building and issuing The client can find the migrated filesystems by building and issuing
one or more COMPOUND requests, each consisting of a set of PUTFH/ one or more COMPOUND requests, each consisting of a set of PUTFH/
GETFH pairs, each pair using an fh in one of the filesystems in GETFH pairs, each pair using an fh in one of the filesystems in
question. All such COMPOUND requests can be done in parallel. The question. All such COMPOUND requests can be done in parallel. The
successful completion of such a request indicates that none of the successful completion of such a request indicates that none of the
fs's interrogated have been migrated while termination with filesystems interrogated have been migrated while termination with
NFS4ERR_MOVED indicates that the filesystem getting the error has NFS4ERR_MOVED indicates that the filesystem getting the error has
migrated while those interrogated before it in the same COMPOUND have migrated while those interrogated before it in the same COMPOUND have
not. Those whose interrogation follows the error remain in an not. Those whose interrogation follows the error remain in an
uncertain state and can be interrogated by restarting the requests uncertain state and can be interrogated by restarting the requests
from after the point at which NFS4ERR_MOVED was returned or by from after the point at which NFS4ERR_MOVED was returned or by
issuing a new set of COMPOUND requests for the filesystems which issuing a new set of COMPOUND requests for the filesystems which
remain in an uncertain state. remain in an uncertain state.
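The restart-after-error variant of this search can be sketched as
follows. The function find_migrated and the probe callback are
hypothetical: probe is assumed to send one COMPOUND of PUTFH/GETFH
pairs for the given file handles and to return the index of the pair
that terminated with NFS4ERR_MOVED, or None if the COMPOUND completed
successfully. Issuing several COMPOUNDs in parallel is not shown.

```python
from typing import Callable, List, Optional, Sequence


def find_migrated(filehandles: Sequence[str],
                  probe: Callable[[Sequence[str]], Optional[int]]) -> List[str]:
    """Return the file handles whose filesystems have migrated."""
    migrated = []
    start = 0
    while start < len(filehandles):
        moved_at = probe(filehandles[start:])
        if moved_at is None:
            break   # everything interrogated in this request is still here
        # Handles before moved_at are known good; this one has moved.
        migrated.append(filehandles[start + moved_at])
        # Handles after the error are uncertain: restart just past it.
        start += moved_at + 1
    return migrated
```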
Once the migrated filesystems have been found, all that is needed is Once the migrated filesystems have been found, all that is needed is
for the client to give evidence to the server that it is aware of the for the client to give evidence to the server that it is aware of the
skipping to change at page 27, line 32 skipping to change at page 31, line 16
of those leases on the new server. If the server has not had state of those leases on the new server. If the server has not had state
transferred to it transparently, the client will receive either transferred to it transparently, the client will receive either
NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server, NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server,
as described above. The client can then recover state information as as described above. The client can then recover state information as
it does in the event of server failure. it does in the event of server failure.
Aside from recovering from a migration, there are other reasons a Aside from recovering from a migration, there are other reasons a
client may wish to retrieve fs_locations information from a server. client may wish to retrieve fs_locations information from a server.
When a server becomes unresponsive, for example, a client may use When a server becomes unresponsive, for example, a client may use
cached fs_locations data to discover an alternate server hosting the cached fs_locations data to discover an alternate server hosting the
same fs data. A client may periodically request fs_locations data same filesystem data. A client may periodically request fs_locations
from a server in order to keep its cache of fs_locations data fresh. data from a server in order to keep its cache of fs_locations data
fresh.
Since a GETATTR(fs_locations) operation would be used for refreshing Since a GETATTR(fs_locations) operation would be used for refreshing
cached fs_locations data, a server could mistake such a request as cached fs_locations data, a server could mistake such a request as
indicating recognition of an NFS4ERR_LEASE_MOVED condition. indicating recognition of an NFS4ERR_LEASE_MOVED condition.
Therefore a compound which is not intended to signal that a client Therefore a compound which is not intended to signal that a client
has recognized a migrated lease SHOULD be prefixed with a guard has recognized a migrated lease SHOULD be prefixed with a guard
operation which fails with NFS4ERR_MOVED if the file handle being operation which fails with NFS4ERR_MOVED if the file handle being
queried is no longer present on the server. The guard can be as queried is no longer present on the server. The guard can be as
simple as a GETFH operation. simple as a GETFH operation.
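The effect of the guard can be shown with a toy model. Both functions
below are hypothetical, and run_compound is a simplified stand-in for a
server that assumes only that processing of a COMPOUND stops at the
first failing operation and that GETFH fails with NFS4ERR_MOVED for a
file handle whose filesystem is no longer present.

```python
from typing import List, Set, Tuple


def build_refresh_compound(fh: str) -> List[Tuple[str, str]]:
    """COMPOUND for refreshing cached fs_locations data, with GETFH
    acting as the guard ahead of GETATTR(fs_locations)."""
    return [("PUTFH", fh), ("GETFH", ""), ("GETATTR", "fs_locations")]


def run_compound(ops: List[Tuple[str, str]],
                 moved_fhs: Set[str]) -> List[Tuple[str, str]]:
    """Toy server: evaluate operations in order, stopping at the first error."""
    results = []
    current = None
    for name, arg in ops:
        if name == "PUTFH":
            current = arg
        elif name == "GETFH" and current in moved_fhs:
            results.append((name, "NFS4ERR_MOVED"))
            break   # GETATTR(fs_locations) is never reached
        results.append((name, "NFS4_OK"))
    return results
```

In this model a migrated file handle ends the COMPOUND at the guard, so
the server never sees a GETATTR(fs_locations) that it could mistake for
recognition of a migrated lease.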
Though unlikely, it is possible that the target of such a compound Though unlikely, it is possible that the target of such a compound
could be migrated in the time after the guard operation is executed could be migrated in the time after the guard operation is executed
on the server but before the GETATTR(fs_locations) operation is on the server but before the GETATTR(fs_locations) operation is
encountered. When a client issues a GETATTR(fs_locations) operation encountered. When a client issues a GETATTR(fs_locations) operation
as part of a compound not intended to signal recognition of a as part of a compound not intended to signal recognition of a
migrated lease, it SHOULD be prepared to process fs_locations data in migrated lease, it SHOULD be prepared to process fs_locations data in
the reply that shows the current location of the fs is gone. the reply that shows the current location of the filesystem is gone.
5.6. Migration and the Lease_time Attribute 5.6. Migration and the Lease_time Attribute
In order that the client may appropriately manage its leases in the In order that the client may appropriately manage its leases in the
case of migration, the destination server must establish proper case of migration, the destination server must establish proper
values for the lease_time attribute. values for the lease_time attribute.
When state is transferred transparently, that state should include When state is transferred transparently, that state should include
the correct value of the lease_time attribute. The lease_time the correct value of the lease_time attribute. The lease_time
attribute on the destination server must never be less than that on attribute on the destination server must never be less than that on
skipping to change at page 34, line 11 skipping to change at page 37, line 37
9. Acknowledgements 9. Acknowledgements
The editor and authors of this document gratefully acknowledge the The editor and authors of this document gratefully acknowledge the
contributions of Trond Myklebust of NetApp and Robert Thurlow of contributions of Trond Myklebust of NetApp and Robert Thurlow of
Oracle. We also thank Tom Haynes of NetApp and Spencer Shepler of Oracle. We also thank Tom Haynes of NetApp and Spencer Shepler of
Microsoft for their guidance and suggestions. Microsoft for their guidance and suggestions.
Special thanks go to members of the Oracle Solaris NFS team, Special thanks go to members of the Oracle Solaris NFS team,
especially Rick Mesta and James Wahlig, for their work implementing especially Rick Mesta and James Wahlig, for their work implementing
an NFSv4.0 migration prototype and identifying many of the issues an NFSv4.0 migration prototype and identifying many of the issues
documented here. addressed here.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3530] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., [RFC3530] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
Beame, C., Eisler, M., and D. Noveck, "Network File System Beame, C., Eisler, M., and D. Noveck, "Network File System
skipping to change at page 34, line 35 skipping to change at page 38, line 16
[RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS [RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
Version 3 Protocol Specification", RFC 1813, June 1995. Version 3 Protocol Specification", RFC 1813, June 1995.
[RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File [RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File
System (NFS) Version 4 Minor Version 1 Protocol", System (NFS) Version 4 Minor Version 1 Protocol",
RFC 5661, January 2010. RFC 5661, January 2010.
[cur-rfc3530-bis] [cur-rfc3530-bis]
Haynes, T., Ed. and D. Noveck, Ed., "Network File System Haynes, T., Ed. and D. Noveck, Ed., "Network File System
(NFS) Version 4 Protocol", 2012, <http://www.ietf.org/id/ (NFS) Version 4 Protocol", 2013, <http://www.ietf.org/id/
draft-ietf-nfsv4-rfc3530bis-21.txt>. draft-ietf-nfsv4-rfc3530bis-24.txt>.
Work in progress. Work in progress.
[info-migr] [info-migr]
Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker, Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
"NFSv4 migration: Implementation experience and spec "NFSv4 migration: Implementation experience and spec
issues to resolve", 2012, <http://www.ietf.org/id/ issues to resolve", 2012, <http://www.ietf.org/id/
draft-ietf-nfsv4-migration-issues-02.txt>. draft-ietf-nfsv4-migration-issues-02.txt>.
Work in progress. Work in progress.
 End of changes. 30 change blocks. 
75 lines changed or deleted 254 lines changed or added
