draft-ietf-precis-framework-15.txt   draft-ietf-precis-framework-16.txt 
PRECIS P. Saint-Andre PRECIS P. Saint-Andre
Internet-Draft &yet Internet-Draft &yet
Obsoletes: 3454 (if approved) M. Blanchet Obsoletes: 3454 (if approved) M. Blanchet
Intended status: Standards Track Viagenie Intended status: Standards Track Viagenie
Expires: September 15, 2014 March 14, 2014 Expires: October 23, 2014 April 21, 2014
PRECIS Framework: Preparation and Comparison of Internationalized PRECIS Framework: Preparation and Comparison of Internationalized
Strings in Application Protocols Strings in Application Protocols
draft-ietf-precis-framework-15 draft-ietf-precis-framework-16
Abstract Abstract
Application protocols using Unicode characters in protocol strings Application protocols using Unicode characters in protocol strings
need to properly prepare such strings in order to perform valid need to properly prepare such strings in order to perform valid
comparison operations (e.g., for purposes of authentication or comparison operations (e.g., for purposes of authentication or
authorization). This document defines a framework enabling authorization). This document defines a framework enabling
application protocols to perform the preparation and comparison of application protocols to perform the preparation and comparison of
internationalized strings ("PRECIS") in a way that depends on the internationalized strings ("PRECIS") in a way that depends on the
properties of Unicode characters and thus is agile with respect to properties of Unicode characters and thus is agile with respect to
skipping to change at page 1, line 42 skipping to change at page 1, line 42
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 15, 2014. This Internet-Draft will expire on October 23, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 20 skipping to change at page 2, line 20
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. String Classes . . . . . . . . . . . . . . . . . . . . . . . 5 3. String Classes . . . . . . . . . . . . . . . . . . . . . . . 5
3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 5 3.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2. IdentifierClass . . . . . . . . . . . . . . . . . . . . . 7 3.2. IdentifierClass . . . . . . . . . . . . . . . . . . . . . 7
3.3. FreeformClass . . . . . . . . . . . . . . . . . . . . . . 8 3.3. FreeformClass . . . . . . . . . . . . . . . . . . . . . . 9
4. Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . 10 4. Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.1. Principles . . . . . . . . . . . . . . . . . . . . . . . 10 4.1. Principles . . . . . . . . . . . . . . . . . . . . . . . 10
4.2. Building Application-Layer Constructs . . . . . . . . . . 12 4.2. Building Application-Layer Constructs . . . . . . . . . . 13
4.3. A Note about Spaces . . . . . . . . . . . . . . . . . . . 13 4.3. A Note about Spaces . . . . . . . . . . . . . . . . . . . 13
5. Order of Operations . . . . . . . . . . . . . . . . . . . . . 13 5. Order of Operations . . . . . . . . . . . . . . . . . . . . . 14
6. Code Point Properties . . . . . . . . . . . . . . . . . . . . 14 6. Code Point Properties . . . . . . . . . . . . . . . . . . . . 15
7. Category Definitions Used to Calculate Derived Property . . . 16 7. Category Definitions Used to Calculate Derived Property . . . 16
7.1. LetterDigits (A) . . . . . . . . . . . . . . . . . . . . 16 7.1. LetterDigits (A) . . . . . . . . . . . . . . . . . . . . 17
7.2. Unstable (B) . . . . . . . . . . . . . . . . . . . . . . 17 7.2. Unstable (B) . . . . . . . . . . . . . . . . . . . . . . 18
7.3. IgnorableProperties (C) . . . . . . . . . . . . . . . . . 17 7.3. IgnorableProperties (C) . . . . . . . . . . . . . . . . . 18
7.4. IgnorableBlocks (D) . . . . . . . . . . . . . . . . . . . 17 7.4. IgnorableBlocks (D) . . . . . . . . . . . . . . . . . . . 18
7.5. LDH (E) . . . . . . . . . . . . . . . . . . . . . . . . . 17 7.5. LDH (E) . . . . . . . . . . . . . . . . . . . . . . . . . 18
7.6. Exceptions (F) . . . . . . . . . . . . . . . . . . . . . 17 7.6. Exceptions (F) . . . . . . . . . . . . . . . . . . . . . 18
7.7. BackwardCompatible (G) . . . . . . . . . . . . . . . . . 19 7.7. BackwardCompatible (G) . . . . . . . . . . . . . . . . . 19
7.8. JoinControl (H) . . . . . . . . . . . . . . . . . . . . . 19 7.8. JoinControl (H) . . . . . . . . . . . . . . . . . . . . . 20
7.9. OldHangulJamo (I) . . . . . . . . . . . . . . . . . . . . 19 7.9. OldHangulJamo (I) . . . . . . . . . . . . . . . . . . . . 20
7.10. Unassigned (J) . . . . . . . . . . . . . . . . . . . . . 20 7.10. Unassigned (J) . . . . . . . . . . . . . . . . . . . . . 20
7.11. ASCII7 (K) . . . . . . . . . . . . . . . . . . . . . . . 20 7.11. ASCII7 (K) . . . . . . . . . . . . . . . . . . . . . . . 21
7.12. Controls (L) . . . . . . . . . . . . . . . . . . . . . . 20 7.12. Controls (L) . . . . . . . . . . . . . . . . . . . . . . 21
7.13. PrecisIgnorableProperties (M) . . . . . . . . . . . . . . 20 7.13. PrecisIgnorableProperties (M) . . . . . . . . . . . . . . 21
7.14. Spaces (N) . . . . . . . . . . . . . . . . . . . . . . . 21 7.14. Spaces (N) . . . . . . . . . . . . . . . . . . . . . . . 21
7.15. Symbols (O) . . . . . . . . . . . . . . . . . . . . . . . 21 7.15. Symbols (O) . . . . . . . . . . . . . . . . . . . . . . . 22
7.16. Punctuation (P) . . . . . . . . . . . . . . . . . . . . . 21 7.16. Punctuation (P) . . . . . . . . . . . . . . . . . . . . . 22
7.17. HasCompat (Q) . . . . . . . . . . . . . . . . . . . . . . 21 7.17. HasCompat (Q) . . . . . . . . . . . . . . . . . . . . . . 22
7.18. OtherLetterDigits (R) . . . . . . . . . . . . . . . . . . 22 7.18. OtherLetterDigits (R) . . . . . . . . . . . . . . . . . . 22
8. Calculation of the Derived Property . . . . . . . . . . . . . 22 8. Calculation of the Derived Property . . . . . . . . . . . . . 22
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24
9.1. PRECIS Derived Property Value Registry . . . . . . . . . 23 9.1. PRECIS Derived Property Value Registry . . . . . . . . . 24
9.2. PRECIS Base Classes Registry . . . . . . . . . . . . . . 23 9.2. PRECIS Base Classes Registry . . . . . . . . . . . . . . 24
9.3. PRECIS Profiles Registry . . . . . . . . . . . . . . . . 24 9.3. PRECIS Profiles Registry . . . . . . . . . . . . . . . . 25
10. Security Considerations . . . . . . . . . . . . . . . . . . . 26 10. Security Considerations . . . . . . . . . . . . . . . . . . . 26
10.1. General Issues . . . . . . . . . . . . . . . . . . . . . 26 10.1. General Issues . . . . . . . . . . . . . . . . . . . . . 26
10.2. Use of the IdentifierClass . . . . . . . . . . . . . . . 26 10.2. Use of the IdentifierClass . . . . . . . . . . . . . . . 27
10.3. Use of the FreeformClass . . . . . . . . . . . . . . . . 26 10.3. Use of the FreeformClass . . . . . . . . . . . . . . . . 27
10.4. Local Character Set Issues . . . . . . . . . . . . . . . 27 10.4. Local Character Set Issues . . . . . . . . . . . . . . . 27
10.5. Visually Similar Characters . . . . . . . . . . . . . . 27 10.5. Visually Similar Characters . . . . . . . . . . . . . . 28
10.6. Security of Passwords . . . . . . . . . . . . . . . . . 29 10.6. Security of Passwords . . . . . . . . . . . . . . . . . 30
11. Interoperability Considerations . . . . . . . . . . . . . . . 30 11. Interoperability Considerations . . . . . . . . . . . . . . . 30
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 31
12.1. Normative References . . . . . . . . . . . . . . . . . . 30 12.1. Normative References . . . . . . . . . . . . . . . . . . 31
12.2. Informative References . . . . . . . . . . . . . . . . . 30 12.2. Informative References . . . . . . . . . . . . . . . . . 31
12.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 33 12.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Appendix A. Codepoint Table . . . . . . . . . . . . . . . . . . 33 Appendix A. Codepoint Table . . . . . . . . . . . . . . . . . . 34
Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 64 Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 64
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 64 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 65
1. Introduction 1. Introduction
As described in the problem statement for the preparation and As described in the problem statement for the preparation and
comparison of internationalized strings ("PRECIS") [RFC6885], many comparison of internationalized strings ("PRECIS") [RFC6885], many
IETF protocols have used the Stringprep framework [RFC3454] as the IETF protocols have used the Stringprep framework [RFC3454] as the
basis for preparing and comparing protocol strings that contain basis for preparing and comparing protocol strings that contain
Unicode characters [UNICODE] outside the ASCII range [RFC20]. The Unicode characters [UNICODE] outside the ASCII range [RFC20]. The
Stringprep framework was developed during work on the original Stringprep framework was developed during work on the original
technology for internationalized domain names (IDNs), here called technology for internationalized domain names (IDNs), here called
skipping to change at page 4, line 42 skipping to change at page 4, line 42
[I-D.ietf-precis-saslprepbis], nicknames [I-D.ietf-precis-nickname], [I-D.ietf-precis-saslprepbis], nicknames [I-D.ietf-precis-nickname],
the localparts of instant messaging addresses the localparts of instant messaging addresses
[I-D.ietf-xmpp-6122bis], and free-form strings [I-D.ietf-xmpp-6122bis], and free-form strings
[I-D.ietf-xmpp-6122bis]. Profiles are responsible for defining the [I-D.ietf-xmpp-6122bis]. Profiles are responsible for defining the
handling of right-to-left characters as well as various mapping handling of right-to-left characters as well as various mapping
operations of the kind also discussed for IDNs in [RFC5895], such as operations of the kind also discussed for IDNs in [RFC5895], such as
case preservation or lowercasing, Unicode normalization, mapping of case preservation or lowercasing, Unicode normalization, mapping of
certain characters to other characters or to nothing, and mapping of certain characters to other characters or to nothing, and mapping of
full-width and half-width characters. full-width and half-width characters.
When an application applies a profile of a PRECIS string class, it
can achieve the following objectives:
a. Determine if a given string conforms to the profile (e.g., to
determine if it is allowed for use in the relevant "slot"
specified by an application protocol).
b. Determine if any two given strings are equivalent (e.g., to make
an access decision for purposes of authentication or
authorization as further described in [RFC6943]).
It is expected that this framework will yield the following benefits: It is expected that this framework will yield the following benefits:
o Application protocols will be agile with regard to Unicode o Application protocols will be agile with regard to Unicode
versions. versions.
o Implementers will be able to share code point tables and software o Implementers will be able to share code point tables and software
code across application protocols, most likely by means of code across application protocols, most likely by means of
software libraries. software libraries.
o End users will be able to acquire more accurate expectations about o End users will be able to acquire more accurate expectations about
skipping to change at page 5, line 37 skipping to change at page 6, line 4
As of the date of writing, the version of Unicode published by the As of the date of writing, the version of Unicode published by the
Unicode Consortium is 6.3; however, PRECIS is not tied to a specific Unicode Consortium is 6.3; however, PRECIS is not tied to a specific
version of Unicode. version of Unicode.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
[RFC2119]. [RFC2119].
3. String Classes 3. String Classes
3.1. Overview 3.1. Overview
IDNA2008 essentially defines a string class of internationalized
domain name (IDN), although it does not use the term "string class".
(This document does not define a string class for domain names, and
application protocols are strongly encouraged to use IDNA2008 as the
appropriate method to prepare domain names and hostnames.) Because
the IDN string class is designed to meet the particular requirements
of the Domain Name System (DNS), additional string classes are needed
for non-DNS applications.
Starting in 2010, various "customers" of Stringprep began to discuss Starting in 2010, various "customers" of Stringprep began to discuss
the need to define a post-Stringprep approach to the preparation and the need to define a post-Stringprep approach to the preparation and
comparison of internationalized strings other than IDNs. This comparison of internationalized strings other than IDNs. This
community analyzed the existing Stringprep profiles and also weighed community analyzed the existing Stringprep profiles and also weighed
the costs and benefits of defining a relatively small set of Unicode the costs and benefits of defining a relatively small set of Unicode
characters that would minimize the potential for user confusion characters that would minimize the potential for user confusion
caused by visually similar characters (and thus be relatively "safe") caused by visually similar characters (and thus be relatively "safe")
vs. defining a much larger set of Unicode characters that would vs. defining a much larger set of Unicode characters that would
maximize the potential for user creativity (and thus be relatively maximize the potential for user creativity (and thus be relatively
"expressive"). As a result, the community concluded that most "expressive"). As a result, the community concluded that most
skipping to change at page 7, line 39 skipping to change at page 7, line 43
o Code points traditionally used as letters and numbers in writing o Code points traditionally used as letters and numbers in writing
systems, i.e., the LetterDigits ("A") category first defined in systems, i.e., the LetterDigits ("A") category first defined in
[RFC5892] and listed here under Section 7.1. [RFC5892] and listed here under Section 7.1.
o Code points in the range U+0021 through U+007E, i.e., the o Code points in the range U+0021 through U+007E, i.e., the
(printable) ASCII7 ("K") rule defined under Section 7.11. These (printable) ASCII7 ("K") rule defined under Section 7.11. These
code points are "grandfathered" into PRECIS and thus are valid code points are "grandfathered" into PRECIS and thus are valid
even if they would otherwise be disallowed according to the even if they would otherwise be disallowed according to the
property-based rules specified in the next section. property-based rules specified in the next section.
Informational Note: Although the PRECIS IdentifierClass re-uses Note: Although the PRECIS IdentifierClass re-uses the LetterDigits
the LetterDigits category from IDNA2008, the range of characters category from IDNA2008, the range of characters allowed in the
allowed in the IdentifierClass is wider than the range of IdentifierClass is wider than the range of characters allowed in
characters allowed in IDNA2008. The main reason is that IDNA2008 IDNA2008. The main reason is that IDNA2008 applies the Unstable
applies the Unstable category before the LetterDigits category, category before the LetterDigits category, thus disallowing
thus disallowing uppercase characters, whereas the IdentifierClass uppercase characters, whereas the IdentifierClass does not apply
does not apply the Unstable category. the Unstable category.
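As a rough illustration of the two "Valid" rules listed above, the following Python sketch tests a single code point against the printable ASCII7 range and against the General_Category values that make up LetterDigits in [RFC5892]. The function names are hypothetical, and the sketch deliberately ignores the Exceptions, contextual-rule, and other categories that take part in the full derived-property calculation, so it is not a conformance check.

```python
import unicodedata

# General_Category values that make up LetterDigits ("A") in RFC 5892.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def is_ascii7(ch: str) -> bool:
    """True for the printable ASCII range U+0021 through U+007E."""
    return 0x21 <= ord(ch) <= 0x7E


def is_letter_digits(ch: str) -> bool:
    """True if the code point's General_Category is in LetterDigits."""
    return unicodedata.category(ch) in LETTER_DIGITS


def roughly_valid_identifier_point(ch: str) -> bool:
    """Hypothetical helper: approximates only the two 'Valid' rules."""
    return is_ascii7(ch) or is_letter_digits(ch)
```

Note that an uppercase letter such as "A" passes this check, which matches the observation that the IdentifierClass does not apply the Unstable category.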
3.2.2. Contextual Rule Required 3.2.2. Contextual Rule Required
o A number of characters from the Exceptions ("F") category defined o A number of characters from the Exceptions ("F") category defined
under Section 7.6 (see Section 7.6 for a full list). under Section 7.6 (see Section 7.6 for a full list).
o Joining characters, i.e., the JoinControl ("H") category defined o Joining characters, i.e., the JoinControl ("H") category defined
under Section 7.8. under Section 7.8.
3.2.3. Disallowed 3.2.3. Disallowed
skipping to change at page 8, line 41 skipping to change at page 8, line 46
according to the property-based rules specified in the previous according to the property-based rules specified in the previous
section. section.
o Letters and digits other than the "traditional" letters and digits o Letters and digits other than the "traditional" letters and digits
allowed in IDNs, i.e., the OtherLetterDigits ("R") category allowed in IDNs, i.e., the OtherLetterDigits ("R") category
defined under Section 7.18. defined under Section 7.18.
3.2.4. Unassigned 3.2.4. Unassigned
Any code points that are not yet designated in the Unicode character Any code points that are not yet designated in the Unicode character
set SHALL be considered Unassigned for purposes of the set are considered Unassigned for purposes of the IdentifierClass,
IdentifierClass, and a string containing such code points SHALL be and such code points are to be treated as Disallowed.
rejected.
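One way to detect unassigned code points, sketched here in Python, is to look for General_Category "Cn". The function name is hypothetical, the result depends on the Unicode version bundled with the Python runtime in use, and the sketch does not distinguish noncharacter code points (which also report "Cn").

```python
import unicodedata


def contains_unassigned(s: str) -> bool:
    """True if any code point in s has General_Category Cn.

    The answer reflects the Unicode version shipped with the running
    interpreter (see unicodedata.unidata_version), not a fixed version.
    """
    return any(unicodedata.category(ch) == "Cn" for ch in s)
```

A caller following the rule above would refuse any string for which this function returns True.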
3.2.5. Examples
As described in the Introduction to this document, the string classes
do not handle all issues related to string preparation and comparison
(such as case mapping); instead, such issues are handled at the level
of profiles. Examples for two profiles of the IdentifierClass can be
found in [I-D.ietf-precis-saslprepbis] (the UsernameIdentifierClass
profile) and in [I-D.ietf-xmpp-6122bis] (the JIDlocalIdentifierClass
profile).
3.3. FreeformClass 3.3. FreeformClass
Some application technologies need strings that can be used in a Some application technologies need strings that can be used in a
free-form way, e.g., as a password in an authentication exchange (see free-form way, e.g., as a password in an authentication exchange (see
[I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see [I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see
[I-D.ietf-precis-nickname]). We group such things into a class [I-D.ietf-precis-nickname]). We group such things into a class
called "FreeformClass" having the following features. called "FreeformClass" having the following features.
Security Warning: Consult Section 10.6 for relevant security Security Warning: Consult Section 10.6 for relevant security
skipping to change at page 10, line 8 skipping to change at page 10, line 27
o Control characters, i.e., the Controls ("L") category defined o Control characters, i.e., the Controls ("L") category defined
under Section 7.12. under Section 7.12.
o Ignorable characters, i.e., the PrecisIgnorableProperties ("M") o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
category defined under Section 7.13. category defined under Section 7.13.
3.3.4. Unassigned 3.3.4. Unassigned
Any code points that are not yet designated in the Unicode character Any code points that are not yet designated in the Unicode character
set SHALL be considered Unassigned for purposes of the FreeformClass, set are considered Unassigned for purposes of the FreeformClass, and
and a string containing such code points SHALL be rejected. such code points are to be treated as Disallowed.
3.3.5. Examples
As described in the Introduction to this document, the string classes
do not handle all issues related to string preparation and comparison
(such as case mapping); instead, such issues are handled at the level
of profiles. Examples for two profiles of the FreeformClass can be
found in [I-D.ietf-precis-nickname] (the NicknameFreeformClass
profile) and in [I-D.ietf-xmpp-6122bis] (the
JIDresourceIdentifierClass profile).
4. Profiles 4. Profiles
4.1. Principles 4.1. Principles
This framework document defines the valid, contextual-rule-required, This framework document defines the valid, contextual-rule-required,
disallowed, and unassigned rules for the IdentifierClass and the disallowed, and unassigned rules for the IdentifierClass and the
FreeformClass. A profile of a PRECIS string class MUST define the FreeformClass. A profile of a PRECIS string class MUST define the
width mapping, additional mappings (if any), case mapping, width mapping, additional mappings (if any), case mapping,
normalization, directionality, and exclusion rules. A profile MAY normalization, directionality, and exclusion rules. A profile MAY
also restrict the allowable characters above and beyond the also restrict the allowable characters above and beyond the
definition of the relevant PRECIS string class (but MUST NOT add as definition of the relevant PRECIS string class (but MUST NOT add as
valid any code points or character categories that are disallowed by valid any code points or character categories that are disallowed by
the relevant PRECIS string class). These matters are discussed in the relevant PRECIS string class). These matters are discussed in
the following subsections. the following subsections.
Profiles of the PRECIS string classes MUST register with the IANA as Profiles of the PRECIS string classes are registered with the IANA as
described under Section 9.3. It is RECOMMENDED for profile names to described under Section 9.3. The naming convention for profile names
be of the form "ProfilenameBaseClass", where the "Profilename" string is that they are of the form "ProfilenameBaseClass", where the
is a differentiator and "BaseClass" is the name of the PRECIS string "Profilename" string is a differentiator and "BaseClass" is the name
class being profiled; for example, the profile of the IdentifierClass of the PRECIS string class being profiled; for example, the profile
used for localparts of Jabber IDs in the Extensible Messaging and of the IdentifierClass used for localparts of Jabber IDs in the
Presence Protocol (XMPP) is named "JIDlocalIdentifierClass" Extensible Messaging and Presence Protocol (XMPP) is named
[I-D.ietf-xmpp-6122bis]. "JIDlocalIdentifierClass" [I-D.ietf-xmpp-6122bis].
4.1.1. Width Mapping 4.1.1. Width Mapping
The width mapping rule of a profile specifies whether width mapping The width mapping rule of a profile specifies whether width mapping
is performed on fullwidth and halfwidth characters, and how the is performed on fullwidth and halfwidth characters, and how the
mapping is done. Typically such mapping consists of mapping mapping is done. Typically such mapping consists of mapping
fullwidth and halfwidth characters, i.e., code points with a fullwidth and halfwidth characters, i.e., code points with a
Decomposition Type of Wide or Narrow, to their decomposition Decomposition Type of Wide or Narrow, to their decomposition
mappings; as an example, FULLWIDTH DIGIT ZERO (U+FF10) would be mappings; as an example, FULLWIDTH DIGIT ZERO (U+FF10) would be
mapped to DIGIT ZERO (U+0030). mapped to DIGIT ZERO (U+0030).
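The typical width mapping described above can be sketched in Python with the standard unicodedata module, which reports a decomposition tagged "<wide>" or "<narrow>" for fullwidth and halfwidth code points. The function name width_map is hypothetical, and a real profile might define its width mapping differently.

```python
import unicodedata


def width_map(s: str) -> str:
    """Map fullwidth and halfwidth code points to their decomposition mappings."""
    out = []
    for ch in s:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0030"
        if decomp.startswith(("<wide>", "<narrow>")):
            # Drop the tag and rebuild the mapped code point(s).
            out.append("".join(chr(int(cp, 16)) for cp in decomp.split()[1:]))
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, FULLWIDTH DIGIT ZERO (U+FF10) becomes DIGIT ZERO (U+0030), and all other code points pass through unchanged.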
skipping to change at page 11, line 27 skipping to change at page 12, line 9
The case mapping rule of a profile specifies whether case mapping is The case mapping rule of a profile specifies whether case mapping is
performed (instead of case preservation) on uppercase and titlecase performed (instead of case preservation) on uppercase and titlecase
characters, and how the mapping is done (e.g., mapping uppercase and characters, and how the mapping is done (e.g., mapping uppercase and
titlecase characters to their lowercase equivalents). titlecase characters to their lowercase equivalents).
If case mapping is desired (instead of case preservation), it is If case mapping is desired (instead of case preservation), it is
RECOMMENDED to use Unicode Default Case Folding as defined in Chapter RECOMMENDED to use Unicode Default Case Folding as defined in Chapter
3 of the Unicode Standard [UNICODE]. 3 of the Unicode Standard [UNICODE].
Informational Note: Unicode Default Case Folding is not designed Note: Unicode Default Case Folding is not designed to handle
to handle various localization issues (such as so-called "dotless various localization issues (such as so-called "dotless i" in
i" in several Turkic languages). The PRECIS mappings document several Turkic languages). The PRECIS mappings document
[I-D.ietf-precis-mappings] describes these issues in greater [I-D.ietf-precis-mappings] describes these issues in greater
detail and defines a "local case mapping" method that handles some detail and defines a "local case mapping" method that handles some
locale-dependent and context-dependent mappings. locale-dependent and context-dependent mappings.
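In Python, str.casefold() applies locale-independent Unicode case folding and can serve as a minimal sketch of the recommended case mapping. The helper name below is hypothetical. The last line of the usage note illustrates the localization gap mentioned above: dotless i (U+0131) does not fold to "i".

```python
def equal_after_case_folding(a: str, b: str) -> bool:
    """Compare two strings after locale-independent Unicode case folding."""
    return a.casefold() == b.casefold()
```

For example, equal_after_case_folding("HELLO", "hello") holds, whereas equal_after_case_folding("\u0131", "i") does not, because no locale-specific (Turkic) handling is applied.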
In order to maximize entropy and minimize the potential for false In order to maximize entropy and minimize the potential for false
positives, it is NOT RECOMMENDED for application protocols to map positives, it is NOT RECOMMENDED for application protocols to map
uppercase and titlecase code points to their lowercase equivalents uppercase and titlecase code points to their lowercase equivalents
when strings conforming to the FreeformClass, or a profile thereof, when strings conforming to the FreeformClass, or a profile thereof,
are used in passwords; instead, it is RECOMMENDED to preserve the are used in passwords; instead, it is RECOMMENDED to preserve the
case of all code points contained in such strings and then perform case of all code points contained in such strings and then perform
skipping to change at page 12, line 10 skipping to change at page 12, line 39
Standard Annex #15 [UAX15] for background information). Standard Annex #15 [UAX15] for background information).
In accordance with [RFC5198], normalization form C (NFC) is In accordance with [RFC5198], normalization form C (NFC) is
RECOMMENDED. RECOMMENDED.
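A minimal Python sketch of applying NFC uses unicodedata.normalize. The helper name to_nfc is hypothetical.

```python
import unicodedata


def to_nfc(s: str) -> str:
    """Return s in Unicode normalization form C."""
    return unicodedata.normalize("NFC", s)
```

For instance, the two-code-point sequence "e" followed by COMBINING ACUTE ACCENT (U+0301) normalizes to the single code point U+00E9, so both spellings compare equal after normalization.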
4.1.5. Directionality 4.1.5. Directionality
The directionality rule of a profile specifies which strings are to The directionality rule of a profile specifies which strings are to
be considered left-to-right (LTR) and right-to-left (RTL), and the be considered left-to-right (LTR) and right-to-left (RTL), and the
allowable sequences of characters in LTR and RTL strings (see Unicode allowable sequences of characters in LTR and RTL strings (see Unicode
Standard Annex #9 [UAX9]); note that mixed-direction strings are not Standard Annex #9 [UAX9]). Possible rules include, but are not
supported, since there is currently no widely accepted and limited to, (a) considering any string that contains a right-to-left
implemented solution for the processing and display of mixed- code point to be a right-to-left string, or (b) applying the "Bidi
direction strings. Possible rules include, but are not limited to, Rule" from [RFC5893].
(a) considering any string that contains a right-to-left code point
to be a right-to-left string, or (b) applying the "Bidi Rule" from Mixed-direction strings are not directly supported by the PRECIS
[RFC5893]. framework itself, since there is currently no widely accepted and
implemented solution for the processing and safe display of mixed-
direction strings. An application protocol that uses the PRECIS
framework (or an extension to the framework) could define methods for
handling mixed-direction strings; however, such methods are outside
the scope of the framework.
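Possible rule (a) above can be sketched in Python as follows. The sketch assumes that a "right-to-left code point" means one whose Bidi_Class is R or AL; that reading, and the function name, are assumptions rather than definitions taken from this framework. It does not implement the "Bidi Rule" from [RFC5893].

```python
import unicodedata

# Assumed reading of "right-to-left code point": Bidi_Class R or AL.
RTL_BIDI_CLASSES = {"R", "AL"}


def is_rtl_string(s: str) -> bool:
    """Rule (a): any right-to-left code point makes the whole string RTL."""
    return any(unicodedata.bidirectional(ch) in RTL_BIDI_CLASSES for ch in s)
```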
4.1.6. Exclusions 4.1.6. Exclusions
The exclusions rule of a profile specifies whether the profile The exclusions rule of a profile specifies whether the profile
excludes additional code points or character categories above and excludes additional code points or character categories above and
beyond those excluded by the string class being profiled. That is, a beyond those excluded by the string class being profiled. That is, a
profile MAY do either of the following: profile MAY do either of the following:
1. Exclude specific code points that are allowed by the relevant 1. Exclude specific code points that are allowed by the relevant
string class. string class.
skipping to change at page 13, line 5 skipping to change at page 13, line 38
construct in the Simple Authentication and Security Layer (SASL) construct in the Simple Authentication and Security Layer (SASL)
[RFC4422]. Depending on the deployment, a simple user name might [RFC4422]. Depending on the deployment, a simple user name might
take the form of a user's full name (e.g., the user's personal name take the form of a user's full name (e.g., the user's personal name
followed by a space and then the user's family name). Such a simple followed by a space and then the user's family name). Such a simple
user name cannot be defined as an instance of the IdentifierClass or user name cannot be defined as an instance of the IdentifierClass or
a profile thereof, since space characters are not allowed in the a profile thereof, since space characters are not allowed in the
IdentifierClass; however, it could be defined using a space-separated IdentifierClass; however, it could be defined using a space-separated
sequence of IdentifierClass instances, as in the following pseudo- sequence of IdentifierClass instances, as in the following pseudo-
ABNF [RFC5234]: ABNF [RFC5234]:
fullname = namepart [1*(1*SP namepart)] fullname = namepart *(1*SP namepart)
namepart = 1*(idpoint) namepart = 1*idpoint
; ;
; an "idpoint" is a UTF-8 encoded Unicode code point ; an "idpoint" is a UTF-8 encoded Unicode code point
; that conforms to the PRECIS IdentifierClass ; that conforms to the PRECIS IdentifierClass
Similar techniques could be used to define many application-layer Similar techniques could be used to define many application-layer
constructs, say of the form "user@domain" or "/path/to/file". constructs, say of the form "user@domain" or "/path/to/file".
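As a rough illustration of the pseudo-ABNF above (in its revised form, fullname = namepart *(1*SP namepart)), the following Python sketch splits a candidate string on runs of U+0020 and checks each part. The names is_idpoint and is_fullname are hypothetical, and is_idpoint is only a stand-in that accepts letters and decimal digits; it is an assumption made for illustration, not the IdentifierClass calculation defined by the framework.

```python
import re
import unicodedata


def is_idpoint(ch: str) -> bool:
    # ASSUMPTION: placeholder test, NOT the real PRECIS IdentifierClass rules.
    # Accepts only letters (General_Category L*) and decimal digits (Nd).
    cat = unicodedata.category(ch)
    return cat.startswith("L") or cat == "Nd"


def is_fullname(s: str) -> bool:
    # fullname = namepart *(1*SP namepart)
    # namepart = 1*idpoint
    # Splitting on one or more U+0020 yields an empty part for a leading
    # or trailing space, or for the empty string; those are rejected.
    parts = re.split(r" +", s)
    return all(part and all(is_idpoint(c) for c in part) for part in parts)
```

Under these assumptions, "Juliet Capulet" is accepted, while the empty string and strings with a leading or trailing space are rejected. A real implementation would replace is_idpoint with a check against the derived property values for the IdentifierClass.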
4.3. A Note about Spaces 4.3. A Note about Spaces
With regard to the IdentifierClass, the consensus of the PRECIS With regard to the IdentifierClass, the consensus of the PRECIS
skipping to change at page 13, line 33 skipping to change at page 14, line 18
(U+0020), space characters are often not rendered in user (U+0020), space characters are often not rendered in user
interfaces, leading to the possibility that a human user might interfaces, leading to the possibility that a human user might
consider a string containing spaces to be equivalent to the same consider a string containing spaces to be equivalent to the same
string without spaces. string without spaces.
o In some locales, some devices are known to generate a character o In some locales, some devices are known to generate a character
other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a
user performs an action like hitting the space bar on a keyboard. user performs an action like hitting the space bar on a keyboard.
One consequence of disallowing space characters in the One consequence of disallowing space characters in the
IdentifierClass might be to effectively discourage the use of ASCII IdentifierClass might be to effectively discourage their use within
space (or, even more problematically, non-ASCII space characters) identifiers created in newer application protocols; given the
within identifiers created in newer application protocols; given the challenges involved in properly handling space characters (especially
challenges involved in properly handling space characters in non-ASCII space characters) in identifiers and other protocol
identifiers and other protocol strings, the Working Group considered strings, the Working Group considered this to be a feature, not a
this to be a feature, not a bug. bug.
However, the FreeformClass does allow spaces, which enables However, the FreeformClass does allow spaces, which enables
application protocols to define profiles of the FreeformClass that application protocols to define profiles of the FreeformClass that
are more flexible than any profiles of the IdentifierClass. In are more flexible than any profiles of the IdentifierClass. In
addition, as explained in the previous section, application protocols addition, as explained in the previous section, application protocols
can also define application-layer constructs containing spaces. can also define application-layer constructs containing spaces.
5. Order of Operations 5. Order of Operations
To ensure proper comparison, the following order of operations is To ensure proper comparison, the following order of operations is
skipping to change at page 16, line 19 skipping to change at page 16, line 49
1. Characters are placed in one or more character categories either 1. Characters are placed in one or more character categories either
(1) based on core properties defined by the Unicode Standard or (1) based on core properties defined by the Unicode Standard or
(2) by treating the code point as an exception and addressing the (2) by treating the code point as an exception and addressing the
code point based on its code point value. These categories are code point based on its code point value. These categories are
not mutually exclusive. not mutually exclusive.
2. Set operations are used with these categories to determine the 2. Set operations are used with these categories to determine the
values for a property specific to a given string class. These values for a property specific to a given string class. These
operations are specified under Section 8. operations are specified under Section 8.
Informational Note: Unicode property names and property value Note: Unicode property names and property value names might have
names might have short abbreviations, such as "gc" for the short abbreviations, such as "gc" for the General_Category
General_Category property and "Ll" for the Lowercase_Letter property and "Ll" for the Lowercase_Letter property value of the
property value of the gc property. gc property.
In the following specification of character categories, the operation In the following specification of character categories, the operation
that returns the value of a particular Unicode character property for that returns the value of a particular Unicode character property for
a code point is designated by using the formal name of that property a code point is designated by using the formal name of that property
(from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for (from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for
"code point". For example, the value of the General_Category "code point". For example, the value of the General_Category
property for a code point is indicated by General_Category(cp). property for a code point is indicated by General_Category(cp).
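The General_Category(cp) notation can be approximated with Python's standard unicodedata module, as in the sketch below. The helper name general_category is hypothetical, and the values returned reflect the Unicode version bundled with the Python interpreter (see unicodedata.unidata_version), which might differ from the version a given protocol targets.

```python
import unicodedata


def general_category(cp: int) -> str:
    # Returns the short alias of the General_Category property value
    # (for example "Ll" for Lowercase_Letter) for code point cp.
    return unicodedata.category(chr(cp))


print(general_category(0x0061))  # LATIN SMALL LETTER A
print(general_category(0x0020))  # SPACE
```

The two calls print "Ll" and "Zs" respectively, which are the short abbreviations for Lowercase_Letter and Space_Separator.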
The first ten categories (A-J) shown below were previously defined The first ten categories (A-J) shown below were previously defined
for IDNA2008 and are copied directly from [RFC5892]. Some of these for IDNA2008 and are copied directly from [RFC5892]. Some of these
skipping to change at page 22, line 26 skipping to change at page 23, line 4
o PVALID o PVALID
o ID_PVAL o ID_PVAL
o FREE_PVAL o FREE_PVAL
o CONTEXTJ o CONTEXTJ
o CONTEXTO o CONTEXTO
o DISALLOWED o DISALLOWED
o ID_DIS o ID_DIS
o FREE_DIS o FREE_DIS
o UNASSIGNED o UNASSIGNED
Informational Note: The value of the derived property calculated Note: The value of the derived property calculated can depend on
can depend on the string class; for example, if an identifier used the string class; for example, if an identifier used in an
in an application protocol is defined as profiling the PRECIS application protocol is defined as profiling the PRECIS
IdentifierClass then a space character such as U+0020 would be IdentifierClass then a space character such as U+0020 would be
assigned to ID_DIS, whereas if an identifier is defined as assigned to ID_DIS, whereas if an identifier is defined as
profiling the PRECIS FreeformClass then the character would be profiling the PRECIS FreeformClass then the character would be
assigned to FREE_PVAL. For the sake of brevity, the designation assigned to FREE_PVAL. For the sake of brevity, the designation
"FREE_PVAL" is used in the code point tables, instead of the "FREE_PVAL" is used in the code point tables, instead of the
longer designation "ID_DIS or FREE_PVAL". In practice, the longer designation "ID_DIS or FREE_PVAL". In practice, the
derived properties ID_PVAL and FREE_DIS are not used in this derived properties ID_PVAL and FREE_DIS are not used in this
specification, since every ID_PVAL code point is PVALID and every specification, since every ID_PVAL code point is PVALID and every
FREE_DIS code point is DISALLOWED. FREE_DIS code point is DISALLOWED.
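The class-dependent reading of the table values described in this note might be sketched as follows. The function name is_allowed and the string labels for the two classes are hypothetical; contextual rules for CONTEXTJ and CONTEXTO are not modeled, and treating UNASSIGNED as not allowed is stated here as an assumption of the sketch.

```python
def is_allowed(table_value: str, string_class: str) -> bool:
    # table_value: derived property as listed in the code point tables.
    # string_class: "IdentifierClass" or "FreeformClass" (labels assumed).
    if string_class not in ("IdentifierClass", "FreeformClass"):
        raise ValueError("unknown string class: " + string_class)
    if table_value == "PVALID":
        return True
    if table_value == "FREE_PVAL":
        # Table shorthand for "ID_DIS or FREE_PVAL": valid only in the
        # FreeformClass, disallowed in the IdentifierClass.
        return string_class == "FreeformClass"
    if table_value in ("DISALLOWED", "UNASSIGNED"):
        # ASSUMPTION: unassigned code points are rejected like disallowed ones.
        return False
    # CONTEXTJ and CONTEXTO require contextual rules not covered here.
    raise NotImplementedError(table_value)
```

For U+0020, which the tables list as FREE_PVAL, this returns False for the IdentifierClass and True for the FreeformClass, matching the example in the note.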
skipping to change at page 24, line 15 skipping to change at page 24, line 42
intended use, e.g., "A sequence of letters, numbers, and symbols intended use, e.g., "A sequence of letters, numbers, and symbols
that is used to identify or address a network entity."] that is used to identify or address a network entity."]
Specification: [the RFC number] Specification: [the RFC number]
The initial registrations are as follows: The initial registrations are as follows:
Base Class: FreeformClass. Base Class: FreeformClass.
Description: A sequence of letters, numbers, symbols, spaces, and Description: A sequence of letters, numbers, symbols, spaces, and
other code points that is used for free-form strings. other code points that is used for free-form strings.
Specification: RFC XXXX. [Note to RFC Editor: please change XXXX to Specification: Section 3.3 of this document.
the number issued for this specification.] [Note to RFC Editor: please change "this document"
to the RFC number issued for this specification.]
Base Class: IdentifierClass. Base Class: IdentifierClass.
Description: A sequence of letters, numbers, and symbols that is Description: A sequence of letters, numbers, and symbols that is
used to identify or address a network entity. used to identify or address a network entity.
Specification: RFC XXXX. [Note to RFC Editor: please change XXXX to Specification: Section 3.3 of this document.
the number issued for this specification.] [Note to RFC Editor: please change "this document"
to the RFC number issued for this specification.]
9.3. PRECIS Profiles Registry 9.3. PRECIS Profiles Registry
IANA is requested to create a registry of profiles that use the IANA is requested to create a registry of profiles that use the
PRECIS string classes. In accordance with [RFC5226], the PRECIS string classes. In accordance with [RFC5226], the
registration policy is "Expert Review". This policy was chosen in registration policy is "Expert Review". This policy was chosen in
order to ease the burden of registration while ensuring that order to ease the burden of registration while ensuring that
"customers" of PRECIS receive appropriate guidance regarding the "customers" of PRECIS receive appropriate guidance regarding the
sometimes complex and subtle internationalization issues related to sometimes complex and subtle internationalization issues related to
profiles of PRECIS string classes. profiles of PRECIS string classes.
skipping to change at page 30, line 9 skipping to change at page 30, line 39
reuse technologies that themselves process passwords (one example of reuse technologies that themselves process passwords (one example of
such a technology is the Simple Authentication and Security Layer such a technology is the Simple Authentication and Security Layer
[RFC4422]). Moreover, passwords are often carried by a sequence of [RFC4422]). Moreover, passwords are often carried by a sequence of
protocols with backend authentication systems or data storage systems protocols with backend authentication systems or data storage systems
such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of
application protocols are encouraged to look into reusing these application protocols are encouraged to look into reusing these
profiles instead of defining new ones, so that end-user expectations profiles instead of defining new ones, so that end-user expectations
about passwords are consistent no matter which application protocol about passwords are consistent no matter which application protocol
is used. is used.
Further discussion of password handling can be found in
[I-D.ietf-precis-saslprepbis].
11. Interoperability Considerations 11. Interoperability Considerations
Although strings that are consumed in PRECIS-based application Although strings that are consumed in PRECIS-based application
protocols are often encoded using UTF-8 [RFC3629], the exact encoding protocols are often encoded using UTF-8 [RFC3629], the exact encoding
is a matter for the application protocol that uses PRECIS, not for is a matter for the application protocol that uses PRECIS, not for
the PRECIS framework. the PRECIS framework.
It is known that some existing systems are unable to support the full It is known that some existing systems are unable to support the full
Unicode character set, or even any characters outside the ASCII Unicode character set, or even any characters outside the ASCII
range. If two (or more) applications need to interoperate when range. If two (or more) applications need to interoperate when
skipping to change at page 31, line 17 skipping to change at page 31, line 46
classes", draft-ietf-precis-mappings-07 (work in classes", draft-ietf-precis-mappings-07 (work in
progress), February 2014. progress), February 2014.
[I-D.ietf-precis-nickname] [I-D.ietf-precis-nickname]
Saint-Andre, P., "Preparation and Comparison of Saint-Andre, P., "Preparation and Comparison of
Nicknames", draft-ietf-precis-nickname-09 (work in Nicknames", draft-ietf-precis-nickname-09 (work in
progress), January 2014. progress), January 2014.
[I-D.ietf-precis-saslprepbis] [I-D.ietf-precis-saslprepbis]
Saint-Andre, P. and A. Melnikov, "Username and Password Saint-Andre, P. and A. Melnikov, "Username and Password
Preparation Algorithms", draft-ietf-precis-saslprepbis-06 Preparation Algorithms", draft-ietf-precis-saslprepbis-07
(work in progress), December 2013. (work in progress), March 2014.
[I-D.ietf-xmpp-6122bis] [I-D.ietf-xmpp-6122bis]
Saint-Andre, P., "Extensible Messaging and Presence Saint-Andre, P., "Extensible Messaging and Presence
Protocol (XMPP): Address Format", draft-ietf-xmpp- Protocol (XMPP): Address Format", draft-ietf-xmpp-
6122bis-11 (work in progress), February 2014. 6122bis-12 (work in progress), March 2014.
[RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson, [RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson,
"Remote Authentication Dial In User Service (RADIUS)", RFC "Remote Authentication Dial In User Service (RADIUS)", RFC
2865, June 2000. 2865, June 2000.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454, Internationalized Strings ("stringprep")", RFC 3454,
December 2002. December 2002.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
skipping to change at page 64, line 8 skipping to change at page 64, line 35
E0002..E001F; UNASSIGNED # <reserved>..<reserved> E0002..E001F; UNASSIGNED # <reserved>..<reserved>
E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG
E0080..E00FF; UNASSIGNED # <reserved>..<reserved> E0080..E00FF; UNASSIGNED # <reserved>..<reserved>
E0100..E01EF; DISALLOWED # VAR SEL-17..VAR SEL-256 E0100..E01EF; DISALLOWED # VAR SEL-17..VAR SEL-256
E01F0..EFFFD; UNASSIGNED # <reserved>..<reserved> E01F0..EFFFD; UNASSIGNED # <reserved>..<reserved>
EFFFE..10FFFF; DISALLOWED # <noncharacter>..<noncharacter> EFFFE..10FFFF; DISALLOWED # <noncharacter>..<noncharacter>
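Table lines of the form shown above can be read mechanically. The sketch below parses one single-column line (that is, one side of this diff, not the doubled line) into a range and a derived property value. The names TableEntry and parse_table_line are hypothetical, and support for a single code point without ".." is an assumption about the rest of the table.

```python
import re
from typing import NamedTuple, Optional


class TableEntry(NamedTuple):
    first: int    # first code point in the range
    last: int     # last code point in the range (inclusive)
    value: str    # derived property value, e.g. "DISALLOWED"
    comment: str  # character names after '#'


_LINE = re.compile(
    r"^\s*([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)\s*#\s*(.*)$"
)


def parse_table_line(line: str) -> Optional[TableEntry]:
    # Returns None for lines that do not look like table entries.
    m = _LINE.match(line)
    if m is None:
        return None
    first = int(m.group(1), 16)
    # ASSUMPTION: a line without ".." describes a single code point.
    last = int(m.group(2), 16) if m.group(2) else first
    return TableEntry(first, last, m.group(3), m.group(4).strip())
```

For example, parsing "E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG" yields the range 0xE0020 through 0xE007F with the value "DISALLOWED".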
Appendix B. Acknowledgements Appendix B. Acknowledgements
The authors would like to acknowledge the comments and contributions The authors would like to acknowledge the comments and contributions
of the following individuals: David Black, Mark Davis, Alan DeKok, of the following individuals during working group discussion: David
Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern Black, Mark Davis, Alan DeKok, Martin Duerst, Patrik Faltstrom, Ted
Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John Hardie, Joe Hildebrand, Bjoern Hoehrmann, Paul Hoffman, Jeffrey
Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker, Hutzelman, Simon Josefsson, John Klensin, Alexey Melnikov, Takahiro
Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and Nemoto, Yoav Nir, Mike Parker, Pete Resnick, Andrew Sullivan, Dave
Florian Zeitz. Thaler, Yoshiro Yoneya, and Florian Zeitz.
Charlie Kaufman performed a helpful review on behalf of the Security
Directorate, and Tom Taylor reviewed the document on behalf of the
General Area Review Team.
During IESG review, Alissa Cooper provided comments that led to
further improvements.
Some algorithms and textual descriptions have been borrowed from Some algorithms and textual descriptions have been borrowed from
[RFC5892]. Some text regarding security has been borrowed from [RFC5892]. Some text regarding security has been borrowed from
[RFC5890] and [I-D.ietf-xmpp-6122bis]. [RFC5890] and [I-D.ietf-xmpp-6122bis].
Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for
employing him during his work on earlier versions of this document. employing him during his work on earlier versions of this document.
Authors' Addresses Authors' Addresses
 End of changes. 36 change blocks. 
105 lines changed or deleted 141 lines changed or added
