draft-ietf-precis-framework-23.txt   rfc7564.txt 
PRECIS P. Saint-Andre Internet Engineering Task Force (IETF) P. Saint-Andre
Internet-Draft &yet Request for Comments: 7564 &yet
Obsoletes: 3454 (if approved) M. Blanchet Obsoletes: 3454 M. Blanchet
Intended status: Standards Track Viagenie Category: Standards Track Viagenie
Expires: August 23, 2015 February 19, 2015 ISSN: 2070-1721 May 2015
PRECIS Framework: Preparation, Enforcement, and Comparison of PRECIS Framework: Preparation, Enforcement, and Comparison of
Internationalized Strings in Application Protocols Internationalized Strings in Application Protocols
draft-ietf-precis-framework-23
Abstract Abstract
Application protocols using Unicode characters in protocol strings Application protocols using Unicode characters in protocol strings
need to properly handle such strings in order to enforce need to properly handle such strings in order to enforce
internationalization rules for strings placed in various protocol internationalization rules for strings placed in various protocol
slots (such as addresses and identifiers) and to perform valid slots (such as addresses and identifiers) and to perform valid
comparison operations (e.g., for purposes of authentication or comparison operations (e.g., for purposes of authentication or
authorization). This document defines a framework enabling authorization). This document defines a framework enabling
application protocols to perform the preparation, enforcement, and application protocols to perform the preparation, enforcement, and
comparison of internationalized strings ("PRECIS") in a way that comparison of internationalized strings ("PRECIS") in a way that
depends on the properties of Unicode characters and thus is agile depends on the properties of Unicode characters and thus is agile
with respect to versions of Unicode. As a result, this framework with respect to versions of Unicode. As a result, this framework
provides a more sustainable approach to the handling of provides a more sustainable approach to the handling of
internationalized strings than the previous framework, known as internationalized strings than the previous framework, known as
Stringprep (RFC 3454). This document obsoletes RFC 3454. Stringprep (RFC 3454). This document obsoletes RFC 3454.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
This Internet-Draft will expire on August 23, 2015. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7564.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction ....................................................4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 2. Terminology .....................................................7
3. Preparation, Enforcement, and Comparison . . . . . . . . . . 7 3. Preparation, Enforcement, and Comparison ........................7
4. String Classes . . . . . . . . . . . . . . . . . . . . . . . 7 4. String Classes ..................................................8
4.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 7 4.1. Overview ...................................................8
4.2. IdentifierClass . . . . . . . . . . . . . . . . . . . . . 9 4.2. IdentifierClass ............................................9
4.2.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 9 4.2.1. Valid ...............................................9
4.2.2. Contextual Rule Required . . . . . . . . . . . . . . 9 4.2.2. Contextual Rule Required ...........................10
4.2.3. Disallowed . . . . . . . . . . . . . . . . . . . . . 10 4.2.3. Disallowed .........................................10
4.2.4. Unassigned . . . . . . . . . . . . . . . . . . . . . 10 4.2.4. Unassigned .........................................11
4.2.5. Examples . . . . . . . . . . . . . . . . . . . . . . 10 4.2.5. Examples ...........................................11
4.3. FreeformClass . . . . . . . . . . . . . . . . . . . . . . 11 4.3. FreeformClass .............................................11
4.3.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 11 4.3.1. Valid ..............................................11
4.3.2. Contextual Rule Required . . . . . . . . . . . . . . 11 4.3.2. Contextual Rule Required ...........................12
4.3.3. Disallowed . . . . . . . . . . . . . . . . . . . . . 12 4.3.3. Disallowed .........................................12
4.3.4. Unassigned . . . . . . . . . . . . . . . . . . . . . 12 4.3.4. Unassigned .........................................12
4.3.5. Examples . . . . . . . . . . . . . . . . . . . . . . 12 4.3.5. Examples ...........................................12
5. Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . 12 5. Profiles .......................................................13
5.1. Profiles Must Not Be Multiplied Beyond Necessity . . . . 13 5.1. Profiles Must Not Be Multiplied beyond Necessity ..........13
5.2. Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 13 5.2. Rules .....................................................14
5.2.1. Width Mapping Rule . . . . . . . . . . . . . . . . . 13 5.2.1. Width Mapping Rule .................................14
5.2.2. Additional Mapping Rule . . . . . . . . . . . . . . . 14 5.2.2. Additional Mapping Rule ............................14
5.2.3. Case Mapping Rule . . . . . . . . . . . . . . . . . . 14 5.2.3. Case Mapping Rule ..................................14
5.2.4. Normalization Rule . . . . . . . . . . . . . . . . . 15 5.2.4. Normalization Rule .................................15
5.2.5. Directionality Rule . . . . . . . . . . . . . . . . . 15 5.2.5. Directionality Rule ................................15
5.3. A Note about Spaces . . . . . . . . . . . . . . . . . . . 16 5.3. A Note about Spaces .......................................16
6. Applications . . . . . . . . . . . . . . . . . . . . . . . . 17 6. Applications ...................................................17
6.1. How to Use PRECIS in Applications . . . . . . . . . . . . 17 6.1. How to Use PRECIS in Applications .........................17
6.2. Further Excluded Characters . . . . . . . . . . . . . . . 17 6.2. Further Excluded Characters ...............................18
6.3. Building Application-Layer Constructs . . . . . . . . . . 18 6.3. Building Application-Layer Constructs .....................18
7. Order of Operations . . . . . . . . . . . . . . . . . . . . . 19 7. Order of Operations ............................................19
8. Code Point Properties . . . . . . . . . . . . . . . . . . . . 19 8. Code Point Properties ..........................................20
9. Category Definitions Used to Calculate Derived Property . . . 22 9. Category Definitions Used to Calculate Derived Property ........22
9.1. LetterDigits (A) . . . . . . . . . . . . . . . . . . . . 22 9.1. LetterDigits (A) ..........................................23
9.2. Unstable (B) . . . . . . . . . . . . . . . . . . . . . . 22 9.2. Unstable (B) ..............................................23
9.3. IgnorableProperties (C) . . . . . . . . . . . . . . . . . 23 9.3. IgnorableProperties (C) ...................................23
9.4. IgnorableBlocks (D) . . . . . . . . . . . . . . . . . . . 23 9.4. IgnorableBlocks (D) .......................................23
9.5. LDH (E) . . . . . . . . . . . . . . . . . . . . . . . . . 23 9.5. LDH (E) ...................................................23
9.6. Exceptions (F) . . . . . . . . . . . . . . . . . . . . . 23 9.6. Exceptions (F) ............................................23
9.7. BackwardCompatible (G) . . . . . . . . . . . . . . . . . 23 9.7. BackwardCompatible (G) ....................................23
9.8. JoinControl (H) . . . . . . . . . . . . . . . . . . . . . 23 9.8. JoinControl (H) ...........................................24
9.9. OldHangulJamo (I) . . . . . . . . . . . . . . . . . . . . 23 9.9. OldHangulJamo (I) .........................................24
9.10. Unassigned (J) . . . . . . . . . . . . . . . . . . . . . 24 9.10. Unassigned (J) ...........................................24
9.11. ASCII7 (K) . . . . . . . . . . . . . . . . . . . . . . . 24 9.11. ASCII7 (K) ...............................................24
9.12. Controls (L) . . . . . . . . . . . . . . . . . . . . . . 24 9.12. Controls (L) .............................................24
9.13. PrecisIgnorableProperties (M) . . . . . . . . . . . . . . 24 9.13. PrecisIgnorableProperties (M) ............................24
9.14. Spaces (N) . . . . . . . . . . . . . . . . . . . . . . . 24 9.14. Spaces (N) ...............................................25
9.15. Symbols (O) . . . . . . . . . . . . . . . . . . . . . . . 24 9.15. Symbols (O) ..............................................25
9.16. Punctuation (P) . . . . . . . . . . . . . . . . . . . . . 25 9.16. Punctuation (P) ..........................................25
9.17. HasCompat (Q) . . . . . . . . . . . . . . . . . . . . . . 25 9.17. HasCompat (Q) ............................................25
9.18. OtherLetterDigits (R) . . . . . . . . . . . . . . . . . . 25 9.18. OtherLetterDigits (R) ....................................25
10. Guidelines for Designated Experts . . . . . . . . . . . . . . 25 10. Guidelines for Designated Experts .............................26
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 26 11. IANA Considerations ...........................................27
11.1. PRECIS Derived Property Value Registry . . . . . . . . . 26 11.1. PRECIS Derived Property Value Registry ...................27
11.2. PRECIS Base Classes Registry . . . . . . . . . . . . . . 26 11.2. PRECIS Base Classes Registry .............................27
11.3. PRECIS Profiles Registry . . . . . . . . . . . . . . . . 27 11.3. PRECIS Profiles Registry .................................28
12. Security Considerations . . . . . . . . . . . . . . . . . . . 29 12. Security Considerations .......................................29
12.1. General Issues . . . . . . . . . . . . . . . . . . . . . 29 12.1. General Issues ...........................................29
12.2. Use of the IdentifierClass . . . . . . . . . . . . . . . 30 12.2. Use of the IdentifierClass ...............................30
12.3. Use of the FreeformClass . . . . . . . . . . . . . . . . 30 12.3. Use of the FreeformClass .................................30
12.4. Local Character Set Issues . . . . . . . . . . . . . . . 30 12.4. Local Character Set Issues ...............................31
12.5. Visually Similar Characters . . . . . . . . . . . . . . 30 12.5. Visually Similar Characters ..............................31
12.6. Security of Passwords . . . . . . . . . . . . . . . . . 32 12.6. Security of Passwords ....................................33
13. Interoperability Considerations . . . . . . . . . . . . . . . 33 13. Interoperability Considerations ...............................34
13.1. Encoding . . . . . . . . . . . . . . . . . . . . . . . . 33 13.1. Encoding .................................................34
13.2. Character Sets . . . . . . . . . . . . . . . . . . . . . 33 13.2. Character Sets ...........................................34
13.3. Unicode Versions . . . . . . . . . . . . . . . . . . . . 34 13.3. Unicode Versions .........................................34
13.4. Potential Changes to Handling of Certain Unicode Code 13.4. Potential Changes to Handling of Certain Unicode
Points . . . . . . . . . . . . . . . . . . . . . . . . . 34 Code Points ..............................................34
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 35 14. References ....................................................35
14.1. Normative References . . . . . . . . . . . . . . . . . . 35 14.1. Normative References .....................................35
14.2. Informative References . . . . . . . . . . . . . . . . . 35 14.2. Informative References ...................................36
14.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Acknowledgements ..................................................40
Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 38 Authors' Addresses ................................................40
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 39
1. Introduction 1. Introduction
Application protocols using Unicode characters [Unicode7.0] in Application protocols using Unicode characters [Unicode] in protocol
protocol strings need to properly handle such strings in order to strings need to properly handle such strings in order to enforce
enforce internationalization rules for strings placed in various internationalization rules for strings placed in various protocol
protocol slots (such as addresses and identifiers) and to perform slots (such as addresses and identifiers) and to perform valid
valid comparison operations (e.g., for purposes of authentication or comparison operations (e.g., for purposes of authentication or
authorization). This document defines a framework enabling authorization). This document defines a framework enabling
application protocols to perform the preparation, enforcement, and application protocols to perform the preparation, enforcement, and
comparison of internationalized strings ("PRECIS") in a way that comparison of internationalized strings ("PRECIS") in a way that
depends on the properties of Unicode characters and thus is agile depends on the properties of Unicode characters and thus is agile
with respect to versions of Unicode. with respect to versions of Unicode.
As described in the PRECIS problem statement [RFC6885], many IETF As described in the PRECIS problem statement [RFC6885], many IETF
protocols have used the Stringprep framework [RFC3454] as the basis protocols have used the Stringprep framework [RFC3454] as the basis
for preparing, enforcing, and comparing protocol strings that contain for preparing, enforcing, and comparing protocol strings that contain
Unicode characters, especially characters outside the ASCII range Unicode characters, especially characters outside the ASCII range
skipping to change at page 4, line 51 skipping to change at page 5, line 15
This document defines a framework for a post-Stringprep approach to This document defines a framework for a post-Stringprep approach to
the preparation, enforcement, and comparison of internationalized the preparation, enforcement, and comparison of internationalized
strings in application protocols, based on several principles: strings in application protocols, based on several principles:
1. Define a small set of string classes that specify the Unicode 1. Define a small set of string classes that specify the Unicode
characters (i.e., specific "code points") appropriate for common characters (i.e., specific "code points") appropriate for common
application protocol constructs. application protocol constructs.
2. Define each PRECIS string class in terms of Unicode code points 2. Define each PRECIS string class in terms of Unicode code points
and their properties so that an algorithm can be used to and their properties so that an algorithm can be used to
determine whether each code point or character category is (a) determine whether each code point or character category is
valid, (b) allowed in certain contexts, (c) disallowed, or (d) (a) valid, (b) allowed in certain contexts, (c) disallowed, or
unassigned. (d) unassigned.
3. Use an "inclusion model" such that a string class consists only 3. Use an "inclusion model" such that a string class consists only
of code points that are explicitly allowed, with the result that of code points that are explicitly allowed, with the result that
any code point not explicitly allowed is forbidden. any code point not explicitly allowed is forbidden.
4. Enable application protocols to define profiles of the PRECIS 4. Enable application protocols to define profiles of the PRECIS
string classes if necessary (addressing matters such as width string classes if necessary (addressing matters such as width
mapping, case mapping, Unicode normalization, and directionality) mapping, case mapping, Unicode normalization, and directionality)
but strongly discourage the multiplication of profiles beyond but strongly discourage the multiplication of profiles beyond
necessity in order to avoid violations of the Principle of Least necessity in order to avoid violations of the "Principle of Least
User Astonishment. Astonishment".
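The "inclusion model" of principle 3 can be pictured as a table lookup that falls through to "disallowed". The following Python sketch is illustrative only: TOY_TABLE, derived_property, and string_is_acceptable are hypothetical names invented for this example, and the table holds a handful of hand-picked entries rather than the code point table published by IANA. Contextual rules are not evaluated here.

```python
# Toy illustration of the "inclusion model": only code points that are
# explicitly listed receive a value other than "disallowed".
# TOY_TABLE is a hypothetical stand-in, not the real PRECIS table.
TOY_TABLE = {
    0x0061: "valid",                     # LATIN SMALL LETTER A
    0x0062: "valid",                     # LATIN SMALL LETTER B
    0x200D: "contextual rule required",  # ZERO WIDTH JOINER
    0x0378: "unassigned",                # not assigned as of Unicode 7.0
}


def derived_property(code_point: int, table: dict = TOY_TABLE) -> str:
    """Return the table entry for code_point, defaulting to 'disallowed'."""
    return table.get(code_point, "disallowed")


def string_is_acceptable(s: str, table: dict = TOY_TABLE) -> bool:
    """Accept a string only if every code point is explicitly 'valid'.

    Simplification: code points that need a contextual rule are rejected
    here, because this sketch does not implement any contextual rules.
    """
    return all(derived_property(ord(ch), table) == "valid" for ch in s)
```

Because the default branch is "disallowed", adding a new code point to the repertoire requires an explicit table entry; nothing becomes acceptable by omission.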
It is expected that this framework will yield the following benefits: It is expected that this framework will yield the following benefits:
o Application protocols will be agile with regard to Unicode o Application protocols will be agile with regard to Unicode
versions. versions.
o Implementers will be able to share code point tables and software o Implementers will be able to share code point tables and software
code across application protocols, most likely by means of code across application protocols, most likely by means of
software libraries. software libraries.
o End users will be able to acquire more accurate expectations about o End users will be able to acquire more accurate expectations about
the characters that are acceptable in various contexts. Given the characters that are acceptable in various contexts. Given
this more uniform set of string classes, it is also expected that this more uniform set of string classes, it is also expected that
copy/paste operations between software implementing different copy/paste operations between software implementing different
application protocols will be more predictable and coherent. application protocols will be more predictable and coherent.
Whereas the string classes define the "baseline" code points for a Whereas the string classes define the "baseline" code points for a
range of applications, profiling enables application protocols to range of applications, profiling enables application protocols to
apply the string classes in ways that are appropriate for common apply the string classes in ways that are appropriate for common
constructs such as usernames [I-D.ietf-precis-saslprepbis], opaque constructs such as usernames [PRECIS-Users-Pwds], opaque strings such
strings such as passwords [I-D.ietf-precis-saslprepbis], and as passwords [PRECIS-Users-Pwds], and nicknames [PRECIS-Nickname].
nicknames [I-D.ietf-precis-nickname]. Profiles are responsible for Profiles are responsible for defining the handling of right-to-left
defining the handling of right-to-left characters as well as various characters as well as various mapping operations of the kind also
mapping operations of the kind also discussed for IDNs in [RFC5895], discussed for IDNs in [RFC5895], such as case preservation or
such as case preservation or lowercasing, Unicode normalization, lowercasing, Unicode normalization, mapping of certain characters to
mapping of certain characters to other characters or to nothing, and other characters or to nothing, and mapping of fullwidth and
mapping of full-width and half-width characters. halfwidth characters.
When an application applies a profile of a PRECIS string class, it When an application applies a profile of a PRECIS string class, it
transforms an input string (which might or might not be conforming) transforms an input string (which might or might not be conforming)
into an output string that definitively conforms to the profile. In into an output string that definitively conforms to the profile. In
particular, this document focuses on the resulting ability to achieve particular, this document focuses on the resulting ability to achieve
the following objectives: the following objectives:
a. Enforcing all the the rules of a profile for a single output a. Enforcing all the rules of a profile for a single output string
string (e.g., to determine if a string can be included in a (e.g., to determine if a string can be included in a protocol
protocol slot, communicated to another entity within a protocol, slot, communicated to another entity within a protocol, stored in
stored in a retrieval system, etc.). a retrieval system, etc.).
b. Comparing two output strings to determine if they are equivalent, b. Comparing two output strings to determine if they are equivalent,
typically through octet-for-octet matching to test for "bit- typically through octet-for-octet matching to test for
string identity" (e.g., to make an access decision for purposes "bit-string identity" (e.g., to make an access decision for
of authentication or authorization as further described in purposes of authentication or authorization as further described
[RFC6943]). in [RFC6943]).
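Objectives (a) and (b) can be sketched as two small functions. This is a minimal sketch under stated assumptions: enforce_toy_profile is a hypothetical profile (lowercasing followed by NFC normalization) invented for this example and does not correspond to any registered PRECIS profile, and UTF-8 is assumed as the encoding used for the octet-for-octet match.

```python
import unicodedata


def enforce_toy_profile(s: str) -> str:
    """Hypothetical profile: lowercase, then apply NFC normalization.

    Raises ValueError if the result is empty. A real profile would also
    check every code point against its PRECIS string class.
    """
    out = unicodedata.normalize("NFC", s.lower())
    if not out:
        raise ValueError("empty string after enforcement")
    return out


def equivalent(a: str, b: str) -> bool:
    """Compare two strings octet for octet after enforcing the profile."""
    octets_a = enforce_toy_profile(a).encode("utf-8")
    octets_b = enforce_toy_profile(b).encode("utf-8")
    return octets_a == octets_b
```

With these definitions, equivalent("Cafe\u0301", "caf\u00e9") holds, because both inputs yield the same octets once the toy profile has been enforced.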
The opportunity to define profiles naturally introduces the The opportunity to define profiles naturally introduces the
possibility of a proliferation of profiles, thus potentially possibility of a proliferation of profiles, thus potentially
mitigating the benefits of common code and violating user mitigating the benefits of common code and violating user
expectations. See Section 5 for a discussion of this important expectations. See Section 5 for a discussion of this important
topic. topic.
In addition, it is extremely important for protocol designers and In addition, it is extremely important for protocol designers and
application developers to understand that the transformation of an application developers to understand that the transformation of an
input string to an output string is rarely reversible. As one input string to an output string is rarely reversible. As one
skipping to change at page 6, line 37 skipping to change at page 6, line 46
capitalization of the first and third characters would be lost. capitalization of the first and third characters would be lost.
Similar considerations apply to other forms of mapping and Similar considerations apply to other forms of mapping and
normalization. normalization.
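The loss of information can be seen by mapping several distinct inputs and observing that they collapse to one output. The sketch below assumes a hypothetical mapping (plain lowercasing via Python's str.lower) and uses made-up input strings; it is not the case mapping rule of any particular profile.

```python
def lossy_map(s: str) -> str:
    """Hypothetical one-way mapping: lowercase the input string."""
    return s.lower()


# Three distinct inputs that differ only in capitalization.
inputs = ["ExAmple", "example", "EXAMPLE"]

# All of them map to the same output, so the output alone cannot
# tell a receiver which of the inputs was originally supplied.
outputs = {lossy_map(s) for s in inputs}
collapsed = len(outputs) == 1
```

An application that needs the original form for display therefore has to store it separately from the enforced output.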
Although this framework is similar to IDNA2008 and includes by Although this framework is similar to IDNA2008 and includes by
reference some of the character categories defined in [RFC5892], it reference some of the character categories defined in [RFC5892], it
defines additional character categories to meet the needs of common defines additional character categories to meet the needs of common
application protocols other than DNS. application protocols other than DNS.
The character categories and calculation rules defined under The character categories and calculation rules defined under
Section 8 and Section 9 are normative and apply to all Unicode code Sections 8 and 9 are normative and apply to all Unicode code points.
points. The code point table that results from applying the The code point table that results from applying the character
character categories and calculation rules to the latest version of categories and calculation rules to the latest version of Unicode can
Unicode can be found in an IANA registry. be found in an IANA registry.
2. Terminology 2. Terminology
Many important terms used in this document are defined in [RFC5890], Many important terms used in this document are defined in [RFC5890],
[RFC6365], [RFC6885], and [Unicode7.0]. The terms "left-to-right" [RFC6365], [RFC6885], and [Unicode]. The terms "left-to-right" (LTR)
(LTR) and "right-to-left" (RTL) are defined in Unicode Standard Annex and "right-to-left" (RTL) are defined in Unicode Standard Annex #9
#9 [UAX9]. [UAX9].
As of the date of writing, the version of Unicode published by the As of the date of writing, the version of Unicode published by the
Unicode Consortium is 7.0 [Unicode7.0]; however, PRECIS is not tied Unicode Consortium is 7.0 [Unicode7.0]; however, PRECIS is not tied
to a specific version of Unicode. The latest version of Unicode is to a specific version of Unicode. The latest version of Unicode is
always available [UnicodeCurrent]. always available [Unicode].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
[RFC2119]. [RFC2119].
3. Preparation, Enforcement, and Comparison 3. Preparation, Enforcement, and Comparison
This document distinguishes between three different actions that an This document distinguishes between three different actions that an
entity can take with regard to a string: entity can take with regard to a string:
skipping to change at page 8, line 28 skipping to change at page 8, line 41
FreeformClass: a sequence of letters, numbers, symbols, spaces, and FreeformClass: a sequence of letters, numbers, symbols, spaces, and
other characters that is used for free-form strings, including other characters that is used for free-form strings, including
passwords as well as display elements such as human-friendly passwords as well as display elements such as human-friendly
nicknames for devices or for participants in a chatroom; the nicknames for devices or for participants in a chatroom; the
intent is that this class will allow nearly any Unicode character, intent is that this class will allow nearly any Unicode character,
with the result that expressiveness has been prioritized over with the result that expressiveness has been prioritized over
safety for this class. Note well that protocol designers, safety for this class. Note well that protocol designers,
application developers, service providers, and end users might not application developers, service providers, and end users might not
understand or be able to enter all of the characters that can be understand or be able to enter all of the characters that can be
included in the FreeformClass - see Section 12.3 for details. included in the FreeformClass -- see Section 12.3 for details.
Future specifications might define additional PRECIS string classes, Future specifications might define additional PRECIS string classes,
such as a class that falls somewhere between the IdentifierClass and such as a class that falls somewhere between the IdentifierClass and
the FreeformClass. At this time, it is not clear how useful such a the FreeformClass. At this time, it is not clear how useful such a
class would be. In any case, because application developers are able class would be. In any case, because application developers are able
to define profiles of PRECIS string classes, a protocol needing a to define profiles of PRECIS string classes, a protocol needing a
construct between the IdentiferClass and the FreeformClass could construct between the IdentifierClass and the FreeformClass could
define a restricted profile of the FreeformClass if needed. define a restricted profile of the FreeformClass if needed.
The following subsections discuss the IdentifierClass and The following subsections discuss the IdentifierClass and
FreeformClass in more detail, with reference to the dimensions FreeformClass in more detail, with reference to the dimensions
described in Section 3 of [RFC6885]. Each string class is defined by described in Section 5 of [RFC6885]. Each string class is defined by
the following behavioral rules: the following behavioral rules:
Valid: Defines which code points are treated as valid for the Valid: Defines which code points are treated as valid for the
string. string.
Contextual Rule Required: Defines which code points are treated as Contextual Rule Required: Defines which code points are treated as
allowed only if the requirements of a contextual rule are met allowed only if the requirements of a contextual rule are met
(i.e., either CONTEXTJ or CONTEXTO). (i.e., either CONTEXTJ or CONTEXTO).
Disallowed: Defines which code points need to be excluded from the Disallowed: Defines which code points need to be excluded from the
skipping to change at page 9, line 19 skipping to change at page 9, line 34
This document defines the valid, contextual rule required, This document defines the valid, contextual rule required,
disallowed, and unassigned rules for the IdentifierClass and disallowed, and unassigned rules for the IdentifierClass and
FreeformClass. As described under Section 5, profiles of these FreeformClass. As described under Section 5, profiles of these
string classes are responsible for defining the width mapping, string classes are responsible for defining the width mapping,
additional mappings, case mapping, normalization, and directionality additional mappings, case mapping, normalization, and directionality
rules. rules.
4.2. IdentifierClass 4.2. IdentifierClass
Most application technologies need strings that can be used to refer Most application technologies need strings that can be used to refer
to, include, or communicate protocol strings like usernames, file to, include, or communicate protocol strings like usernames,
names, data feed identifiers, and chatroom names. We group such filenames, data feed identifiers, and chatroom names. We group such
strings into a class called "IdentifierClass" having the following strings into a class called "IdentifierClass" having the following
features. features.
4.2.1. Valid 4.2.1. Valid
o Code points traditionally used as letters and numbers in writing o Code points traditionally used as letters and numbers in writing
systems, i.e., the LetterDigits ("A") category first defined in systems, i.e., the LetterDigits ("A") category first defined in
[RFC5892] and listed here under Section 9.1. [RFC5892] and listed here under Section 9.1.
o Code points in the range U+0021 through U+007E, i.e., the o Code points in the range U+0021 through U+007E, i.e., the
(printable) ASCII7 ("K") rule defined under Section 9.11. These (printable) ASCII7 ("K") category defined under Section 9.11.
code points are "grandfathered" into PRECIS and thus are valid These code points are "grandfathered" into PRECIS and thus are
even if they would otherwise be disallowed according to the valid even if they would otherwise be disallowed according to the
property-based rules specified in the next section. property-based rules specified in the next section.
Note: Although the PRECIS IdentifierClass re-uses the LetterDigits Note: Although the PRECIS IdentifierClass reuses the LetterDigits
category from IDNA2008, the range of characters allowed in the category from IDNA2008, the range of characters allowed in the
IdentifierClass is wider than the range of characters allowed in IdentifierClass is wider than the range of characters allowed in
IDNA2008. The main reason is that IDNA2008 applies the Unstable IDNA2008. The main reason is that IDNA2008 applies the Unstable
category before the LetterDigits category, thus disallowing category before the LetterDigits category, thus disallowing
uppercase characters, whereas the IdentifierClass does not apply uppercase characters, whereas the IdentifierClass does not apply
the Unstable category. the Unstable category.
4.2.2. Contextual Rule Required 4.2.2. Contextual Rule Required
o A number of characters from the Exceptions ("F") category defined o A number of characters from the Exceptions ("F") category defined
skipping to change at page 10, line 38 skipping to change at page 11, line 8
according to the property-based rules specified in the previous according to the property-based rules specified in the previous
section. section.
o Letters and digits other than the "traditional" letters and digits o Letters and digits other than the "traditional" letters and digits
allowed in IDNs, i.e., the OtherLetterDigits ("R") category allowed in IDNs, i.e., the OtherLetterDigits ("R") category
defined under Section 9.18. defined under Section 9.18.
4.2.4. Unassigned 4.2.4. Unassigned
Any code points that are not yet designated in the Unicode character Any code points that are not yet designated in the Unicode character
set are considered Unassigned for purposes of the IdentifierClass, set are considered unassigned for purposes of the IdentifierClass,
and such code points are to be treated as Disallowed. See and such code points are to be treated as disallowed. See
Section 9.10. Section 9.10.
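As a rough illustration of the "not yet designated" test, the sketch below asks Python's bundled Unicode database whether a code point has General_Category Cn and is not a noncharacter. It assumes that the definition in Section 9.10 matches the Unassigned category of [RFC5892]; the function names are hypothetical, and the result depends on the Unicode version shipped with the Python interpreter in use.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    # Noncharacters are U+FDD0..U+FDEF plus the last two code points
    # of every plane (U+xxFFFE and U+xxFFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_unassigned(cp: int) -> bool:
    # Assumption: "unassigned" means General_Category Cn and not a
    # noncharacter, as in the Unassigned category of RFC 5892.
    # The answer reflects the Unicode version of this interpreter
    # (see unicodedata.unidata_version).
    return unicodedata.category(chr(cp)) == "Cn" and not is_noncharacter(cp)
```

Because the check is property based, a code point that is unassigned today can become assigned when the underlying Unicode database is upgraded, without any change to the code.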
4.2.5. Examples 4.2.5. Examples
As described in the Introduction to this document, the string classes As described in the Introduction to this document, the string classes
do not handle all issues related to string preparation and comparison do not handle all issues related to string preparation and comparison
(such as case mapping); instead, such issues are handled at the level (such as case mapping); instead, such issues are handled at the level
of profiles. Examples for two profiles of the IdentifierClass can be of profiles. Examples for profiles of the IdentifierClass can be
found in [I-D.ietf-precis-saslprepbis] (the UsernameIdentifierClass found in [PRECIS-Users-Pwds] (the UsernameCaseMapped and
profile) and in [I-D.ietf-xmpp-6122bis] (the LocalpartIdentifierClass UsernameCasePreserved profiles).
profile).
4.3. FreeformClass 4.3. FreeformClass
Some application technologies need strings that can be used in a Some application technologies need strings that can be used in a
free-form way, e.g., as a password in an authentication exchange (see free-form way, e.g., as a password in an authentication exchange (see
[I-D.ietf-precis-saslprepbis]) or a nickname in a chatroom (see [PRECIS-Users-Pwds]) or a nickname in a chatroom (see
[I-D.ietf-precis-nickname]). We group such things into a class [PRECIS-Nickname]). We group such things into a class called
called "FreeformClass" having the following features. "FreeformClass" having the following features.
Security Warning: As mentioned, the FreeformClass prioritizes Security Warning: As mentioned, the FreeformClass prioritizes
expressiveness over safety; Section 12.3 describes some of the expressiveness over safety; Section 12.3 describes some of the
security hazards involved with using or profiling the security hazards involved with using or profiling the
FreeformClass. FreeformClass.
Security Warning: Consult Section 12.6 for relevant security Security Warning: Consult Section 12.6 for relevant security
considerations when strings conforming to the FreeformClass, or a considerations when strings conforming to the FreeformClass, or a
profile thereof, are used as passwords. profile thereof, are used as passwords.
skipping to change at page 11, line 33 skipping to change at page 11, line 49
o Traditional letters and numbers, i.e., the LetterDigits ("A") o Traditional letters and numbers, i.e., the LetterDigits ("A")
category first defined in [RFC5892] and listed here under category first defined in [RFC5892] and listed here under
Section 9.1. Section 9.1.
o Letters and digits other than the "traditional" letters and digits o Letters and digits other than the "traditional" letters and digits
allowed in IDNs, i.e., the OtherLetterDigits ("R") category allowed in IDNs, i.e., the OtherLetterDigits ("R") category
defined under Section 9.18. defined under Section 9.18.
o Code points in the range U+0021 through U+007E, i.e., the o Code points in the range U+0021 through U+007E, i.e., the
(printable) ASCII7 ("K") rule defined under Section 9.11. (printable) ASCII7 ("K") category defined under Section 9.11.
o Any character that has a compatibility equivalent, i.e., the o Any character that has a compatibility equivalent, i.e., the
HasCompat ("Q") category defined under Section 9.17. HasCompat ("Q") category defined under Section 9.17.
o Space characters, i.e., the Spaces ("N") category defined under o Space characters, i.e., the Spaces ("N") category defined under
Section 9.14. Section 9.14.
o Symbol characters, i.e., the Symbols ("O") category defined under o Symbol characters, i.e., the Symbols ("O") category defined under
Section 9.15. Section 9.15.
skipping to change at page 12, line 22 skipping to change at page 12, line 36
o Control characters, i.e., the Controls ("L") category defined o Control characters, i.e., the Controls ("L") category defined
under Section 9.12. under Section 9.12.
o Ignorable characters, i.e., the PrecisIgnorableProperties ("M") o Ignorable characters, i.e., the PrecisIgnorableProperties ("M")
category defined under Section 9.13. category defined under Section 9.13.
4.3.4. Unassigned 4.3.4. Unassigned
Any code points that are not yet designated in the Unicode character Any code points that are not yet designated in the Unicode character
set are considered Unassigned for purposes of the FreeformClass, and set are considered unassigned for purposes of the FreeformClass, and
such code points are to be treated as Disallowed. such code points are to be treated as disallowed.
4.3.5. Examples 4.3.5. Examples
As described in the Introduction to this document, the string classes As described in the Introduction to this document, the string classes
do not handle all issues related to string preparation and comparison do not handle all issues related to string preparation and comparison
(such as case mapping); instead, such issues are handled at the level (such as case mapping); instead, such issues are handled at the level
of profiles. Examples for two profiles of the FreeformClass can be of profiles. Examples for profiles of the FreeformClass can be found
found in [I-D.ietf-precis-nickname] (the NicknameFreeformClass in [PRECIS-Users-Pwds] (the OpaqueString profile) and
profile) and in [I-D.ietf-xmpp-6122bis] (the [PRECIS-Nickname] (the Nickname profile).
ResourcepartIdentifierClass profile).
5. Profiles 5. Profiles
This framework document defines the valid, contextual-rule-required, This framework document defines the valid, contextual-rule-required,
disallowed, and unassigned rules for the IdentifierClass and the disallowed, and unassigned rules for the IdentifierClass and the
FreeformClass. A profile of a PRECIS string class MUST define the FreeformClass. A profile of a PRECIS string class MUST define the
width mapping, additional mappings (if any), case mapping, width mapping, additional mappings (if any), case mapping,
normalization, and directionality rules. A profile MAY also restrict normalization, and directionality rules. A profile MAY also restrict
the allowable characters above and beyond the definition of the the allowable characters above and beyond the definition of the
relevant PRECIS string class (but MUST NOT add as valid any code relevant PRECIS string class (but MUST NOT add as valid any code
points that are disallowed by the relevant PRECIS string class). points that are disallowed by the relevant PRECIS string class).
These matters are discussed in the following subsections. These matters are discussed in the following subsections.
Profiles of the PRECIS string classes are registered with the IANA as Profiles of the PRECIS string classes are registered with the IANA as
described under Section 11.3. Profile names use the following described under Section 11.3. Profile names use the following
convention: they are of the form "Profilename of BaseClass", where convention: they are of the form "Profilename of BaseClass", where
the "Profilename" string is a differentiator and "BaseClass" is the the "Profilename" string is a differentiator and "BaseClass" is the
name of the PRECIS string class being profiled; for example, the name of the PRECIS string class being profiled; for example, the
profile of the Freeform used for opaque strings such as passwords is profile of the FreeformClass used for opaque strings such as
the "OpaqueString" profile [I-D.ietf-precis-saslprepbis]. passwords is the OpaqueString profile [PRECIS-Users-Pwds].
5.1. Profiles Must Not Be Multiplied Beyond Necessity 5.1. Profiles Must Not Be Multiplied beyond Necessity
The risk of profile proliferation is significant because having too The risk of profile proliferation is significant because having too
many profiles will result in different behavior across various many profiles will result in different behavior across various
applications, thus violating what is known in user interface design applications, thus violating what is known in user interface design
as the Principle of Least Astonishment. as the "Principle of Least Astonishment".
Indeed, we already have too many profiles. Ideally we would have at Indeed, we already have too many profiles. Ideally we would have at
most two or three profiles. Unfortunately, numerous application most two or three profiles. Unfortunately, numerous application
protocols exist with their own quirks regarding protocol strings. protocols exist with their own quirks regarding protocol strings.
Domain names, email addresses, instant messaging addresses, chatroom Domain names, email addresses, instant messaging addresses, chatroom
nicknames, filenames, authentication identifiers, passwords, and nicknames, filenames, authentication identifiers, passwords, and
other strings are already out there in the wild and need to be other strings are already out there in the wild and need to be
supported in existing application protocols such as DNS, SMTP, XMPP, supported in existing application protocols such as DNS, SMTP, the
IRC, NFS, iSCSI, EAP, and SASL among others. Extensible Messaging and Presence Protocol (XMPP), Internet Relay
Chat (IRC), NFS, the Internet Small Computer System Interface
(iSCSI), the Extensible Authentication Protocol (EAP), and the Simple
Authentication and Security Layer (SASL), among others.
Nevertheless, profiles must not be multiplied beyond necessity. Nevertheless, profiles must not be multiplied beyond necessity.
To help prevent profile proliferation, this document recommends To help prevent profile proliferation, this document recommends
sensible defaults for the various options offered to profile creators sensible defaults for the various options offered to profile creators
(such as width mapping and Unicode normalization). In addition, the (such as width mapping and Unicode normalization). In addition, the
guidelines for designated experts provided under Section 10 are meant guidelines for designated experts provided under Section 10 are meant
to encourage a high level of due diligence regarding new profiles. to encourage a high level of due diligence regarding new profiles.
5.2. Rules 5.2. Rules
5.2.1. Width Mapping Rule 5.2.1. Width Mapping Rule
The width mapping rule of a profile specifies whether width mapping The width mapping rule of a profile specifies whether width mapping
is performed on the characters of a string, and how the mapping is is performed on the characters of a string, and how the mapping is
done. Typically such mapping consists of mapping fullwidth and done. Typically, such mapping consists of mapping fullwidth and
halfwidth characters, i.e., code points with a Decomposition Type of halfwidth characters, i.e., code points with a Decomposition Type of
Wide or Narrow, to their decomposition mappings; as an example, Wide or Narrow, to their decomposition mappings; as an example,
FULLWIDTH DIGIT ZERO (U+FF10) would be mapped to DIGIT ZERO (U+0030). FULLWIDTH DIGIT ZERO (U+FF10) would be mapped to DIGIT ZERO (U+0030).
The normalization form specified by a profile (see below) has an The normalization form specified by a profile (see below) has an
impact on the need for width mapping. Because width mapping is impact on the need for width mapping. Because width mapping is
performed as a part of compatibility decomposition, a profile performed as a part of compatibility decomposition, a profile
employing either normalization form KD (NFKD) or normalization form employing either normalization form KD (NFKD) or normalization form
KC (NFKC) does not need to specify width mapping. However, if KC (NFKC) does not need to specify width mapping. However, if
Unicode normalization form C (NFC) is used (as is recommended) then Unicode normalization form C (NFC) is used (as is recommended) then
the profile needs to specify whether to apply width mapping; in this the profile needs to specify whether to apply width mapping; in this
case, width mapping is in general RECOMMENDED because allowing case, width mapping is in general RECOMMENDED because allowing
fullwidth and halfwidth characters to remain unmapped to their fullwidth and halfwidth characters to remain unmapped to their
compatibility variants would violate the Principle of Least compatibility variants would violate the "Principle of Least
Astonishment. For more information about the concept of width in Astonishment". For more information about the concept of width in
East Asian scripts within Unicode, see Unicode Standard Annex #11 East Asian scripts within Unicode, see Unicode Standard Annex #11
[UAX11]. [UAX11].
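The width mapping described above can be sketched as follows. This is a minimal illustration, not a normative implementation: the function name width_map is hypothetical, and the sketch relies on Python's unicodedata.decomposition(), which returns strings such as "<wide> 0030" for code points whose Decomposition Type is Wide or Narrow.

```python
import unicodedata


def width_map(s: str) -> str:
    """Map fullwidth and halfwidth characters to their decomposition mappings."""
    out = []
    for ch in s:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0030"
        if decomp.startswith(("<wide>", "<narrow>")):
            # Drop the "<wide>"/"<narrow>" tag; the rest are hex code points.
            out.append("".join(chr(int(h, 16)) for h in decomp.split()[1:]))
        else:
            # Other decomposition types (or none) are left untouched.
            out.append(ch)
    return "".join(out)
```

Unlike NFKC or NFKD, this touches only width variants; other compatibility characters, such as the ligature U+FB01, pass through unchanged.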
5.2.2. Additional Mapping Rule 5.2.2. Additional Mapping Rule
The additional mapping rule of a profile specifies whether additional The additional mapping rule of a profile specifies whether additional
mappings is performed on the characters of a string, such as: mappings are performed on the characters of a string, such as:
Mapping of delimiter characters (such as '@', ':', '/', '+', and Mapping of delimiter characters (such as '@', ':', '/', '+',
'-') and '-')
Mapping of special characters (e.g., non-ASCII space characters to Mapping of special characters (e.g., non-ASCII space characters to
ASCII space or control characters to nothing). ASCII space or control characters to nothing).
The PRECIS mappings document [I-D.ietf-precis-mappings] describes The PRECIS mappings document [PRECIS-Mappings] describes such
such mappings in more detail. mappings in more detail.
5.2.3. Case Mapping Rule 5.2.3. Case Mapping Rule
The case mapping rule of a profile specifies whether case mapping The case mapping rule of a profile specifies whether case mapping
(instead of case preservation) is performed on the characters of a (instead of case preservation) is performed on the characters of a
string, and how the mapping is applied (e.g., mapping uppercase and string, and how the mapping is applied (e.g., mapping uppercase and
titlecase characters to their lowercase equivalents). titlecase characters to their lowercase equivalents).
If case mapping is desired (instead of case preservation), it is If case mapping is desired (instead of case preservation), it is
RECOMMENDED to use Unicode Default Case Folding as defined in Chapter RECOMMENDED to use Unicode Default Case Folding as defined in the
3 of the Unicode Standard [Unicode7.0]. Unicode Standard [Unicode] (at the time of this writing, the
algorithm is specified in Chapter 3 of [Unicode7.0]).
Note: Unicode Default Case Folding is not designed to handle Note: Unicode Default Case Folding is not designed to handle
various localization issues (such as so-called "dotless i" in various localization issues (such as so-called "dotless i" in
several Turkic languages). The PRECIS mappings document several Turkic languages). The PRECIS mappings document
[I-D.ietf-precis-mappings] describes these issues in greater [PRECIS-Mappings] describes these issues in greater detail and
detail and defines a "local case mapping" method that handles some defines a "local case mapping" method that handles some locale-
locale-dependent and context-dependent mappings. dependent and context-dependent mappings.
In order to maximize entropy and minimize the potential for false In order to maximize entropy and minimize the potential for false
positives, it is NOT RECOMMENDED for application protocols to map positives, it is NOT RECOMMENDED for application protocols to map
uppercase and titlecase code points to their lowercase equivalents uppercase and titlecase code points to their lowercase equivalents
when strings conforming to the FreeformClass, or a profile thereof, when strings conforming to the FreeformClass, or a profile thereof,
are used in passwords; instead, it is RECOMMENDED to preserve the are used in passwords; instead, it is RECOMMENDED to preserve the
case of all code points contained in such strings and then perform case of all code points contained in such strings and then perform
case-sensitive comparison. See also the related discussion in case-sensitive comparison. See also the related discussion in
[I-D.ietf-precis-saslprepbis]. Section 12.6 and in [PRECIS-Users-Pwds].
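The difference between case mapping and case preservation can be sketched with Python's str.casefold(), which the Python documentation describes as implementing the case folding algorithm of the Unicode Standard. The two helper names below are hypothetical and only illustrate the comparison step, not a complete profile.

```python
def equal_case_mapped(a: str, b: str) -> bool:
    # Case mapping: fold both strings before comparing.
    return a.casefold() == b.casefold()


def equal_case_preserved(a: str, b: str) -> bool:
    # Case preservation: compare code points as they are, which is
    # what the text above recommends for passwords.
    return a == b
```

Default case folding is locale independent, so LATIN CAPITAL LETTER I folds to "i" and never matches LATIN SMALL LETTER DOTLESS I (U+0131); handling that would take the kind of locale-dependent mapping mentioned in the note above.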
5.2.4. Normalization Rule 5.2.4. Normalization Rule
The normalization rule of a profile specifies which Unicode The normalization rule of a profile specifies which Unicode
normalization form (D, KD, C, or KC) is to be applied (see Unicode normalization form (D, KD, C, or KC) is to be applied (see Unicode
Standard Annex #15 [UAX15] for background information). Standard Annex #15 [UAX15] for background information).
In accordance with [RFC5198], normalization form C (NFC) is In accordance with [RFC5198], normalization form C (NFC) is
RECOMMENDED. RECOMMENDED.
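A short sketch of the effect of the normalization rule, using Python's unicodedata.normalize(). The helper name apply_normalization is hypothetical. The sketch also illustrates the point made in Section 5.2.1: NFC leaves fullwidth characters alone, whereas NFKC maps them as part of compatibility decomposition.

```python
import unicodedata


def apply_normalization(s: str, form: str = "NFC") -> str:
    # form is one of "NFC", "NFD", "NFKC", "NFKD"; NFC is the
    # recommended default per the text above.
    return unicodedata.normalize(form, s)


# "e" followed by COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
composed = apply_normalization("e\u0301")

# FULLWIDTH DIGIT ZERO survives NFC but is mapped by NFKC.
kept = apply_normalization("\uff10", "NFC")
mapped = apply_normalization("\uff10", "NFKC")
```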
skipping to change at page 16, line 19 skipping to change at page 16, line 40
research into the challenges of displaying bidirectional strings. research into the challenges of displaying bidirectional strings.
This document strongly suggests that profile authors who are thinking This document strongly suggests that profile authors who are thinking
about defining a new directionality rule think again, and instead about defining a new directionality rule think again, and instead
consider using the "Bidi Rule" [RFC5893] (for profiles based on the consider using the "Bidi Rule" [RFC5893] (for profiles based on the
IdentifierClass) or following the Unicode bidirectional algorithm IdentifierClass) or following the Unicode bidirectional algorithm
[UAX9] (for profiles based on the FreeformClass or in situations [UAX9] (for profiles based on the FreeformClass or in situations
where the IdentifierClass is not appropriate). where the IdentifierClass is not appropriate).
5.3. A Note about Spaces 5.3. A Note about Spaces
With regard to the IdentiferClass, the consensus of the PRECIS With regard to the IdentifierClass, the consensus of the PRECIS
Working Group was that spaces are problematic for many reasons, Working Group was that spaces are problematic for many reasons,
including: including the following:
o Many Unicode characters are confusable with ASCII space. o Many Unicode characters are confusable with ASCII space.
o Even if non-ASCII space characters are mapped to ASCII space o Even if non-ASCII space characters are mapped to ASCII space
(U+0020), space characters are often not rendered in user (U+0020), space characters are often not rendered in user
interfaces, leading to the possibility that a human user might interfaces, leading to the possibility that a human user might
consider a string containing spaces to be equivalent to the same consider a string containing spaces to be equivalent to the same
string without spaces. string without spaces.
o In some locales, some devices are known to generate a character o In some locales, some devices are known to generate a character
skipping to change at page 16, line 46 skipping to change at page 17, line 20
IdentifierClass might be to effectively discourage their use within IdentifierClass might be to effectively discourage their use within
identifiers created in newer application protocols; given the identifiers created in newer application protocols; given the
challenges involved with properly handling space characters challenges involved with properly handling space characters
(especially non-ASCII space characters) in identifiers and other (especially non-ASCII space characters) in identifiers and other
protocol strings, the PRECIS Working Group considered this to be a protocol strings, the PRECIS Working Group considered this to be a
feature, not a bug. feature, not a bug.
However, the FreeformClass does allow spaces, which enables However, the FreeformClass does allow spaces, which enables
application protocols to define profiles of the FreeformClass that application protocols to define profiles of the FreeformClass that
are more flexible than any profiles of the IdentifierClass. In are more flexible than any profiles of the IdentifierClass. In
addition, as explained in the previous section, application protocols addition, as explained in Section 6.3, application protocols can also
can also define application-layer constructs containing spaces. define application-layer constructs containing spaces.
6. Applications 6. Applications
6.1. How to Use PRECIS in Applications 6.1. How to Use PRECIS in Applications
Although PRECIS has been designed with applications in mind, Although PRECIS has been designed with applications in mind,
internationalization is not suddenly made easy though the use of internationalization is not suddenly made easy through the use of
PRECIS. Application developers still need to give some thought to PRECIS. Application developers still need to give some thought to
how they will use the PRECIS string classes, or profiles thereof, in how they will use the PRECIS string classes, or profiles thereof, in
their applications. This section provides some guidelines to their applications. This section provides some guidelines to
application developers (and to expert reviewers of application application developers (and to expert reviewers of application
protocol specifications). protocol specifications).
o Don't define your own profile unless absolutely necessary (see o Don't define your own profile unless absolutely necessary (see
Section 5.1). Existing profiles have been design for wide re-use. Section 5.1). Existing profiles have been designed for wide
It is highly likely that an existing profile will meet your needs, reuse. It is highly likely that an existing profile will meet
especially given the ability to specify further excluded your needs, especially given the ability to specify further
characters (Section 6.2) and to build application-layer constructs excluded characters (Section 6.2) and to build application-layer
(see Section 6.3). constructs (see Section 6.3).
o Do specify: o Do specify:
* Exactly which entities are responsible for preparation, * Exactly which entities are responsible for preparation,
enforcement, and comparison of internationalized strings (e.g., enforcement, and comparison of internationalized strings (e.g.,
servers or clients). servers or clients).
* Exactly when those entities need to complete their tasks (e.g., * Exactly when those entities need to complete their tasks (e.g.,
a server might need to enforce the rules of a profile before a server might need to enforce the rules of a profile before
allowing a client to gain network access). allowing a client to gain network access).
* Exactly which protocol slots need to be checked against which * Exactly which protocol slots need to be checked against which
profiles (e.g., checking the address of a message's intended profiles (e.g., checking the address of a message's intended
recipient against the UsernameCaseMapped profile recipient against the UsernameCaseMapped profile
[I-D.ietf-precis-saslprepbis] of the IdentifierClass, or [PRECIS-Users-Pwds] of the IdentifierClass, or checking the
checking the password of a user against the OpaqueString password of a user against the OpaqueString profile
profile [I-D.ietf-precis-saslprepbis] of the FreeformClass). [PRECIS-Users-Pwds] of the FreeformClass).
See [I-D.ietf-precis-saslprepbis] and [I-D.ietf-xmpp-6122bis] for See [PRECIS-Users-Pwds] and [XMPP-Addr-Format] for definitions of
definitions of these matters for several applications. these matters for several applications.
6.2. Further Excluded Characters 6.2. Further Excluded Characters
An application protocol that uses a profile MAY specify particular An application protocol that uses a profile MAY specify particular
code points that are not allowed in relevant slots within that code points that are not allowed in relevant slots within that
application protocol, above and beyond those excluded by the string application protocol, above and beyond those excluded by the string
class or profile. class or profile.
That is, an application protocol MAY do either of the following: That is, an application protocol MAY do either of the following:
skipping to change at page 18, line 16 skipping to change at page 18, line 35
string class. string class.
2. Exclude characters matching certain Unicode properties (e.g., 2. Exclude characters matching certain Unicode properties (e.g.,
math symbols) that are included in the relevant PRECIS string math symbols) that are included in the relevant PRECIS string
class. class.
As a result of such exclusions, code points that are defined as valid As a result of such exclusions, code points that are defined as valid
for the PRECIS string class or profile will be defined as disallowed for the PRECIS string class or profile will be defined as disallowed
for the relevant protocol slot. for the relevant protocol slot.
Typically, such exclusions are defined for the purpose of backward- Typically, such exclusions are defined for the purpose of backward
compatibility with legacy formats within an application protocol. compatibility with legacy formats within an application protocol.
These are defined for application protocols, not profiles, in order These are defined for application protocols, not profiles, in order
to prevent multiplication of profiles beyond necessity (see to prevent multiplication of profiles beyond necessity (see
Section 5.1). Section 5.1).
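An application-level exclusion of this kind might be sketched as below. Everything here is illustrative: the function name and parameters are hypothetical, and General_Category "Sm" (Symbol, Math) is used as a stand-in for "math symbols", which is an assumption rather than something this document specifies. The check is meant to run in addition to, not instead of, the string class or profile rules.

```python
import unicodedata


def first_excluded(s, excluded_chars=frozenset(), excluded_categories=frozenset({"Sm"})):
    """Return the first code point the application excludes, or None."""
    for ch in s:
        if ch in excluded_chars:
            return ch  # excluded individually by the application protocol
        if unicodedata.category(ch) in excluded_categories:
            return ch  # excluded by Unicode property
    return None
```

For example, first_excluded("a+b") reports "+", while first_excluded("a/b", excluded_chars=frozenset("/")) reports "/" because the slot forbids that one delimiter.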
6.3. Building Application-Layer Constructs 6.3. Building Application-Layer Constructs
Sometimes, an application-layer construct does not map in a Sometimes, an application-layer construct does not map in a
straightforward manner to one of the base string classes or a profile straightforward manner to one of the base string classes or a profile
thereof. Consider, for example, the "simple user name" construct in thereof. Consider, for example, the "simple user name" construct in
the Simple Authentication and Security Layer (SASL) [RFC4422]. the Simple Authentication and Security Layer (SASL) [RFC4422].
Depending on the deployment, a simple user name might take the form Depending on the deployment, a simple user name might take the form
of a user's full name (e.g., the user's personal name followed by a of a user's full name (e.g., the user's personal name followed by a
space and then the user's family name). Such a simple user name space and then the user's family name). Such a simple user name
cannot be defined as an instance of the IdentifierClass or a profile cannot be defined as an instance of the IdentifierClass or a profile
thereof, since space characters are not allowed in the thereof, since space characters are not allowed in the
IdentifierClass; however, it could be defined using a space-separated IdentifierClass; however, it could be defined using a space-separated
sequence of IdentifierClass instances, as in the following ABNF sequence of IdentifierClass instances, as in the following ABNF
[RFC5234] from [I-D.ietf-precis-saslprepbis]: [RFC5234] from [PRECIS-Users-Pwds]:
username = userpart *(1*SP userpart) username = userpart *(1*SP userpart)
userpart = 1*(idbyte) userpart = 1*(idbyte)
; ;
; an "idbyte" is a byte used to represent a ; an "idbyte" is a byte used to represent a
; UTF-8 encoded Unicode code point that can be ; UTF-8 encoded Unicode code point that can be
; contained in a string that conforms to the ; contained in a string that conforms to the
; PRECIS "IdentifierClass" ; PRECIS "IdentifierClass"
; ;
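The ABNF above can be read as "one or more userparts separated by runs of ASCII space". A minimal sketch of that reading follows. The names are hypothetical, and placeholder_is_id_char is only a crude stand-in for the IdentifierClass check (it is NOT the derived property calculation defined in this document); a real implementation would pass in a proper predicate.

```python
import re


def placeholder_is_id_char(ch: str) -> bool:
    # Crude stand-in, for illustration only: letters, digits, and
    # printable ASCII (U+0021 through U+007E).
    return ch.isalnum() or 0x21 <= ord(ch) <= 0x7E


def split_username(username: str, is_id_char=placeholder_is_id_char):
    """Return the list of userparts, or None if the string does not match."""
    parts = re.split(r" +", username)  # separators are 1*SP
    if any(part == "" for part in parts):
        return None  # empty string, or leading/trailing space
    if not all(is_id_char(ch) for part in parts for ch in part):
        return None  # some code point is not allowed in a userpart
    return parts
```

Each returned userpart would then be prepared and enforced separately against the chosen profile of the IdentifierClass; the spaces themselves never enter an IdentifierClass string.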
skipping to change at page 19, line 28 skipping to change at page 19, line 43
5. Directionality Rule 5. Directionality Rule
6. Behavioral rules for determining whether a code point is valid, 6. Behavioral rules for determining whether a code point is valid,
allowed under a contextual rule, disallowed, or unassigned allowed under a contextual rule, disallowed, or unassigned
As already described, the width mapping, additional mapping, case As already described, the width mapping, additional mapping, case
mapping, normalization, and directionality rules are specified for mapping, normalization, and directionality rules are specified for
each profile, whereas the behavioral rules are specified for each each profile, whereas the behavioral rules are specified for each
string class. Some of the logic behind this order is provided under string class. Some of the logic behind this order is provided under
Section 5.2.1 (see also the PRECIS mappings document Section 5.2.1 (see also the PRECIS mappings document
[I-D.ietf-precis-mappings]). [PRECIS-Mappings]).
8. Code Point Properties 8. Code Point Properties
In order to implement the string classes described above, this In order to implement the string classes described above, this
document does the following: document does the following:
1. Reviews and classifies the collections of code points in the 1. Reviews and classifies the collections of code points in the
Unicode character set by examining various code point properties. Unicode character set by examining various code point properties.
2. Defines an algorithm for determining a derived property value, 2. Defines an algorithm for determining a derived property value,
skipping to change at page 20, line 16 skipping to change at page 20, line 39
be used in specific string classes. In the remainder of this be used in specific string classes. In the remainder of this
document, the abbreviated term *_PVAL is used, where * = (ID | document, the abbreviated term *_PVAL is used, where * = (ID |
FREE), i.e., either "FREE_PVAL" or "ID_PVAL". In practice, the FREE), i.e., either "FREE_PVAL" or "ID_PVAL". In practice, the
derived property ID_PVAL is not used in this specification, since derived property ID_PVAL is not used in this specification, since
every ID_PVAL code point is PVALID. every ID_PVAL code point is PVALID.
CONTEXTUAL RULE REQUIRED Some characteristics of the character, such CONTEXTUAL RULE REQUIRED Some characteristics of the character, such
as its being invisible in certain contexts or problematic in as its being invisible in certain contexts or problematic in
others, require that it not be used in labels unless specific others, require that it not be used in labels unless specific
other characters or properties are present. As in IDNA2008, there other characters or properties are present. As in IDNA2008, there
are two subdivisions of CONTEXTUAL RULE REQUIRED, the first for are two subdivisions of CONTEXTUAL RULE REQUIRED -- the first for
Join_controls (called "CONTEXTJ") and the second for other Join_controls (called "CONTEXTJ") and the second for other
characters (called "CONTEXTO"). A character with the derived characters (called "CONTEXTO"). A character with the derived
property value CONTEXTJ or CONTEXTO MUST NOT be used unless an property value CONTEXTJ or CONTEXTO MUST NOT be used unless an
appropriate rule has been established and the context of the appropriate rule has been established and the context of the
character is consistent with that rule. The most notable of the character is consistent with that rule. The most notable of the
CONTEXTUAL RULE REQUIRED characters are the Join Control CONTEXTUAL RULE REQUIRED characters are the Join Control
characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON- characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH
JOINER, which have a derived property value of CONTEXTJ. See NON-JOINER, which have a derived property value of CONTEXTJ. See
Appendix A of [RFC5892] for more information. Appendix A of [RFC5892] for more information.
DISALLOWED Those code points that are not permitted in any PRECIS DISALLOWED Those code points that are not permitted in any PRECIS
string class. string class.
SPECIFIC CLASS DISALLOWED Those code points that are not to be SPECIFIC CLASS DISALLOWED Those code points that are not to be
included in one of the string classes but that might be permitted included in one of the string classes but that might be permitted
in others. In the remainder of this document, the abbreviated in others. In the remainder of this document, the abbreviated
term *_DIS is used, where * = (ID | FREE), i.e., either "FREE_DIS" term *_DIS is used, where * = (ID | FREE), i.e., either "FREE_DIS"
or "ID_DIS". In practice, the derived property FREE_DIS is not or "ID_DIS". In practice, the derived property FREE_DIS is not
used in this specification, since every FREE_DIS code point is used in this specification, since every FREE_DIS code point is
DISALLOWED. DISALLOWED.
UNASSIGNED Those code points that are not designated (i.e. are UNASSIGNED Those code points that are not designated (i.e., are
unassigned) in the Unicode Standard. unassigned) in the Unicode Standard.
The algorithm to calculate the value of the derived property is as The algorithm to calculate the value of the derived property is as
follows (implementations MUST NOT modify the order of operations follows (implementations MUST NOT modify the order of operations
within this algorithm, since doing so would cause inconsistent within this algorithm, since doing so would cause inconsistent
results across implementations): results across implementations):
If .cp. .in. Exceptions Then Exceptions(cp); If .cp. .in. Exceptions Then Exceptions(cp);
Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp); Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp);
Else If .cp. .in. Unassigned Then UNASSIGNED; Else If .cp. .in. Unassigned Then UNASSIGNED;
skipping to change at page 21, line 45 skipping to change at page 22, line 12
function call (such as "Exceptions(cp)") implies the value that the function call (such as "Exceptions(cp)") implies the value that the
code point has in the Exceptions table. code point has in the Exceptions table.
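
The ordering requirement can be illustrated with a minimal Python sketch. It covers only the three steps visible in this excerpt (Exceptions, BackwardCompatible, Unassigned); the later steps are elided here and are not reproduced. The names first_three_steps, is_unassigned, and is_noncharacter are hypothetical. The Unassigned test is assumed to follow the Unassigned (J) definition in Section 2.10 of [RFC5892], and the result depends on the Unicode version bundled with the Python runtime.

```python
import unicodedata
from typing import Mapping, Optional


def is_noncharacter(cp: int) -> bool:
    # The 66 noncharacters: U+FDD0..U+FDEF plus the last two code points
    # of every plane (U+nFFFE and U+nFFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_unassigned(cp: int) -> bool:
    # Assumption: General_Category is Cn and the code point is not a
    # noncharacter, per the Unassigned (J) category of RFC 5892.
    return unicodedata.category(chr(cp)) == "Cn" and not is_noncharacter(cp)


def first_three_steps(
    cp: int,
    exceptions: Mapping[int, str],
    backward_compatible: Mapping[int, str],
) -> Optional[str]:
    # The checks run in the fixed order given by the algorithm; the first
    # match wins, so a code point listed in Exceptions never reaches the
    # later tables.
    if cp in exceptions:
        return exceptions[cp]
    if cp in backward_compatible:
        return backward_compatible[cp]
    if is_unassigned(cp):
        return "UNASSIGNED"
    return None  # the remaining steps are not shown in this excerpt


# Illustrative subset of the Exceptions table (Section 2.6 of RFC 5892);
# consult that document for the authoritative list.
EXCEPTIONS_SUBSET = {0x00DF: "PVALID", 0x00B7: "CONTEXTO", 0x0640: "DISALLOWED"}
```

Passing the tables as arguments keeps the sketch independent of any particular registry snapshot; because the BackwardCompatible category is described as the empty set at the time of writing, an empty mapping can be supplied for it.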
The mechanisms described here allow determination of the value of the The mechanisms described here allow determination of the value of the
property for future versions of Unicode (including characters added property for future versions of Unicode (including characters added
after Unicode 5.2 or 7.0 depending on the category, since some after Unicode 5.2 or 7.0 depending on the category, since some
categories mentioned in this document are simply pointers to IDNA2008 categories mentioned in this document are simply pointers to IDNA2008
and therefore were defined at the time of Unicode 5.2). Changes in and therefore were defined at the time of Unicode 5.2). Changes in
Unicode properties that do not affect the outcome of this process Unicode properties that do not affect the outcome of this process
therefore do not affect this framework. For example, a character can therefore do not affect this framework. For example, a character can
have its Unicode General_Category value (see Chapter 4 of the Unicode have its Unicode General_Category value (at the time of this writing,
Standard [Unicode7.0]) change from So to Sm, or from Lo to Ll, see Chapter 4 of [Unicode7.0]) change from So to Sm, or from Lo to
without affecting the algorithm results. Moreover, even if such Ll, without affecting the algorithm results. Moreover, even if such
changes were to result, the BackwardCompatible list (Section 9.7) can changes were to result, the BackwardCompatible list (Section 9.7) can
be adjusted to ensure the stability of the results. be adjusted to ensure the stability of the results.
9. Category Definitions Used to Calculate Derived Property 9. Category Definitions Used to Calculate Derived Property
The derived property obtains its value based on a two-step procedure: The derived property obtains its value based on a two-step procedure:
1. Characters are placed in one or more character categories either 1. Characters are placed in one or more character categories either
(1) based on core properties defined by the Unicode Standard or (1) based on core properties defined by the Unicode Standard or
(2) by treating the code point as an exception and addressing the (2) by treating the code point as an exception and addressing the
skipping to change at page 22, line 27 skipping to change at page 22, line 40
operations are specified under Section 8. operations are specified under Section 8.
Note: Unicode property names and property value names might have Note: Unicode property names and property value names might have
short abbreviations, such as "gc" for the General_Category short abbreviations, such as "gc" for the General_Category
property and "Ll" for the Lowercase_Letter property value of the property and "Ll" for the Lowercase_Letter property value of the
gc property. gc property.
In the following specification of character categories, the operation In the following specification of character categories, the operation
that returns the value of a particular Unicode character property for that returns the value of a particular Unicode character property for
a code point is designated by using the formal name of that property a code point is designated by using the formal name of that property
(from the Unicode PropertyAliases.txt [1]) followed by '(cp)' for (from the Unicode PropertyAliases.txt file [PropertyAliases]) followed
"code point". For example, the value of the General_Category by "(cp)" for "code point". For example, the value of the
property for a code point is indicated by General_Category(cp). General_Category property for a code point is indicated by
General_Category(cp).
The first ten categories (A-J) shown below were previously defined The first ten categories (A-J) shown below were previously defined
for IDNA2008 and are referenced from [RFC5892] to ease the for IDNA2008 and are referenced from [RFC5892] to ease the
understanding of how PRECIS handles various characters. Some of understanding of how PRECIS handles various characters. Some of
these categories are reused in PRECIS and some of them are not; these categories are reused in PRECIS, and some of them are not;
however, the lettering of categories is retained to prevent overlap however, the lettering of categories is retained to prevent overlap
and to ease implementation of both IDNA2008 and PRECIS in a single and to ease implementation of both IDNA2008 and PRECIS in a single
software application. The next eight categories (K-R) are specific software application. The next eight categories (K-R) are specific
to PRECIS. to PRECIS.
9.1. LetterDigits (A) 9.1. LetterDigits (A)
This category is defined in Section 2.1 of [RFC5892] and is included This category is defined in Section 2.1 of [RFC5892] and is included
by reference for use in PRECIS. by reference for use in PRECIS.
9.2. Unstable (B) 9.2. Unstable (B)
This category is defined in Section 2.2 of [RFC5892]. However, it is This category is defined in Section 2.2 of [RFC5892]. However, it is
not used in PRECIS. not used in PRECIS.
9.3. IgnorableProperties (C) 9.3. IgnorableProperties (C)
This category is defined in Section 2.3 of [RFC5892]. However, it is This category is defined in Section 2.3 of [RFC5892]. However, it is
not used in PRECIS. not used in PRECIS.
Note: See the "PrecisIgnorableProperties (M)" category below for a Note: See the PrecisIgnorableProperties ("M") category below for a
more inclusive category used in PRECIS identifiers. more inclusive category used in PRECIS identifiers.
9.4. IgnorableBlocks (D) 9.4. IgnorableBlocks (D)
This category is defined in Section 2.4 of [RFC5892]. However, it is This category is defined in Section 2.4 of [RFC5892]. However, it is
not used in PRECIS. not used in PRECIS.
9.5. LDH (E) 9.5. LDH (E)
This category is defined in Section 2.5 of [RFC5892]. However, it is This category is defined in Section 2.5 of [RFC5892]. However, it is
not used in PRECIS. not used in PRECIS.
Note: See the "ASCII7 (K)" category below for a more inclusive Note: See the ASCII7 ("K") category below for a more inclusive
category used in PRECIS identifiers. category used in PRECIS identifiers.
9.6. Exceptions (F) 9.6. Exceptions (F)
This category is defined in Section 2.6 of [RFC5892] and is included This category is defined in Section 2.6 of [RFC5892] and is included
by reference for use in PRECIS. by reference for use in PRECIS.
9.7. BackwardCompatible (G) 9.7. BackwardCompatible (G)
This category is defined in Section 2.7 of [RFC5892] and is included This category is defined in Section 2.7 of [RFC5892] and is included
by reference for use in PRECIS. by reference for use in PRECIS.
Note: Management of this category is handled via the processes Note: Management of this category is handled via the processes
specified in [RFC5892]. At the time of this writing (and also at the specified in [RFC5892]. At the time of this writing (and also at the
time that RFC 5892 was published), this category consisted of the time that RFC 5892 was published), this category consisted of the
empty set; however, that is subject to change as described in RFC empty set; however, that is subject to change as described in
5892. RFC 5892.
9.8. JoinControl (H) 9.8. JoinControl (H)
This category is defined in Section 2.8 of [RFC5892] and is included This category is defined in Section 2.8 of [RFC5892] and is included
by reference for use in PRECIS. by reference for use in PRECIS.
9.9. OldHangulJamo (I) 9.9. OldHangulJamo (I)
This category is defined in Section 2.9 of [RFC5892] and is included This category is defined in Section 2.9 of [RFC5892] and is included
by reference for use in PRECIS. by reference for use in PRECIS.
skipping to change at page 24, line 36 skipping to change at page 24, line 46
9.13. PrecisIgnorableProperties (M) 9.13. PrecisIgnorableProperties (M)
This PRECIS-specific category is used to group code points that are This PRECIS-specific category is used to group code points that are
discouraged from use in PRECIS string classes. discouraged from use in PRECIS string classes.
M: Default_Ignorable_Code_Point(cp) = True or M: Default_Ignorable_Code_Point(cp) = True or
Noncharacter_Code_Point(cp) = True Noncharacter_Code_Point(cp) = True
The definition for Default_Ignorable_Code_Point can be found in the The definition for Default_Ignorable_Code_Point can be found in the
DerivedCoreProperties.txt [2] file. DerivedCoreProperties.txt file [DerivedCoreProperties].
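
As a rough illustration of the M category, the following Python sketch tests the two properties named above. Python's unicodedata module does not expose Default_Ignorable_Code_Point, so the sketch assumes the caller supplies that set (for example, parsed from DerivedCoreProperties.txt); the parameter name default_ignorable and both function names are hypothetical. Noncharacter_Code_Point is computed directly, since it is a fixed set of 66 code points.

```python
from typing import AbstractSet


def is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF plus U+nFFFE and U+nFFFF for each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def in_precis_ignorable_properties(
    cp: int, default_ignorable: AbstractSet[int]
) -> bool:
    # M: Default_Ignorable_Code_Point(cp) = True or
    #    Noncharacter_Code_Point(cp) = True
    return cp in default_ignorable or is_noncharacter(cp)


# Illustrative subset only: U+00AD SOFT HYPHEN and U+200B ZERO WIDTH SPACE
# both have Default_Ignorable_Code_Point = True.
SAMPLE_DEFAULT_IGNORABLE = frozenset({0x00AD, 0x200B})
```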
9.14. Spaces (N) 9.14. Spaces (N)
This PRECIS-specific category is used to group code points that are This PRECIS-specific category is used to group code points that are
space characters. space characters.
N: General_Category(cp) is in {Zs} N: General_Category(cp) is in {Zs}
9.15. Symbols (O) 9.15. Symbols (O)
skipping to change at page 25, line 15 skipping to change at page 25, line 29
9.16. Punctuation (P) 9.16. Punctuation (P)
This PRECIS-specific category is used to group code points that are This PRECIS-specific category is used to group code points that are
punctuation characters. punctuation characters.
P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po} P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po}
9.17. HasCompat (Q) 9.17. HasCompat (Q)
This PRECIS-specific category is used to group code points that have This PRECIS-specific category is used to group code points that have
compatibility equivalents as explained in Chapter 2 and Chapter 3 of compatibility equivalents as explained in the Unicode Standard (at
the Unicode Standard [Unicode7.0]. the time of this writing, see Chapters 2 and 3 of [Unicode7.0]).
Q: toNFKC(cp) != cp Q: toNFKC(cp) != cp
The toNFKC() operation returns the code point in normalization form The toNFKC() operation returns the code point in normalization
KC. For more information, see Section 5 of Unicode Standard Annex form KC. For more information, see Section 5 of Unicode Standard
#15 [UAX15]. Annex #15 [UAX15].
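
The Q test can be sketched in Python with the standard unicodedata module, whose normalize() function stands in for toNFKC(). The function name has_compat is hypothetical, and the result reflects the Unicode version bundled with the Python runtime rather than any specific version named in this document.

```python
import unicodedata


def has_compat(cp: int) -> bool:
    # Q: toNFKC(cp) != cp
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch
```

For example, U+FF21 FULLWIDTH LATIN CAPITAL LETTER A normalizes to U+0041 and therefore falls into this category, whereas U+0041 itself does not. Because the comparison is made against the NFKC form, a code point such as U+212B ANGSTROM SIGN, which normalizes to U+00C5, is also matched.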
9.18. OtherLetterDigits (R) 9.18. OtherLetterDigits (R)
This PRECIS-specific category is used to group code points that are This PRECIS-specific category is used to group code points that are
letters and digits other than the "traditional" letters and digits letters and digits other than the "traditional" letters and digits
grouped under the LetterDigits (A) class (see Section 9.1). grouped under the LetterDigits (A) class (see Section 9.1).
R: General_Category(cp) is in {Lt, Nl, No, Me} R: General_Category(cp) is in {Lt, Nl, No, Me}
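
The categories that are defined purely in terms of General_Category (N, P, and R in this excerpt) can be checked with a short Python sketch. The function name categories_for and the constant names are hypothetical; the Symbols (O) definition is not shown in this excerpt and is therefore omitted. Results depend on the Unicode version bundled with the Python runtime.

```python
import unicodedata

SPACES = frozenset({"Zs"})                                            # N
PUNCTUATION = frozenset({"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"})   # P
OTHER_LETTER_DIGITS = frozenset({"Lt", "Nl", "No", "Me"})             # R


def categories_for(cp: int) -> list:
    # Return the letters of the General_Category-based categories
    # (N, P, R) that the code point belongs to.
    gc = unicodedata.category(chr(cp))
    found = []
    if gc in SPACES:
        found.append("N")
    if gc in PUNCTUATION:
        found.append("P")
    if gc in OTHER_LETTER_DIGITS:
        found.append("R")
    return found
```

Since the three General_Category sets are disjoint, the returned list holds at most one letter; a list is used only so that further categories could be added to the same function.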
10. Guidelines for Designated Experts 10. Guidelines for Designated Experts
skipping to change at page 25, line 44 skipping to change at page 26, line 17
Experience with internationalization in application protocols has Experience with internationalization in application protocols has
shown that protocol designers and application developers usually do shown that protocol designers and application developers usually do
not understand the subtleties and tradeoffs involved with not understand the subtleties and tradeoffs involved with
internationalization and that they need considerable guidance in internationalization and that they need considerable guidance in
making reasonable decisions with regard to the options before them. making reasonable decisions with regard to the options before them.
Therefore: Therefore:
o Protocol designers are strongly encouraged to question the o Protocol designers are strongly encouraged to question the
assumption that they need to define new profiles, since existing assumption that they need to define new profiles, since existing
profiles are designed for wide re-use (see Section 5 for further profiles are designed for wide reuse (see Section 5 for further
discussion). discussion).
o Those who persist in defining new profiles are strongly encouraged o Those who persist in defining new profiles are strongly encouraged
to clearly explain a strong justification for doing so, and to to clearly explain a strong justification for doing so, and to
publish a stable specification that provides all of the publish a stable specification that provides all of the
information described under Section 11.3. information described under Section 11.3.
o The designated experts for profile registration requests ought to o The designated experts for profile registration requests ought to
seek answers to all of the questions provided under Section 11.3 seek answers to all of the questions provided under Section 11.3
and to encourage applicants to provide a stable specification and to encourage applicants to provide a stable specification
skipping to change at page 26, line 35 skipping to change at page 27, line 9
understanding to achieve rough consensus on profile registration understanding to achieve rough consensus on profile registration
requests and the use of PRECIS in particular applications. They are requests and the use of PRECIS in particular applications. They are
also encouraged to bring additional expertise into the discussion if also encouraged to bring additional expertise into the discussion if
that would be helpful in adding perspective or otherwise resolving that would be helpful in adding perspective or otherwise resolving
issues. issues.
11. IANA Considerations 11. IANA Considerations
11.1. PRECIS Derived Property Value Registry 11.1. PRECIS Derived Property Value Registry
IANA is requested to create a PRECIS-specific registry with the IANA has created and now maintains the "PRECIS Derived Property
Derived Properties for the versions of Unicode that are released Value" registry that records the derived properties for the versions
after (and including) version 7.0. The derived property value is to of Unicode that are released after (and including) version 7.0. The
be calculated in cooperation with a designated expert [RFC5226] derived property value is to be calculated in cooperation with a
according to the rules specified under Section 8 and Section 9. designated expert [RFC5226] according to the rules specified under
Sections 8 and 9.
The IESG is to be notified if backward-incompatible changes to the The IESG is to be notified if backward-incompatible changes to the
table of derived properties are discovered or if other problems arise table of derived properties are discovered or if other problems arise
during the process of creating the table of derived property values during the process of creating the table of derived property values
or during expert review. Changes to the rules defined under or during expert review. Changes to the rules defined under
Section 8 and Section 9 require IETF Review. Sections 8 and 9 require IETF Review.
11.2. PRECIS Base Classes Registry 11.2. PRECIS Base Classes Registry
IANA is requested to create a registry of PRECIS string classes. In IANA has created the "PRECIS Base Classes" registry. In accordance
accordance with [RFC5226], the registration policy is "RFC Required". with [RFC5226], the registration policy is "RFC Required".
The registration template is as follows: The registration template is as follows:
Base Class: [the name of the PRECIS string class] Base Class: [the name of the PRECIS string class]
Description: [a brief description of the PRECIS string class and its Description: [a brief description of the PRECIS string class and its
intended use, e.g., "A sequence of letters, numbers, and symbols intended use, e.g., "A sequence of letters, numbers, and symbols
that is used to identify or address a network entity."] that is used to identify or address a network entity."]
Specification: [the RFC number] Specification: [the RFC number]
The initial registrations are as follows: The initial registrations are as follows:
Base Class: FreeformClass. Base Class: FreeformClass.
Description: A sequence of letters, numbers, symbols, spaces, and Description: A sequence of letters, numbers, symbols, spaces, and
other code points that is used for free-form strings. other code points that is used for free-form strings.
Specification: Section 4.3 of this document. Specification: Section 4.3 of RFC 7564.
[Note to RFC Editor: please change "this document"
to the RFC number issued for this specification.]
Base Class: IdentifierClass. Base Class: IdentifierClass.
Description: A sequence of letters, numbers, and symbols that is Description: A sequence of letters, numbers, and symbols that is
used to identify or address a network entity. used to identify or address a network entity.
Specification: Section 4.2 of this document. Specification: Section 4.2 of RFC 7564.
[Note to RFC Editor: please change "this document"
to the RFC number issued for this specification.]
11.3. PRECIS Profiles Registry 11.3. PRECIS Profiles Registry
IANA is requested to create a registry of profiles that use the IANA has created the "PRECIS Profiles" registry to identify profiles
PRECIS string classes. In accordance with [RFC5226], the that use the PRECIS string classes. In accordance with [RFC5226],
registration policy is "Expert Review". This policy was chosen in the registration policy is "Expert Review". This policy was chosen
order to ease the burden of registration while ensuring that in order to ease the burden of registration while ensuring that
"customers" of PRECIS receive appropriate guidance regarding the "customers" of PRECIS receive appropriate guidance regarding the
sometimes complex and subtle internationalization issues related to sometimes complex and subtle internationalization issues related to
profiles of PRECIS string classes. profiles of PRECIS string classes.
The registration template is as follows: The registration template is as follows:
Name: [the name of the profile] Name: [the name of the profile]
Base Class: [which PRECIS string class is being profiled] Base Class: [which PRECIS string class is being profiled]
Applicability: [the specific protocol elements to which this profile Applicability: [the specific protocol elements to which this profile
applies, e.g., "Localparts in XMPP addresses."] applies, e.g., "Localparts in XMPP addresses."]
Replaces: [the Stringprep profile that this PRECIS profile replaces, Replaces: [the Stringprep profile that this PRECIS profile replaces,
if any] if any]
Width Mapping Rule: [the behavioral rule for handling of width, Width Mapping Rule: [the behavioral rule for handling of width,
e.g., "Map fullwidth and halfwidth characters to their e.g., "Map fullwidth and halfwidth characters to their
compatibility variants."] compatibility variants."]
Additional Mapping Rule: [any additional mappings are required or Additional Mapping Rule: [any additional mappings that are required
recommended, e.g., "Map non-ASCII space characters to ASCII or recommended, e.g., "Map non-ASCII space characters to ASCII
space."] space."]
Case Mapping Rule: [the behavioral rule for handling of case, e.g., Case Mapping Rule: [the behavioral rule for handling of case, e.g.,
"Unicode Default Case Folding"] "Unicode Default Case Folding"]
Normalization Rule: [which Unicode normalization form is applied, Normalization Rule: [which Unicode normalization form is applied,
e.g., "NFC"] e.g., "NFC"]
Directionality Rule: [the behavioral rule for handling of right-to- Directionality Rule: [the behavioral rule for handling of right-to-
left code points, e.g., "The 'Bidi Rule' defined in RFC 5893 left code points, e.g., "The 'Bidi Rule' defined in RFC 5893
skipping to change at page 28, line 39 skipping to change at page 29, line 12
In order to request a review, the registrant shall send a completed In order to request a review, the registrant shall send a completed
template to the precis@ietf.org list or its designated successor. template to the precis@ietf.org list or its designated successor.
Factors to focus on while defining profiles and reviewing profile Factors to focus on while defining profiles and reviewing profile
registrations include the following: registrations include the following:
o Would an existing PRECIS string class or profile solve the o Would an existing PRECIS string class or profile solve the
problem? If not, why not? (See Section 5.1 for related problem? If not, why not? (See Section 5.1 for related
considerations.) considerations.)
o Is the problem being addressed by this profile well-defined? o Is the problem being addressed by this profile well defined?
o Does the specification define what kinds of applications are o Does the specification define what kinds of applications are
involved and the protocol elements to which this profile applies? involved and the protocol elements to which this profile applies?
o Is the profile clearly defined? o Is the profile clearly defined?
o Is the profile based on an appropriate dividing line between user o Is the profile based on an appropriate dividing line between user
interface (culture, context, intent, locale, device limitations, interface (culture, context, intent, locale, device limitations,
etc.) and the use of conformant strings in protocol elements? etc.) and the use of conformant strings in protocol elements?
skipping to change at page 29, line 25 skipping to change at page 29, line 47
12. Security Considerations 12. Security Considerations
12.1. General Issues 12.1. General Issues
If input strings that appear "the same" to users are programmatically If input strings that appear "the same" to users are programmatically
considered to be distinct in different systems, or if input strings considered to be distinct in different systems, or if input strings
that appear distinct to users are programmatically considered to be that appear distinct to users are programmatically considered to be
"the same" in different systems, then users can be confused. Such "the same" in different systems, then users can be confused. Such
confusion can have security implications, such as the false positives confusion can have security implications, such as the false positives
and false negatieves discussed in [RFC6943]. One starting goal of and false negatives discussed in [RFC6943]. One starting goal of
work on the PRECIS framework was to limit the number of times that work on the PRECIS framework was to limit the number of times that
users are confused (consistent with the "Principle of Least users are confused (consistent with the "Principle of Least
Astonishment"). Unfortunately, this goal has been difficult to Astonishment"). Unfortunately, this goal has been difficult to
achieve given the large number of application protocols already in achieve given the large number of application protocols already in
existence. Despite these difficulties, profiles should not be existence. Despite these difficulties, profiles should not be
multiplied beyond necessity (see Section 5.1. In particular, multiplied beyond necessity (see Section 5.1). In particular,
application protocol designers should think long and hard before application protocol designers should think long and hard before
defining a new profile instead of using one that has already been defining a new profile instead of using one that has already been
defined, and if they decide to define a new profile then they should defined, and if they decide to define a new profile then they should
clearly explain their reasons for doing so. clearly explain their reasons for doing so.
The security of applications that use this framework can depend in The security of applications that use this framework can depend in
part on the proper preparation, enforcement, and comparison of part on the proper preparation, enforcement, and comparison of
internationalized strings. For example, such strings can be used to internationalized strings. For example, such strings can be used to
make authentication and authorization decisions, and the security of make authentication and authorization decisions, and the security of
an application could be compromised if an entity providing a given an application could be compromised if an entity providing a given
skipping to change at page 30, line 10 skipping to change at page 30, line 30
used in the protocol, including the security implications of any used in the protocol, including the security implications of any
false positives and false negatives that might result from various false positives and false negatives that might result from various
enforcement and comparison operations. For some helpful guidelines, enforcement and comparison operations. For some helpful guidelines,
refer to [RFC6943], [RFC5890], [UTR36], and [UTS39]. refer to [RFC6943], [RFC5890], [UTR36], and [UTS39].
12.2. Use of the IdentifierClass 12.2. Use of the IdentifierClass
Strings that conform to the IdentifierClass and any profile thereof Strings that conform to the IdentifierClass and any profile thereof
are intended to be relatively safe for use in a broad range of are intended to be relatively safe for use in a broad range of
applications, primarily because they include only letters, digits, applications, primarily because they include only letters, digits,
and "grandfathered" non-space characters from the ASCII range; thus and "grandfathered" non-space characters from the ASCII range; thus,
they exclude spaces, characters with compatibility equivalents, and they exclude spaces, characters with compatibility equivalents, and
almost all symbols and punctuation marks. However, because such almost all symbols and punctuation marks. However, because such
strings can still include so-called confusable characters (see strings can still include so-called confusable characters (see
Section 12.5), protocol designers and implementers are encouraged to Section 12.5), protocol designers and implementers are encouraged to
pay close attention to the security considerations described pay close attention to the security considerations described
elsewhere in this document. elsewhere in this document.
12.3. Use of the FreeformClass 12.3. Use of the FreeformClass
Strings that conform to the FreeformClass and many profiles thereof Strings that conform to the FreeformClass and many profiles thereof
can include virtually any Unicode character. This makes the can include virtually any Unicode character. This makes the
FreeformClass quite expressive, but also problematic from the FreeformClass quite expressive, but also problematic from the
perspective of possible user confusion. Protocol designers are perspective of possible user confusion. Protocol designers are
hereby warned that the FreeformClass contains codepoints they might hereby warned that the FreeformClass contains code points they might
not understand, and are encouraged to profile the IdentifierClass not understand, and are encouraged to profile the IdentifierClass
wherever feasible; however, if an application protocol requires more wherever feasible; however, if an application protocol requires more
code points than are allowed by the IdentifierClass, protocol code points than are allowed by the IdentifierClass, protocol
designers are encouraged to define a profile of the FreeformClass designers are encouraged to define a profile of the FreeformClass
that restricts the allowable code points as tightly as possible. that restricts the allowable code points as tightly as possible.
(The PRECIS Working Group considered the option of allowing (The PRECIS Working Group considered the option of allowing
"superclasses" as well as profiles of PRECIS string classes, but "superclasses" as well as profiles of PRECIS string classes, but
decided against allowing superclasses to reduce the likelihood of decided against allowing superclasses to reduce the likelihood of
security and interoperability problems.) security and interoperability problems.)
12.4. Local Character Set Issues 12.4. Local Character Set Issues
When systems use local character sets other than ASCII and Unicode, When systems use local character sets other than ASCII and Unicode,
this specification leaves the problem of converting between the local this specification leaves the problem of converting between the local
character set and Unicode up to the application or local system. If character set and Unicode up to the application or local system. If
skipping to change at page 31, line 26 skipping to change at page 31, line 50
Cherokee block look similar to the ASCII characters "STPETER" as they Cherokee block look similar to the ASCII characters "STPETER" as they
might appear when presented using a "creative" font family. might appear when presented using a "creative" font family.
In some examples of confusable characters, it is unlikely that the In some examples of confusable characters, it is unlikely that the
average human could tell the difference between the real string and average human could tell the difference between the real string and
the fake string. (Indeed, there is no programmatic way to the fake string. (Indeed, there is no programmatic way to
distinguish with full certainty which is the fake string and which is distinguish with full certainty which is the fake string and which is
the real string; in some contexts, the string formed of Cherokee the real string; in some contexts, the string formed of Cherokee
characters might be the real string and the string formed of ASCII characters might be the real string and the string formed of ASCII
characters might be the fake string.) Because PRECIS-compliant characters might be the fake string.) Because PRECIS-compliant
strings can contain almost any properly-encoded Unicode code point, strings can contain almost any properly encoded Unicode code point,
it can be relatively easy to fake or mimic some strings in systems it can be relatively easy to fake or mimic some strings in systems
that use the PRECIS framework. The fact that some strings are easily that use the PRECIS framework. The fact that some strings are easily
confused introduces security vulnerabilities of the kind that have confused introduces security vulnerabilities of the kind that have
also plagued the World Wide Web, specifically the phenomenon known as also plagued the World Wide Web, specifically the phenomenon known as
phishing. phishing.
Despite the fact that some specific suggestions about identification Despite the fact that some specific suggestions about identification
and handling of confusable characters appear in the Unicode Security and handling of confusable characters appear in the Unicode Security
Considerations [UTR36] and the Unicode Security Mechanisms [UTS39], Considerations [UTR36] and the Unicode Security Mechanisms [UTS39],
it is also true (as noted in [RFC5890]) that "there are no it is also true (as noted in [RFC5890]) that "there are no
comprehensive technical solutions to the problems of confusable comprehensive technical solutions to the problems of confusable
characters". Because it is impossible to map visually similar characters." Because it is impossible to map visually similar
characters without a great deal of context (such as knowing the font characters without a great deal of context (such as knowing the font
families used), the PRECIS framework does nothing to map similar- families used), the PRECIS framework does nothing to map similar-
looking characters together, nor does it prohibit some characters looking characters together, nor does it prohibit some characters
because they look like others. because they look like others.
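The point that the framework does not fold look-alike characters together can be illustrated with a minimal Python sketch. The helper name `normalize_for_compare` and the sample strings are hypothetical, and NFKC plus case folding is only an assumed stand-in for a generic comparison step, not the rules of any particular PRECIS profile.

```python
import unicodedata

def normalize_for_compare(s):
    # Assumption: NFKC plus case folding stands in for a generic
    # comparison step; it is not a PRECIS profile.
    return unicodedata.normalize("NFKC", s).casefold()

latin = "admin"
lookalike = "\u0430dmin"  # first letter is U+0430 CYRILLIC SMALL LETTER A

# Normalization leaves the two strings distinct even though many
# fonts render them identically.
still_distinct = normalize_for_compare(latin) != normalize_for_compare(lookalike)
print(still_distinct, unicodedata.name(lookalike[0]))
```

Because the strings stay distinct after normalization, any defense against such look-alikes has to come from registration and presentation policy rather than from string preparation.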
Nevertheless, specifications for application protocols that use this Nevertheless, specifications for application protocols that use this
framework are strongly encouraged to describe how confusable framework are strongly encouraged to describe how confusable
characters can be abused to compromise the security of systems that characters can be abused to compromise the security of systems that
use the protocol in question, along with any protocol-specific use the protocol in question, along with any protocol-specific
suggestions for overcoming those threats. In particular, software suggestions for overcoming those threats. In particular, software
implementations and service deployments that use PRECIS-based implementations and service deployments that use PRECIS-based
technologies are strongly encouraged to define and implement technologies are strongly encouraged to define and implement
consistent policies regarding the registration, storage, and consistent policies regarding the registration, storage, and
presentation of visually similar characters. The following presentation of visually similar characters. The following
recommendations are appropriate: recommendations are appropriate:
1. An application service SHOULD define a policy that specifies the 1. An application service SHOULD define a policy that specifies the
scripts or blocks of characters that the service will allow to be scripts or blocks of characters that the service will allow to be
registered (e.g., in an account name) or stored (e.g., in a file registered (e.g., in an account name) or stored (e.g., in a
name). Such a policy SHOULD be informed by the languages and filename). Such a policy SHOULD be informed by the languages and
scripts that are used to write registered account names; in scripts that are used to write registered account names; in
particular, to reduce confusion, the service SHOULD forbid particular, to reduce confusion, the service SHOULD forbid
registration or storage of strings that contain characters from registration or storage of strings that contain characters from
more than one script and SHOULD restrict registrations to more than one script and SHOULD restrict registrations to
characters drawn from a very small number of scripts (e.g., characters drawn from a very small number of scripts (e.g.,
scripts that are well-understood by the administrators of the scripts that are well understood by the administrators of the
service, to improve manageability). service, to improve manageability).
2. User-oriented application software SHOULD define a policy that 2. User-oriented application software SHOULD define a policy that
specifies how internationalized strings will be presented to a specifies how internationalized strings will be presented to a
human user. Because every human user of such software has a human user. Because every human user of such software has a
preferred language or a small set of preferred languages, the preferred language or a small set of preferred languages, the
software SHOULD gather that information either explicitly from software SHOULD gather that information either explicitly from
the user or implicitly via the operating system of the user's the user or implicitly via the operating system of the user's
device. Furthermore, because most languages are typically device. Furthermore, because most languages are typically
represented by a single script or a small set of scripts, and represented by a single script or a small set of scripts, and
skipping to change at page 32, line 40 skipping to change at page 33, line 15
script or block, or that uses characters outside the normal range script or block, or that uses characters outside the normal range
of the user's preferred language(s). (Such a recommendation is of the user's preferred language(s). (Such a recommendation is
not intended to discourage communication across different not intended to discourage communication across different
communities of language users; instead, it recognizes the communities of language users; instead, it recognizes the
existence of such communities and encourages due caution when existence of such communities and encourages due caution when
presenting unfamiliar scripts or characters to human users.) presenting unfamiliar scripts or characters to human users.)
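The single-script recommendation above might be approximated as in the following Python sketch. The function names are hypothetical. The standard library does not expose the Unicode Script property, so the sketch assumes that the first word of each letter's character name (for example "LATIN" or "CYRILLIC") is a usable hint. A production policy would more likely rely on the Script property data described in [UTS39].

```python
import unicodedata

def script_hints(text):
    """Collect a rough script hint for every letter in text."""
    hints = set()
    for ch in text:
        # Only letters are inspected; digits and punctuation are
        # shared across scripts and would produce false alarms.
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            if name:
                # Assumption: the first word of the character name
                # approximates the script (heuristic only).
                hints.add(name.split()[0])
    return hints

def is_single_script(text):
    """True when all letters in text share one script hint."""
    return len(script_hints(text)) <= 1

print(is_single_script("peter"))
print(is_single_script("p\u0435ter"))  # U+0435 CYRILLIC SMALL LETTER IE
```

A service could run such a check at registration time and reject or flag strings for which it returns False.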
The challenges inherent in supporting the full range of Unicode code The challenges inherent in supporting the full range of Unicode code
points have in the past led some to hope for a way to points have in the past led some to hope for a way to
programmatically negotiate more restrictive ranges based on locale, programmatically negotiate more restrictive ranges based on locale,
script, or other relevant factors, to tag the locale associated with script, or other relevant factors; to tag the locale associated with
a particular string, etc. As a general-purpose internationalization a particular string; etc. As a general-purpose internationalization
technology, the PRECIS framework does not include such mechanisms. technology, the PRECIS framework does not include such mechanisms.
12.6. Security of Passwords 12.6. Security of Passwords
Two goals of passwords are to maximize the amount of entropy and to Two goals of passwords are to maximize the amount of entropy and to
minimize the potential for false positives. These goals can be minimize the potential for false positives. These goals can be
achieved in part by allowing a wide range of code points and by achieved in part by allowing a wide range of code points and by
ensuring that passwords are handled in such a way that code points ensuring that passwords are handled in such a way that code points
are not compared aggressively. Therefore, it is NOT RECOMMENDED for are not compared aggressively. Therefore, it is NOT RECOMMENDED for
application protocols to profile the FreeformClass for use in application protocols to profile the FreeformClass for use in
skipping to change at page 33, line 22 skipping to change at page 33, line 45
tradeoffs between entropy and usability. For example, allowing a tradeoffs between entropy and usability. For example, allowing a
user to establish a password containing "uncommon" code points might user to establish a password containing "uncommon" code points might
make it difficult for the user to access a service when using an make it difficult for the user to access a service when using an
unfamiliar or constrained input device. unfamiliar or constrained input device.
Some application protocols use passwords directly, whereas others Some application protocols use passwords directly, whereas others
reuse technologies that themselves process passwords (one example of reuse technologies that themselves process passwords (one example of
such a technology is the Simple Authentication and Security Layer such a technology is the Simple Authentication and Security Layer
[RFC4422]). Moreover, passwords are often carried by a sequence of [RFC4422]). Moreover, passwords are often carried by a sequence of
protocols with backend authentication systems or data storage systems protocols with backend authentication systems or data storage systems
such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of such as RADIUS [RFC2865] and the Lightweight Directory Access
application protocols are encouraged to look into reusing these Protocol (LDAP) [RFC4510]. Developers of application protocols are
profiles instead of defining new ones, so that end-user expectations encouraged to look into reusing these profiles instead of defining
about passwords are consistent no matter which application protocol new ones, so that end-user expectations about passwords are
is used. consistent no matter which application protocol is used.
In protocols that provide passwords as input to a cryptographic In protocols that provide passwords as input to a cryptographic
algorithm such as a hash function, the client will need to perform algorithm such as a hash function, the client will need to perform
proper preparation of the password before applying the algorithm, proper preparation of the password before applying the algorithm,
since the password is not available to the server in plaintext form. since the password is not available to the server in plaintext form.
Further discussion of password handling can be found in Further discussion of password handling can be found in
[I-D.ietf-precis-saslprepbis]. [PRECIS-Users-Pwds].
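The need for client-side preparation before hashing can be shown with a short Python sketch. The function names are hypothetical, NFC normalization is only an assumed placeholder for whatever preparation a given password profile requires, and a bare SHA-256 digest is used purely to make the effect visible; it is not a suggested way to store passwords.

```python
import hashlib
import unicodedata

def prepare_password(password):
    # Assumption: NFC normalization stands in for the preparation
    # rules of the password profile actually in use.
    return unicodedata.normalize("NFC", password)

def password_digest(password):
    # The client prepares the string first, then hashes its UTF-8 form.
    prepared = prepare_password(password)
    return hashlib.sha256(prepared.encode("utf-8")).hexdigest()

composed = "caf\u00e9"     # e with acute as a single code point
decomposed = "cafe\u0301"  # e followed by a combining acute accent

# The same visible password yields the same digest once prepared.
print(password_digest(composed) == password_digest(decomposed))
```

Without the preparation step, the two input forms would hash to different values, and the server, which sees only the digest, could not reconcile them.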
13. Interoperability Considerations 13. Interoperability Considerations
13.1. Encoding 13.1. Encoding
Although strings that are consumed in PRECIS-based application Although strings that are consumed in PRECIS-based application
protocols are often encoded using UTF-8 [RFC3629], the exact encoding protocols are often encoded using UTF-8 [RFC3629], the exact encoding
is a matter for the application protocol that uses PRECIS, not for is a matter for the application protocol that uses PRECIS, not for
the PRECIS framework. the PRECIS framework.
skipping to change at page 34, line 24 skipping to change at page 34, line 49
and Unicode 6.0, as described in [RFC6452]. Implementers might need and Unicode 6.0, as described in [RFC6452]. Implementers might need
to be aware that the treatment of these characters differs depending to be aware that the treatment of these characters differs depending
on which version of Unicode is available on the system that is using on which version of Unicode is available on the system that is using
IDNA2008 or PRECIS. Other such differences might arise between the IDNA2008 or PRECIS. Other such differences might arise between the
version of Unicode current at the time of this writing (7.0) and version of Unicode current at the time of this writing (7.0) and
future versions. future versions.
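Since behavior can depend on the Unicode version available on a system, an implementation might record that version alongside its results. The following Python sketch uses hypothetical function names and assumes that the runtime's bundled `unicodedata` module is the only source of character properties.

```python
import unicodedata

def unicode_version():
    """Return the runtime's Unicode database version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))

def is_assigned(ch):
    # General category "Cn" marks code points that are unassigned
    # (or noncharacters) in this runtime's Unicode database.
    return unicodedata.category(ch) != "Cn"

print(unicode_version())
print(is_assigned("A"), is_assigned("\uffff"))
```

Two systems that report different versions may classify the same code point differently, which is the kind of discrepancy this section warns about.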
13.4. Potential Changes to Handling of Certain Unicode Code Points 13.4. Potential Changes to Handling of Certain Unicode Code Points
As part of the review of Unicode 7.0 for IDNA, a question was raised As part of the review of Unicode 7.0 for IDNA, a question was raised
about a newly-added code point that led to a re-analysis of the about a newly added code point that led to a re-analysis of the
Normalization Rules used by IDNA and inherited by this document normalization rules used by IDNA and inherited by this document
(Section 5.2.4). Some of the general issues are described in (Section 5.2.4). Some of the general issues are described in
[IAB-Statement] and pursued in more detail in [IAB-Statement] and pursued in more detail in [IDNA-Unicode].
[I-D.klensin-idna-5892upd-unicode70].
At the time of writing, these issues have yet to be settled. At the time of writing, these issues have yet to be settled.
However, implementers need to be aware that this specification is However, implementers need to be aware that this specification is
likely to be updated in the future to address these issues. The likely to be updated in the future to address these issues. The
potential changes include: potential changes include the following:
o The range of characters in the LetterDigits category o The range of characters in the LetterDigits category
(Section 4.2.1 and Section 9.1) might be narrowed. (Sections 4.2.1 and 9.1) might be narrowed.
o Some characters with special properties that are now allowed might o Some characters with special properties that are now allowed might
be excluded. be excluded.
o More "Additional Mapping Rules" (Section 5.2.2) might be defined. o More "Additional Mapping Rules" (Section 5.2.2) might be defined.
o Alternative normalization methods might be added. o Alternative normalization methods might be added.
Nevertheless, implementations and deployments that are sensitive to Nevertheless, implementations and deployments that are sensitive to
the advice given in this specification are unlikely to run into the advice given in this specification are unlikely to encounter
significant problems as a consequence of these issues or potential significant problems as a consequence of these issues or potential
changes - specifically the advice to use the more restrictive changes -- specifically, the advice to use the more restrictive
IdentifierClass whenever possible, or if using the FreeformClass to IdentifierClass whenever possible or, if using the FreeformClass, to
allow only a restricted set of characters, particularly avoiding allow only a restricted set of characters, particularly avoiding
characters whose implications they do not actually understand. characters whose implications they do not actually understand.
14. References 14. References
14.1. Normative References 14.1. Normative References
[RFC20] Cerf, V., "ASCII format for network interchange", RFC 20, [RFC20] Cerf, V., "ASCII format for network interchange", STD 80,
October 1969. RFC 20, DOI 10.17487/RFC0020, October 1969,
<http://www.rfc-editor.org/info/rfc20>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>.
[RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network
Interchange", RFC 5198, March 2008. Interchange", RFC 5198, DOI 10.17487/RFC5198, March 2008,
<http://www.rfc-editor.org/info/rfc5198>.
[RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF", BCP 166, RFC 6365,
DOI 10.17487/RFC6365, September 2011,
<http://www.rfc-editor.org/info/rfc6365>.
[Unicode] The Unicode Consortium, "The Unicode Standard",
<http://www.unicode.org/versions/latest/>.
[Unicode7.0] [Unicode7.0]
The Unicode Consortium, "The Unicode Standard, Version The Unicode Consortium, "The Unicode Standard, Version
7.0.0", 2014, 7.0.0", (Mountain View, CA: The Unicode Consortium, 2014
ISBN 978-1-936213-09-2),
<http://www.unicode.org/versions/Unicode7.0.0/>. <http://www.unicode.org/versions/Unicode7.0.0/>.
14.2. Informative References 14.2. Informative References
[DerivedCoreProperties]
The Unicode Consortium, "DerivedCoreProperties-7.0.0.txt",
Unicode Character Database, February 2014,
<http://www.unicode.org/Public/UCD/latest/ucd/
DerivedCoreProperties.txt>.
[IAB-Statement] [IAB-Statement]
Internet Architecture Board, "IAB Statement on Identifiers Internet Architecture Board, "IAB Statement on Identifiers
and Unicode 7.0.0", January 2015, <https://www.iab.org/ and Unicode 7.0.0", February 2015, <https://www.iab.org/
documents/correspondence-reports-documents/2015-2/iab- documents/correspondence-reports-documents/
statement-on-identifiers-and-unicode-7-0-0/>. 2015-2/iab-statement-on-identifiers-and-unicode-7-0-0/>.
[I-D.ietf-precis-mappings] [IDNA-Unicode]
Yoneya, Y. and T. NEMOTO, "Mapping characters for PRECIS Klensin, J. and P. Faltstrom, "IDNA Update for Unicode
classes", draft-ietf-precis-mappings-08 (work in 7.0.0", Work in Progress,
progress), June 2014. draft-klensin-idna-5892upd-unicode70-04, March 2015.
[I-D.ietf-precis-nickname] [PRECIS-Mappings]
Saint-Andre, P., "Preparation and Comparison of Yoneya, Y. and T. Nemoto, "Mapping characters for PRECIS
Nicknames", draft-ietf-precis-nickname-14 (work in classes", Work in Progress, draft-ietf-precis-mappings-10,
progress), December 2014. May 2015.
[I-D.ietf-precis-saslprepbis] [PRECIS-Nickname]
Saint-Andre, P. and A. Melnikov, "Username and Password Saint-Andre, P., "Preparation, Enforcement, and Comparison
Preparation Algorithms", draft-ietf-precis-saslprepbis-13 of Internationalized Strings Representing Nicknames", Work
(work in progress), December 2014. in Progress, draft-ietf-precis-nickname-17, April 2015.
[I-D.ietf-xmpp-6122bis] [PRECIS-Users-Pwds]
Saint-Andre, P., "Extensible Messaging and Presence Saint-Andre, P. and A. Melnikov, "Preparation,
Protocol (XMPP): Address Format", draft-ietf-xmpp- Enforcement, and Comparison of Internationalized Strings
6122bis-18 (work in progress), December 2014. Representing Usernames and Passwords", Work in Progress,
draft-ietf-precis-saslprepbis-17, May 2015.
[I-D.klensin-idna-5892upd-unicode70] [PropertyAliases]
Klensin, J. and P. Faeltstroem, "IDNA Update for Unicode The Unicode Consortium, "PropertyAliases-7.0.0.txt",
7.0.0", draft-klensin-idna-5892upd-unicode70-03 (work in Unicode Character Database, November 2013,
progress), January 2015. <http://www.unicode.org/Public/UCD/latest/ucd/
PropertyAliases.txt>.
[RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson, [RFC2865] Rigney, C., Willens, S., Rubens, A., and W. Simpson,
"Remote Authentication Dial In User Service (RADIUS)", RFC "Remote Authentication Dial In User Service (RADIUS)",
2865, June 2000. RFC 2865, DOI 10.17487/RFC2865, June 2000,
<http://www.rfc-editor.org/info/rfc2865>.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454, Internationalized Strings ("stringprep")", RFC 3454,
December 2002. DOI 10.17487/RFC3454, December 2002,
<http://www.rfc-editor.org/info/rfc3454>.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)", "Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003. RFC 3490, DOI 10.17487/RFC3490, March 2003,
<http://www.rfc-editor.org/info/rfc3490>.
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)", RFC Profile for Internationalized Domain Names (IDN)",
3491, March 2003. RFC 3491, DOI 10.17487/RFC3491, March 2003,
<http://www.rfc-editor.org/info/rfc3491>.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, November 2003. 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
2003, <http://www.rfc-editor.org/info/rfc3629>.
[RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and [RFC4422] Melnikov, A., Ed., and K. Zeilenga, Ed., "Simple
Security Layer (SASL)", RFC 4422, June 2006. Authentication and Security Layer (SASL)", RFC 4422,
DOI 10.17487/RFC4422, June 2006,
<http://www.rfc-editor.org/info/rfc4422>.
[RFC4510] Zeilenga, K., "Lightweight Directory Access Protocol [RFC4510] Zeilenga, K., Ed., "Lightweight Directory Access Protocol
(LDAP): Technical Specification Road Map", RFC 4510, June (LDAP): Technical Specification Road Map", RFC 4510,
2006. DOI 10.17487/RFC4510, June 2006,
<http://www.rfc-editor.org/info/rfc4510>.
[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
Recommendations for Internationalized Domain Names Recommendations for Internationalized Domain Names
(IDNs)", RFC 4690, September 2006. (IDNs)", RFC 4690, DOI 10.17487/RFC4690, September 2006,
<http://www.rfc-editor.org/info/rfc4690>.
[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", BCP 26, RFC 5226, IANA Considerations Section in RFCs", BCP 26, RFC 5226,
May 2008. DOI 10.17487/RFC5226, May 2008,
<http://www.rfc-editor.org/info/rfc5226>.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax [RFC5234] Crocker, D., Ed., and P. Overell, "Augmented BNF for
Specifications: ABNF", STD 68, RFC 5234, January 2008. Syntax Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<http://www.rfc-editor.org/info/rfc5234>.
[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security
(TLS) Protocol Version 1.2", RFC 5246, August 2008. (TLS) Protocol Version 1.2", RFC 5246,
DOI 10.17487/RFC5246, August 2008,
<http://www.rfc-editor.org/info/rfc5246>.
[RFC5890] Klensin, J., "Internationalized Domain Names for [RFC5890] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Definitions and Document Framework", Applications (IDNA): Definitions and Document Framework",
RFC 5890, August 2010. RFC 5890, DOI 10.17487/RFC5890, August 2010,
<http://www.rfc-editor.org/info/rfc5890>.
[RFC5891] Klensin, J., "Internationalized Domain Names in [RFC5891] Klensin, J., "Internationalized Domain Names in
Applications (IDNA): Protocol", RFC 5891, August 2010. Applications (IDNA): Protocol", RFC 5891,
DOI 10.17487/RFC5891, August 2010,
<http://www.rfc-editor.org/info/rfc5891>.
[RFC5892] Faltstrom, P., "The Unicode Code Points and [RFC5892] Faltstrom, P., Ed., "The Unicode Code Points and
Internationalized Domain Names for Applications (IDNA)", Internationalized Domain Names for Applications (IDNA)",
RFC 5892, August 2010. RFC 5892, DOI 10.17487/RFC5892, August 2010,
<http://www.rfc-editor.org/info/rfc5892>.
[RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for [RFC5893] Alvestrand, H., Ed., and C. Karp, "Right-to-Left Scripts
Internationalized Domain Names for Applications (IDNA)", for Internationalized Domain Names for Applications
RFC 5893, August 2010. (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010,
<http://www.rfc-editor.org/info/rfc5893>.
[RFC5894] Klensin, J., "Internationalized Domain Names for [RFC5894] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Background, Explanation, and Applications (IDNA): Background, Explanation, and
Rationale", RFC 5894, August 2010. Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,
<http://www.rfc-editor.org/info/rfc5894>.
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
Internationalized Domain Names in Applications (IDNA) Internationalized Domain Names in Applications (IDNA)
2008", RFC 5895, September 2010. 2008", RFC 5895, DOI 10.17487/RFC5895, September 2010,
<http://www.rfc-editor.org/info/rfc5895>.
[RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF", BCP 166, RFC 6365,
September 2011.
[RFC6452] Faltstrom, P. and P. Hoffman, "The Unicode Code Points and [RFC6452] Faltstrom, P., Ed., and P. Hoffman, Ed., "The Unicode Code
Internationalized Domain Names for Applications (IDNA) - Points and Internationalized Domain Names for Applications
Unicode 6.0", RFC 6452, November 2011. (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
November 2011, <http://www.rfc-editor.org/info/rfc6452>.
[RFC6885] Blanchet, M. and A. Sullivan, "Stringprep Revision and [RFC6885] Blanchet, M. and A. Sullivan, "Stringprep Revision and
Problem Statement for the Preparation and Comparison of Problem Statement for the Preparation and Comparison of
Internationalized Strings (PRECIS)", RFC 6885, March 2013. Internationalized Strings (PRECIS)", RFC 6885,
DOI 10.17487/RFC6885, March 2013,
[RFC6943] Thaler, D., "Issues in Identifier Comparison for Security <http://www.rfc-editor.org/info/rfc6885>.
Purposes", RFC 6943, May 2013.
[UAX9] The Unicode Consortium, "Unicode Standard Annex #9: [RFC6943] Thaler, D., Ed., "Issues in Identifier Comparison for
Unicode Bidirectional Algorithm", September 2012, Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May
<http://unicode.org/reports/tr9/>. 2013, <http://www.rfc-editor.org/info/rfc6943>.
[UAX11] The Unicode Consortium, "Unicode Standard Annex #11: East [UAX11] Unicode Standard Annex #11, "East Asian Width", edited by
Asian Width", September 2012, Ken Lunde. An integral part of The Unicode Standard,
<http://unicode.org/reports/tr11/>. <http://unicode.org/reports/tr11/>.
[UAX15] The Unicode Consortium, "Unicode Standard Annex #15: [UAX15] Unicode Standard Annex #15, "Unicode Normalization Forms",
Unicode Normalization Forms", August 2012, edited by Mark Davis and Ken Whistler. An integral part of
<http://unicode.org/reports/tr15/>. The Unicode Standard, <http://unicode.org/reports/tr15/>.
[UnicodeCurrent] [UAX9] Unicode Standard Annex #9, "Unicode Bidirectional
The Unicode Consortium, "The Unicode Standard", Algorithm", edited by Mark Davis, Aharon Lanin, and Andrew
2014-present, <http://www.unicode.org/versions/latest/>. Glass. An integral part of The Unicode Standard,
<http://unicode.org/reports/tr9/>.
[UTR36] The Unicode Consortium, "Unicode Technical Report #36: [UTR36] Unicode Technical Report #36, "Unicode Security
Unicode Security Considerations", July 2012, Considerations", by Mark Davis and Michel Suignard,
<http://unicode.org/reports/tr36/>. <http://unicode.org/reports/tr36/>.
[UTS39] The Unicode Consortium, "Unicode Technical Standard #39: [UTS39] Unicode Technical Standard #39, "Unicode Security
Unicode Security Mechanisms", July 2012, Mechanisms", edited by Mark Davis and Michel Suignard,
<http://unicode.org/reports/tr39/>. <http://unicode.org/reports/tr39/>.
14.3. URIs [XMPP-Addr-Format]
Saint-Andre, P., "Extensible Messaging and Presence
[1] http://unicode.org/Public/UNIDATA/PropertyAliases.txt Protocol (XMPP): Address Format", Work in Progress,
draft-ietf-xmpp-6122bis-22, May 2015.
[2] http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt
Appendix A. Acknowledgements Acknowledgements
The authors would like to acknowledge the comments and contributions The authors would like to acknowledge the comments and contributions
of the following individuals during working group discussion: David of the following individuals during working group discussion: David
Black, Edward Burns, Dan Chiba, Mark Davis, Alan DeKok, Martin Black, Edward Burns, Dan Chiba, Mark Davis, Alan DeKok, Martin
Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern
Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John
Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker, Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker,
Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and
Florian Zeitz. Florian Zeitz.
skipping to change at page 38, line 46 skipping to change at page 40, line 28
Charlie Kaufman, Tom Taylor, and Tim Wicinski reviewed the document Charlie Kaufman, Tom Taylor, and Tim Wicinski reviewed the document
on behalf of the Security Directorate, the General Area Review Team, on behalf of the Security Directorate, the General Area Review Team,
and the Operations and Management Directorate, respectively. and the Operations and Management Directorate, respectively.
During IESG review, Alissa Cooper, Stephen Farrell, and Barry Leiba During IESG review, Alissa Cooper, Stephen Farrell, and Barry Leiba
provided comments that led to further improvements. provided comments that led to further improvements.
Some algorithms and textual descriptions have been borrowed from Some algorithms and textual descriptions have been borrowed from
[RFC5892]. Some text regarding security has been borrowed from [RFC5892]. Some text regarding security has been borrowed from
[RFC5890], [I-D.ietf-precis-saslprepbis], and [RFC5890], [PRECIS-Users-Pwds], and [XMPP-Addr-Format].
[I-D.ietf-xmpp-6122bis].
Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for
employing him during his work on earlier versions of this document. employing him during his work on earlier draft versions of this
document.
Authors' Addresses Authors' Addresses
Peter Saint-Andre Peter Saint-Andre
&yet &yet
Email: peter@andyet.com EMail: peter@andyet.com
URI: https://andyet.com/ URI: https://andyet.com/
Marc Blanchet Marc Blanchet
Viagenie Viagenie
246 Aberdeen 246 Aberdeen
Quebec, QC G1R 2E1 Quebec, QC G1R 2E1
Canada Canada
Email: Marc.Blanchet@viagenie.ca EMail: Marc.Blanchet@viagenie.ca
URI: http://www.viagenie.ca/ URI: http://www.viagenie.ca/