draft-ietf-precis-framework-00.txt   draft-ietf-precis-framework-01.txt 
Network Working Group M. Blanchet Network Working Group M. Blanchet
Internet-Draft Viagenie Internet-Draft Viagenie
Obsoletes: 3454 (if approved) P. Saint-Andre Obsoletes: 3454 (if approved) P. Saint-Andre
Intended status: Standards Track Cisco Intended status: Standards Track Cisco
Expires: February 23, 2012 August 22, 2011 Expires: May 2, 2012 October 30, 2011
PRECIS Framework: Handling Internationalized Strings in Protocols PRECIS Framework: Handling Internationalized Strings in Protocols
draft-ietf-precis-framework-00 draft-ietf-precis-framework-01
Abstract Abstract
Application protocols that make use of Unicode code points in Application protocols that make use of Unicode code points in
protocol strings need to prepare such strings in order to perform protocol strings need to prepare such strings in order to perform
comparison operations (e.g., for purposes of authentication or comparison operations (e.g., for purposes of authentication or
authorization). In general, this problem has been labeled the authorization). In general, this problem has been labeled the
"preparation and comparison of internationalized strings" or "preparation and comparison of internationalized strings" or
"PRECIS". This document defines a framework that enables application "PRECIS". This document defines a framework that enables application
protocols to prepare various classes of strings in a way that depends protocols to prepare various classes of strings in a way that depends
skipping to change at page 2, line 4 skipping to change at page 2, line 4
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 23, 2012. This Internet-Draft will expire on May 2, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 11 skipping to change at page 3, line 11
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6
3. String Classes . . . . . . . . . . . . . . . . . . . . . . . . 6 3. String Classes . . . . . . . . . . . . . . . . . . . . . . . . 6
3.1. NameClass . . . . . . . . . . . . . . . . . . . . . . . . 7 3.1. NameClass . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 8 3.1.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1.2. Disallowed . . . . . . . . . . . . . . . . . . . . . . 8 3.1.2. Disallowed . . . . . . . . . . . . . . . . . . . . . . 8
3.1.3. Unassigned . . . . . . . . . . . . . . . . . . . . . . 8 3.1.3. Unassigned . . . . . . . . . . . . . . . . . . . . . . 8
3.1.4. Directionality . . . . . . . . . . . . . . . . . . . . 8 3.1.4. Directionality . . . . . . . . . . . . . . . . . . . . 8
3.1.5. Case Mapping . . . . . . . . . . . . . . . . . . . . . 8 3.1.5. Case Mapping . . . . . . . . . . . . . . . . . . . . . 8
3.1.6. Normalization . . . . . . . . . . . . . . . . . . . . 8 3.1.6. Normalization . . . . . . . . . . . . . . . . . . . . 8
3.2. SecretClass . . . . . . . . . . . . . . . . . . . . . . . 9 3.2. FreeClass . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 9 3.2.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.2. Disallowed . . . . . . . . . . . . . . . . . . . . . . 9 3.2.2. Disallowed . . . . . . . . . . . . . . . . . . . . . . 9
3.2.3. Unassigned . . . . . . . . . . . . . . . . . . . . . . 9 3.2.3. Unassigned . . . . . . . . . . . . . . . . . . . . . . 9
3.2.4. Directionality . . . . . . . . . . . . . . . . . . . . 9 3.2.4. Directionality . . . . . . . . . . . . . . . . . . . . 9
3.2.5. Case Mapping . . . . . . . . . . . . . . . . . . . . . 9 3.2.5. Case Mapping . . . . . . . . . . . . . . . . . . . . . 9
3.2.6. Normalization . . . . . . . . . . . . . . . . . . . . 10 3.2.6. Normalization . . . . . . . . . . . . . . . . . . . . 10
3.3. FreeClass . . . . . . . . . . . . . . . . . . . . . . . . 10 4. Use of PRECIS String Classes . . . . . . . . . . . . . . . . . 10
3.3.1. Valid . . . . . . . . . . . . . . . . . . . . . . . . 10 4.1. Principles . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.2. Disallowed . . . . . . . . . . . . . . . . . . . . . . 10 4.2. Subclassing . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.3. Unassigned . . . . . . . . . . . . . . . . . . . . . . 10 4.3. Registration . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.4. Directionality . . . . . . . . . . . . . . . . . . . . 10 5. Code Point Properties . . . . . . . . . . . . . . . . . . . . 11
3.3.5. Case Mapping . . . . . . . . . . . . . . . . . . . . . 11
3.3.6. Normalization . . . . . . . . . . . . . . . . . . . . 11
4. Use of PRECIS String Classes . . . . . . . . . . . . . . . . . 11
4.1. Principles . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2. Subclassing . . . . . . . . . . . . . . . . . . . . . . . 11
4.3. Registration . . . . . . . . . . . . . . . . . . . . . . . 12
5. Code Point Properties . . . . . . . . . . . . . . . . . . . . 12
6. Category Definitions Used to Calculate Derived Property 6. Category Definitions Used to Calculate Derived Property
Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.1. LetterDigits (A) . . . . . . . . . . . . . . . . . . . . . 14 6.1. LetterDigits (A) . . . . . . . . . . . . . . . . . . . . . 13
6.2. Unstable (B) . . . . . . . . . . . . . . . . . . . . . . . 15 6.2. Unstable (B) . . . . . . . . . . . . . . . . . . . . . . . 13
6.3. IgnorableProperties (C) . . . . . . . . . . . . . . . . . 15 6.3. IgnorableProperties (C) . . . . . . . . . . . . . . . . . 14
6.4. IgnorableBlocks (D) . . . . . . . . . . . . . . . . . . . 15 6.4. IgnorableBlocks (D) . . . . . . . . . . . . . . . . . . . 14
6.5. LDH (E) . . . . . . . . . . . . . . . . . . . . . . . . . 15 6.5. LDH (E) . . . . . . . . . . . . . . . . . . . . . . . . . 14
6.6. Exceptions (F) . . . . . . . . . . . . . . . . . . . . . . 15 6.6. Exceptions (F) . . . . . . . . . . . . . . . . . . . . . . 14
6.7. BackwardCompatible (G) . . . . . . . . . . . . . . . . . . 16 6.7. BackwardCompatible (G) . . . . . . . . . . . . . . . . . . 15
6.8. JoinControl (H) . . . . . . . . . . . . . . . . . . . . . 17 6.8. JoinControl (H) . . . . . . . . . . . . . . . . . . . . . 16
6.9. OldHangulJamo (I) . . . . . . . . . . . . . . . . . . . . 17 6.9. OldHangulJamo (I) . . . . . . . . . . . . . . . . . . . . 16
6.10. Unassigned (J) . . . . . . . . . . . . . . . . . . . . . . 17 6.10. Unassigned (J) . . . . . . . . . . . . . . . . . . . . . . 16
6.11. ASCII7 (K) . . . . . . . . . . . . . . . . . . . . . . . . 18 6.11. ASCII7 (K) . . . . . . . . . . . . . . . . . . . . . . . . 17
6.12. Controls (L) . . . . . . . . . . . . . . . . . . . . . . . 18 6.12. Controls (L) . . . . . . . . . . . . . . . . . . . . . . . 17
6.13. PrecisIgnorableProperties (M) . . . . . . . . . . . . . . 18 6.13. PrecisIgnorableProperties (M) . . . . . . . . . . . . . . 17
6.14. Spaces (N) . . . . . . . . . . . . . . . . . . . . . . . . 18 6.14. Spaces (N) . . . . . . . . . . . . . . . . . . . . . . . . 17
6.15. Symbols (O) . . . . . . . . . . . . . . . . . . . . . . . 19 6.15. Symbols (O) . . . . . . . . . . . . . . . . . . . . . . . 17
6.16. Punctuation (P) . . . . . . . . . . . . . . . . . . . . . 19 6.16. Punctuation (P) . . . . . . . . . . . . . . . . . . . . . 18
6.17. HasCompat (Q) . . . . . . . . . . . . . . . . . . . . . . 19 6.17. HasCompat (Q) . . . . . . . . . . . . . . . . . . . . . . 18
7. Calculation of the Derived Property . . . . . . . . . . . . . 19 7. Calculation of the Derived Property . . . . . . . . . . . . . 18
8. Code Points . . . . . . . . . . . . . . . . . . . . . . . . . 20 8. Code Points . . . . . . . . . . . . . . . . . . . . . . . . . 19
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19
9.1. PRECIS Derived Property Value Registry . . . . . . . . . . 20 9.1. PRECIS Derived Property Value Registry . . . . . . . . . . 19
9.2. PRECIS Usage Registry . . . . . . . . . . . . . . . . . . 21 9.2. PRECIS Usage Registry . . . . . . . . . . . . . . . . . . 20
10. Security Considerations . . . . . . . . . . . . . . . . . . . 21
10.1. General Issues . . . . . . . . . . . . . . . . . . . . . . 21 10. Security Considerations . . . . . . . . . . . . . . . . . . . 20
10.2. Local Character Set Issues . . . . . . . . . . . . . . . . 22 10.1. General Issues . . . . . . . . . . . . . . . . . . . . . . 20
10.3. Visually Similar Characters . . . . . . . . . . . . . . . 22 10.2. Local Character Set Issues . . . . . . . . . . . . . . . . 21
10.4. Security of the SecretClass . . . . . . . . . . . . . . . 24 10.3. Visually Similar Characters . . . . . . . . . . . . . . . 21
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 24 10.4. Security of Passwords and Passphrases . . . . . . . . . . 23
12. Codepoints 0x0000 - 0x10FFFF . . . . . . . . . . . . . . . . . 25 11. Interoperability Considerations . . . . . . . . . . . . . . . 23
12.1. Codepoints in Unicode Character Database (UCD) format . . 25 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 24
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25 13. Codepoints 0x0000 - 0x10FFFF . . . . . . . . . . . . . . . . . 24
13.1. Normative References . . . . . . . . . . . . . . . . . . . 25 13.1. Codepoints in Unicode Character Database (UCD) format . . 24
13.2. Informative References . . . . . . . . . . . . . . . . . . 25 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 27 14.1. Normative References . . . . . . . . . . . . . . . . . . . 24
14.2. Informative References . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 26
1. Introduction 1. Introduction
A number of IETF application technologies use stringprep [RFC3454] as A number of IETF application technologies use stringprep [RFC3454] as
the basis for comparing protocol strings that contain Unicode the basis for comparing protocol strings that contain Unicode
characters or "code points" [UNICODE]. Since the publication of characters or "code points" [UNICODE]. Since the publication of
[RFC3454] in 2002, the Internet community has gained much more [RFC3454] in 2002, the Internet community has gained much more
experience with internationalization, some of it reflected in experience with internationalization, some of it reflected in
[RFC4690]. In particular, the IETF's technology for [RFC4690]. In particular, the IETF's technology for
internationalized domain names (IDNs) has changed significantly: internationalized domain names (IDNs) has changed significantly:
skipping to change at page 5, line 28 skipping to change at page 5, line 28
"customers" of stringprep to consider new approaches to the "customers" of stringprep to consider new approaches to the
preparation and comparison of internationalized strings ("PRECIS"), preparation and comparison of internationalized strings ("PRECIS"),
as described in [PROBLEM]. as described in [PROBLEM].
This document proposes a technical framework for a post-stringprep This document proposes a technical framework for a post-stringprep
approach to the preparation and comparison of internationalized approach to the preparation and comparison of internationalized
strings in application protocols. The framework is based on several strings in application protocols. The framework is based on several
principles: principles:
1. Define a small set of base string classes appropriate for common 1. Define a small set of base string classes appropriate for common
application protocol constructs such as usernames, passwords, and application protocol constructs such as usernames and free-form
free-form identifiers. strings.
2. Define each base string class in terms of Unicode code points and 2. Define each base string class in terms of Unicode code points and
their properties, specifying whether each code point or character their properties, specifying whether each code point or character
category is valid, disallowed, or unassigned. category is valid, disallowed, or unassigned.
3. Enable application protocols to subclass the base string classes, 3. Enable application protocols to subclass the base string classes,
mainly to disallow particular code points that are currently mainly to disallow particular code points that are currently
disallowed in the relevant application protocol (e.g., characters disallowed in the relevant application protocol (e.g., characters
with special or reserved meaning, such as "@" and "/" when used with special or reserved meaning, such as "@" and "/" when used
as separators within identifiers). as separators within identifiers).
skipping to change at page 6, line 19 skipping to change at page 6, line 19
application protocols will be more predictable and coherent. application protocols will be more predictable and coherent.
Although this framework is similar to IDNA2008 and borrows some of Although this framework is similar to IDNA2008 and borrows some of
the character categories defined in [RFC5892], it defines additional the character categories defined in [RFC5892], it defines additional
string classes and character categories to meet the needs of common string classes and character categories to meet the needs of common
application protocols. application protocols.
2. Terminology 2. Terminology
Many important terms used in this document are defined in [PROBLEM], Many important terms used in this document are defined in [PROBLEM],
[I18N-TERMS], [RFC5890], and [UNICODE]. [RFC6365], [RFC5890], and [UNICODE].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
[RFC2119]. [RFC2119].
3. String Classes 3. String Classes
IDNA2008 essentially defines a base string class of internationalized IDNA2008 essentially defines a base string class of internationalized
domain name, although it does not use the term "string class". (This domain name, although it does not use the term "string class". (This
skipping to change at page 6, line 42 skipping to change at page 6, line 42
appropriate method to prepare domain names and hostnames.) appropriate method to prepare domain names and hostnames.)
We propose the following additional base string classes for use in We propose the following additional base string classes for use in
application protocols: application protocols:
NameClass: a sequence of letters, numbers, and symbols that is used NameClass: a sequence of letters, numbers, and symbols that is used
to identify or address a network entity such as a user, an to identify or address a network entity such as a user, an
account, a venue (e.g., a chatroom), an information source (e.g., account, a venue (e.g., a chatroom), an information source (e.g.,
a data feed), or a collection of data (e.g., a file). a data feed), or a collection of data (e.g., a file).
SecretClass: a sequence of letters, numbers, and symbols that is
used as a secret for access to some resource on a network (e.g., a
password or passphrase).
FreeClass: a sequence of letters, numbers, symbols, spaces, and FreeClass: a sequence of letters, numbers, symbols, spaces, and
other code points that is used for more expressive purposes in an other code points that is used for free-form strings, including
application protocol (e.g., a free-form identifier such as a passwords and passphrases as well as display elements such as a
human-friendly nickname in a chatroom). human-friendly nickname in a chatroom.
Note: [PROBLEM] mentions a class of "string blobs" containing Note: [PROBLEM] mentions a class of "string blobs" containing
"elements of the protocol that look like strings to users, but that "elements of the protocol that look like strings to users, but that
are passed around in the protocol unchanged and that cannot be used are passed around in the protocol unchanged and that cannot be used
for comparison or other purposes." It is an open question whether for comparison or other purposes." It is an open question whether
application protocols need to apply preparation and comparison rules application protocols need to apply preparation and comparison rules
to such strings. to such strings.
The following subsections discuss these string classes in more The following subsections discuss these string classes in more
detail, with reference to the dimensions described in Section 3 of detail, with reference to the dimensions described in Section 3 of
skipping to change at page 7, line 47 skipping to change at page 7, line 42
This document defines the valid, disallowed, and unassigned rules. This document defines the valid, disallowed, and unassigned rules.
Application protocols that use the PRECIS string classes MUST define Application protocols that use the PRECIS string classes MUST define
the directionality, casemapping, and normalization rules, as further the directionality, casemapping, and normalization rules, as further
described under Section 9.2. described under Section 9.2.
3.1. NameClass 3.1. NameClass
Most application technologies need a special class of strings that Most application technologies need a special class of strings that
can be used to refer to, include, or communicate things like can be used to refer to, include, or communicate things like
usernames, chatroom names, file names, and data feed names. We group usernames, file names, data feed names, and chatroom names. We group
such things into a bucket called "NameClass" having the following such things into a bucket called "NameClass" having the following
features. features.
3.1.1. Valid 3.1.1. Valid
o Letters and numbers, i.e., the LetterDigits ("A") category first o Letters and numbers, i.e., the LetterDigits ("A") category first
defined in [RFC5892] and listed here under Section 6.1. defined in [RFC5892] and listed here under Section 6.1.
o Code points in the range U+0021 through U+007E, i.e., the ASCII7 o Code points in the range U+0021 through U+007E, i.e., the ASCII7
("K") rule defined under Section 6.11. These code points are ("K") rule defined under Section 6.11. These code points are
valid even if they would otherwise be disallowed according to the valid even if they would otherwise be disallowed according to the
property-based rules specified in the next section. property-based rules specified in the next section.
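As an illustration only, the two Valid rules above can be sketched as a per-code-point test. The function names are hypothetical. The set of General_Category values is the LetterDigits definition from RFC 5892, Section 2.1, and the results depend on the Unicode version bundled with the Python runtime. The sketch covers only these two rules; it does not implement the Exceptions, Unstable, or other categories that the full calculation in Section 7 consults.

```python
import unicodedata

# General_Category values that make up LetterDigits (A) in RFC 5892, Section 2.1.
LETTER_DIGITS_GC = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def in_letter_digits(cp: int) -> bool:
    """True if the code point falls in the LetterDigits (A) category."""
    return unicodedata.category(chr(cp)) in LETTER_DIGITS_GC


def in_ascii7(cp: int) -> bool:
    """True if the code point is in U+0021 through U+007E, the ASCII7 (K) rule."""
    return 0x21 <= cp <= 0x7E


def passes_nameclass_valid_rules(cp: int) -> bool:
    """Hypothetical helper: applies only the two Valid rules listed above.

    ASCII7 is checked first because those code points are valid even if a
    property-based rule would otherwise disallow them.
    """
    return in_ascii7(cp) or in_letter_digits(cp)


if __name__ == "__main__":
    for ch in ["a", "@", "\u00e9", " ", "\u2603"]:
        print(f"U+{ord(ch):04X}", passes_nameclass_valid_rules(ord(ch)))
```

A space (U+0020) and a symbol such as U+2603 fail both rules, while "@" passes through the ASCII7 rule alone.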
3.1.2. Disallowed 3.1.2. Disallowed
o Control characters, i.e., the Controls ("L") category defined o Control characters, i.e., the Controls ("L") category defined
under Section 6.12. under Section 6.12.
o Space characters, i.e., the Spaces ("N") category defined under o Space characters, i.e., the Spaces ("N") category defined under
skipping to change at page 9, line 5 skipping to change at page 8, line 49
that uses or subclasses the NameClass. that uses or subclasses the NameClass.
3.1.6. Normalization 3.1.6. Normalization
The normalization form MUST be specified by each application protocol The normalization form MUST be specified by each application protocol
that uses or subclasses the NameClass. that uses or subclasses the NameClass.
However, in accordance with [RFC5198], normalization form C (NFC) is However, in accordance with [RFC5198], normalization form C (NFC) is
RECOMMENDED. RECOMMENDED.
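The effect of the NFC recommendation on comparison can be sketched as follows. This is a minimal illustration that assumes an application protocol has chosen NFC; the helper names are hypothetical and the sketch applies no casemapping or directionality rule.

```python
import unicodedata


def to_nfc(s: str) -> str:
    """Return the string in Unicode normalization form C."""
    return unicodedata.normalize("NFC", s)


def names_match(a: str, b: str) -> bool:
    """Hypothetical comparison: two names match if their NFC forms are equal."""
    return to_nfc(a) == to_nfc(b)


if __name__ == "__main__":
    composed = "ren\u00e9"      # e with acute as a single code point
    decomposed = "rene\u0301"   # e followed by a combining acute accent
    print(composed == decomposed)             # code point sequences differ
    print(names_match(composed, decomposed))  # NFC forms are identical
```

Without normalization the two spellings compare as different strings even though they render identically, which is the situation the recommendation is meant to avoid.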
3.2. SecretClass 3.2. FreeClass
Many application technologies need a special class of strings that Some application technologies need a special class of strings that
can be used to communicate secrets of the kind that are typically can be used in a free-form way (e.g., as a passphrase or a nickname
used as passwords or passphrases. We group such things into a bucket in a chatroom). We group such things into a bucket called
called "SecretClass" having the following features. "FreeClass" having the following features.
NOTE: Consult Section 10.4 for relevant security considerations. NOTE: Consult Section 10.4 for relevant security considerations when
strings conforming to the FreeClass, or a subclass thereof, are used
as passwords or passphrases.
3.2.1. Valid 3.2.1. Valid
o Letters and numbers, i.e., the LetterDigits ("A") category first o Letters and numbers, i.e., the LetterDigits ("A") category first
defined in [RFC5892] and listed here under Section 6.1. defined in [RFC5892] and listed here under Section 6.1.
o Code points in the range U+0021 through U+007E, i.e., the ASCII7 o Code points in the range U+0021 through U+007E, i.e., the ASCII7
("K") rule defined under Section 6.11. These code points are
valid even if they would otherwise be disallowed according to the
property-based rules specified in the next section.
o Any character that has a compatibility equivalent, i.e., the
HasCompat ("Q") category defined under Section 6.17.
o Symbol characters, i.e., the Symbols ("O") category defined under
Section 6.15.
o Punctuation characters, i.e., the Punctuation ("P") category
defined under Section 6.16.
3.2.2. Disallowed
o Control characters, i.e., the Controls ("L") category defined
under Section 6.12.
o Space characters, i.e., the Spaces ("N") category defined under
Section 6.14.
3.2.3. Unassigned
Any code points that are not yet assigned in the Unicode character
set SHALL be considered Unassigned for purposes of the SecretClass.
3.2.4. Directionality
The directionality rule MUST be specified by each application
protocol that uses or subclasses the SecretClass.
3.2.5. Case Mapping
The casemapping rule MUST be specified by each application protocol
that uses or subclasses the SecretClass.
However, in order to maximize the entropy of passwords and
passphrases, it is NOT RECOMMENDED for application protocols to map
uppercase and titlecase code points to their lowercase equivalents;
instead, it is RECOMMENDED to preserve the case of all code points
contained in strings that conform to or subclass the SecretClass.
3.2.6. Normalization
The normalization form MUST be specified by each application protocol
that uses or subclasses the SecretClass.
However, in accordance with [RFC5198], normalization form C (NFC) is
RECOMMENDED.
3.3. FreeClass
Some application technologies need a special class of strings that
can be used in a free-form way (e.g., a nickname in a chatroom). We
group such things into a bucket called "FreeClass" having the
following features.
3.3.1. Valid
o Letters and numbers, i.e., the LetterDigits ("A") category first
defined in [RFC5892] and listed here under Section 6.1.
o Code points in the range U+0021 through U+007E, i.e., the ASCII7
("K") rule defined under Section 6.11. ("K") rule defined under Section 6.11.
o Any character that has a compatibility equivalent, i.e., the o Any character that has a compatibility equivalent, i.e., the
HasCompat ("Q") category defined under Section 6.17. HasCompat ("Q") category defined under Section 6.17.
o Space characters, i.e., the Spaces ("N") category defined under o Space characters, i.e., the Spaces ("N") category defined under
Section 6.14. Section 6.14.
o Symbol characters, i.e., the Symbols ("O") category defined under o Symbol characters, i.e., the Symbols ("O") category defined under
Section 6.15. Section 6.15.
o Punctuation characters, i.e., the Punctuation ("P") category o Punctuation characters, i.e., the Punctuation ("P") category
defined under Section 6.16. defined under Section 6.16.
3.3.2. Disallowed 3.2.2. Disallowed
o Control characters, i.e., the Controls ("L") category defined o Control characters, i.e., the Controls ("L") category defined
under Section 6.12. under Section 6.12.
3.3.3. Unassigned 3.2.3. Unassigned
Any code points that are not yet assigned in the Unicode character Any code points that are not yet assigned in the Unicode character
set SHALL be considered Unassigned for purposes of the FreeClass. set SHALL be considered Unassigned for purposes of the FreeClass.
3.3.4. Directionality 3.2.4. Directionality
The directionality rule MUST be specified by each application The directionality rule MUST be specified by each application
protocol that uses or subclasses the FreeClass. protocol that uses or subclasses the FreeClass.
3.3.5. Case Mapping 3.2.5. Case Mapping
The casemapping rule MUST be specified by each application protocol The casemapping rule MUST be specified by each application protocol
that uses or subclasses the FreeClass. that uses or subclasses the FreeClass.
3.3.6. Normalization In order to maximize entropy, it is NOT RECOMMENDED for application
protocols to map uppercase and titlecase code points to their
lowercase equivalents when strings conforming to the FreeClass, or a
subclass thereof, are used in passwords or passphrases; instead, it
is RECOMMENDED to preserve the case of all code points contained in
such strings.
3.2.6. Normalization
The normalization form MUST be specified by each application protocol The normalization form MUST be specified by each application protocol
that uses or subclasses the FreeClass. that uses or subclasses the FreeClass.
However, in accordance with [RFC5198], normalization form C (NFC) is However, in accordance with [RFC5198], normalization form C (NFC) is
RECOMMENDED. RECOMMENDED.
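For strings used as passwords or passphrases, the case mapping and normalization recommendations above combine as in the following sketch. It assumes an application protocol that follows both recommendations (preserve case, apply NFC); the helper names are hypothetical, and validation of code points against the class is out of scope here.

```python
import unicodedata


def prepare_passphrase(s: str) -> str:
    """Hypothetical preparation step: NFC only, with case left untouched."""
    return unicodedata.normalize("NFC", s)


def passphrases_match(a: str, b: str) -> bool:
    """Compare two passphrases after preparation."""
    return prepare_passphrase(a) == prepare_passphrase(b)


if __name__ == "__main__":
    # Different encodings of the same characters match after NFC.
    print(passphrases_match("Caf\u00e9 Noir", "Cafe\u0301 Noir"))
    # Case is preserved, so these remain distinct secrets.
    print(passphrases_match("Secret Phrase", "secret phrase"))
```

Keeping uppercase and lowercase distinct leaves more possible values per character, which is the entropy argument made in the case mapping text.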
4. Use of PRECIS String Classes 4. Use of PRECIS String Classes
4.1. Principles 4.1. Principles
skipping to change at page 12, line 32 skipping to change at page 11, line 26
relevant application protocol. relevant application protocol.
This document is not intended to specify precisely how derived This document is not intended to specify precisely how derived
property values are to be applied in protocol strings. That property values are to be applied in protocol strings. That
information should be defined in the protocol specification that uses information should be defined in the protocol specification that uses
or subclasses a base string class from this document. or subclasses a base string class from this document.
The value of the property is to be interpreted as follows. The value of the property is to be interpreted as follows.
PROTOCOL VALID Those code points that are allowed to be used in any PROTOCOL VALID Those code points that are allowed to be used in any
PRECIS string class (NameClass, SecretClass, and FreeClass). Code PRECIS string class (NameClass and FreeClass). Code points with
points with this property value are permitted for general use in this property value are permitted for general use in any string
any string class. The abbreviated term PVALID is used to refer to class. The abbreviated term PVALID is used to refer to this value
this value in the remainder of this document. in the remainder of this document.
SPECIFIC CLASS PROTOCOL VALID Those code points that are allowed to SPECIFIC CLASS PROTOCOL VALID Those code points that are allowed to
be used in specific string classes. Code points with this be used in specific string classes. Code points with this
property value are permitted for use in specific string classes. property value are permitted for use in specific string classes.
In the remainder of this document, the abbreviated term *_PVALID In the remainder of this document, the abbreviated term *_PVALID
is used, where * = (NAMECLASS | SECRETCLASS | FREECLASS). is used, where * = (NAMECLASS | SECRETCLASS | FREECLASS).
CONTEXTUAL RULE REQUIRED Some characteristics of the character, such CONTEXTUAL RULE REQUIRED Some characteristics of the character, such
as its being invisible in certain contexts or problematic in as its being invisible in certain contexts or problematic in
others, require that it not be used in labels unless specific others, require that it not be used in labels unless specific
skipping to change at page 20, line 37 skipping to change at page 19, line 30
or SECRETCLASS_DISALLOWED or SECRETCLASS_DISALLOWED
or FREECLASS_VALID; or FREECLASS_VALID;
Else If .cp. .in. HasCompat Then NAMECLASS_DISALLOWED Else If .cp. .in. HasCompat Then NAMECLASS_DISALLOWED
or SECRETCLASS_VALID or SECRETCLASS_VALID
or FREECLASS_VALID; or FREECLASS_VALID;
Else DISALLOWED; Else DISALLOWED;
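The last two branches of the rules shown above can be sketched as follows. This is a partial illustration: the earlier branches of the calculation are not reproduced, so the function is only meaningful for code points that no earlier rule has matched. The HasCompat test used here (the NFKC form of the code point differs from the code point itself) is an assumption, since Section 6.17 is not reproduced in this excerpt, and the function names are hypothetical.

```python
import unicodedata


def has_compat(cp: int) -> bool:
    # ASSUMPTION: HasCompat (Q) is taken to mean toNFKC(cp) != cp.
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch


def final_branches(cp: int) -> dict:
    """Hypothetical helper covering only the HasCompat and Else branches."""
    if has_compat(cp):
        return {
            "NAMECLASS": "DISALLOWED",
            "SECRETCLASS": "VALID",
            "FREECLASS": "VALID",
        }
    # Else DISALLOWED: no class accepts the code point.
    return {
        "NAMECLASS": "DISALLOWED",
        "SECRETCLASS": "DISALLOWED",
        "FREECLASS": "DISALLOWED",
    }


if __name__ == "__main__":
    print(final_branches(0xFB01))  # LATIN SMALL LIGATURE FI, NFKC form is "fi"
    print(final_branches(0x0001))  # a control code, unchanged by NFKC
```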
8. Code Points 8. Code Points
The Categories and Rules defined in Section 6 and Section 7 apply to The Categories and Rules defined in Section 6 and Section 7 apply to
all Unicode code points. The table in Section 12 shows, for all Unicode code points. The table in Section 13 shows, for
illustrative purposes, the consequences of the categories and illustrative purposes, the consequences of the categories and
classification rules, and the resulting property values. classification rules, and the resulting property values.
The list of code points that can be found in Section 12 is non- The list of code points that can be found in Section 13 is non-
normative. Instead, the rules defined by Section 6 and Section 7 are normative. Instead, the rules defined by Section 6 and Section 7 are
normative, and any tables are derived from the rules. normative, and any tables are derived from the rules.
9. IANA Considerations 9. IANA Considerations
9.1. PRECIS Derived Property Value Registry 9.1. PRECIS Derived Property Value Registry
IANA is requested to create a PRECIS-specific registry with the IANA is requested to create a PRECIS-specific registry with the
Derived Properties for the versions of Unicode that are released Derived Properties for the versions of Unicode that are released
after (and including) version 6.0. The derived property value is to after (and including) version 6.0. The derived property value is to
be calculated in cooperation with a designated expert [RFC5226] be calculated in cooperation with a designated expert [RFC5226]
according to the specifications in Section 6 and Section 7, and not according to the specifications in Section 6 and Section 7, and not
by copying the non-normative table found in Section 12. by copying the non-normative table found in Section 13.
If during this process (creation of the table of derived property If during this process (creation of the table of derived property
values) followed by a designated expert review, either backward- values) followed by a designated expert review, either backward-
incompatible changes to the table of derived properties are incompatible changes to the table of derived properties are
discovered, or otherwise problems during the creation of the table discovered, or otherwise problems during the creation of the table
arise, that is to be flagged to the IESG. Changes to the rules (as arise, that is to be flagged to the IESG. Changes to the rules (as
specified in Section 6 and Section 7) require IETF Review, as specified in Section 6 and Section 7) require IETF Review, as
described in [RFC5226]. described in [RFC5226].
9.2. PRECIS Usage Registry 9.2. PRECIS Usage Registry
skipping to change at page 24, line 13 skipping to change at page 23, line 8
because and most scripts are typically contained in one or more because and most scripts are typically contained in one or more
blocks of characters, the software SHOULD warn the user when blocks of characters, the software SHOULD warn the user when
presenting a string that mixes characters from more than one presenting a string that mixes characters from more than one
script or block, or that uses characters outside the normal range script or block, or that uses characters outside the normal range
of the user's preferred language(s). (Such a recommendation is of the user's preferred language(s). (Such a recommendation is
not intended to discourage communication across different not intended to discourage communication across different
communities of language users; instead, it recognizes the communities of language users; instead, it recognizes the
existence of such communities and encourages due caution when existence of such communities and encourages due caution when
presenting unfamiliar scripts or characters to human users.) presenting unfamiliar scripts or characters to human users.)
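The mixed-script warning described above might be approximated as follows. This is a crude sketch under a loud assumption: the Python standard library does not expose the Unicode Script property, so the first word of each letter's character name (e.g., "LATIN", "CYRILLIC") is used as a stand-in for its script. That heuristic is wrong for some characters, and a real implementation would consult the Script property from the Unicode Character Database. The function names are hypothetical.

```python
import unicodedata

def letter_scripts(s: str) -> set:
    # Collect a rough script label for every letter in the string.
    scripts = set()
    for ch in s:
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            if name:
                # Heuristic: first word of the character name.
                scripts.add(name.split()[0])
    return scripts

def should_warn(s: str) -> bool:
    # Warn when letters from more than one (approximate) script appear.
    return len(letter_scripts(s)) > 1
```

Under this heuristic, a string made up only of Latin letters yields a single label, whereas the same string with U+0430 (CYRILLIC SMALL LETTER A) substituted for "a" yields two and would trigger the warning.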
10.4. Security of the SecretClass 10.4. Security of Passwords and Passphrases
One goal of passwords and passphrases is to maximize the amount of One goal of passwords and passphrases is to maximize the amount of
entropy, for example by allowing a wide range of code points and by entropy, for example by allowing a wide range of code points and by
ensuring that secrets are not prepared in such a way that code points ensuring that secrets are not prepared in such a way that code points
are compared aggressively. Therefore, it is NOT RECOMMENDED for are compared aggressively. Therefore, it is NOT RECOMMENDED for
application protocols to subclass the SecretClass in a way that application protocols to subclass the FreeClass for use in passwords
removes entire categories (e.g., by disallowing symbols or and passphrases in a way that removes entire categories (e.g., by
punctuation). Furthermore, it is NOT RECOMMENDED for application disallowing symbols or punctuation). Furthermore, it is NOT
protocols to map uppercase and titlecase code points to their RECOMMENDED for application protocols to map uppercase and titlecase
lowercase equivalents; instead, it is RECOMMENDED to preserve the code points to their lowercase equivalents in such strings; instead,
case of all code points contained in strings that conform to or it is RECOMMENDED to preserve the case of all code points contained
subclass the SecretClass. in such strings.
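The effect of case mapping on entropy can be illustrated with a small sketch. The helper names are hypothetical, and Python's str.lower() is used only as a stand-in for whatever case mapping a protocol might apply. The point is that distinct secrets collapse to a single value once case is mapped away.

```python
def prepare_preserving_case(secret: str) -> str:
    # Recommended behaviour: leave the case of every code point alone.
    return secret

def prepare_with_case_mapping(secret: str) -> str:
    # NOT RECOMMENDED for secrets: stand-in for a lowercase mapping.
    return secret.lower()

def distinct_after(prepare, secrets) -> int:
    # Number of distinguishable secrets left after preparation.
    return len({prepare(s) for s in secrets})

candidates = ["Tr0ub4dor", "tr0ub4dor", "TR0UB4DOR"]
preserved = distinct_after(prepare_preserving_case, candidates)
mapped = distinct_after(prepare_with_case_mapping, candidates)
```

Here the three candidates remain distinct when case is preserved but become indistinguishable after case mapping, which shrinks the space an attacker has to search.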
That said, software implementers need to be aware that there exist That said, software implementers need to be aware that there exist
tradeoffs between entropy and usability. For example, allowing a tradeoffs between entropy and usability. For example, allowing a
user to establish a password containing "uncommon" code points might user to establish a password containing "uncommon" code points might
make it difficult for the user to access an application when using an make it difficult for the user to access an application when using an
unfamiliar or constrained input device. unfamiliar or constrained input device.
Some application protocols use passwords and passphrases directly, Some application protocols use passwords and passphrases directly,
whereas others reuse technologies that themselves process passwords whereas others reuse technologies that themselves process passwords
(one example is the Simple Authentication and Security Layer (one example is the Simple Authentication and Security Layer
[RFC4422]). Moreover, passwords are often carried by a sequence of [RFC4422]). Moreover, passwords are often carried by a sequence of
protocols with backend authentication systems or data storage protocols with backend authentication systems or data storage
systems such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of systems such as RADIUS [RFC2865] and LDAP [RFC4510]. Developers of
application protocols are encouraged to look into reusing these application protocols are encouraged to look into reusing these
profiles instead of defining new ones, so that end-user expectations profiles instead of defining new ones, so that end-user expectations
about passwords are consistent no matter which application protocol about passwords are consistent no matter which application protocol
is used. is used.
11. Acknowledgements 11. Interoperability Considerations
Although strings that are consumed in PRECIS-based application
protocols are often encoded using UTF-8 [RFC3629], the exact encoding
is a matter for the using protocol, not the PRECIS framework.
It is known that some existing systems are unable to support the full
Unicode character set, or even any characters outside the US-ASCII
range. If two (or more) applications need to interoperate when
exchanging data (e.g., for the purpose of authenticating a username
or password), they will naturally need to have in common at least one
coded character set (as defined by [RFC6365]). Establishing such a
baseline is a matter for the using protocol, not the PRECIS
framework.
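As an illustration only, a using protocol might check for a shared coded character set and test whether a given string is representable in it. The function names are hypothetical, and Python codec names ("ascii", "utf-8") stand in for whatever labels a real protocol would negotiate. None of this is defined by the PRECIS framework.

```python
def common_charsets(ours, theirs):
    # Charsets both peers support, in our order of preference.
    return [c for c in ours if c in theirs]

def representable(s: str, charset: str) -> bool:
    # True if every character of s can be encoded in the charset.
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

shared = common_charsets(["utf-8", "ascii"], ["ascii"])
```

With only US-ASCII in common, a username such as "juliet" can be exchanged, but one containing U+00E9 cannot. The peers would have to reject such a string or agree on a wider character set.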
12. Acknowledgements
The authors would like to acknowledge the comments and contributions The authors would like to acknowledge the comments and contributions
of the following individuals: David Black, Mark Davis, Alan DeKok, of the following individuals: David Black, Mark Davis, Alan DeKok,
Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Paul Martin Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Paul
Hoffman, Jeffrey Hutzelman, Simon Josefsson, John Klensin, Alexey Hoffman, Jeffrey Hutzelman, Simon Josefsson, John Klensin, Alexey
Melnikov, Pete Resnick, Andrew Sullivan, and Dave Thaler. Melnikov, Yoav Nir, Mike Parker, Pete Resnick, Andrew Sullivan, Dave
Thaler, and Yoshiro Yoneya.
Some algorithms and textual descriptions have been borrowed from Some algorithms and textual descriptions have been borrowed from
[RFC5892]. Some text regarding security has been borrowed from [RFC5892]. Some text regarding security has been borrowed from
[RFC5890] and [XMPP-ADDR]. [RFC5890] and [XMPP-ADDR].
12. Codepoints 0x0000 - 0x10FFFF 13. Codepoints 0x0000 - 0x10FFFF
To follow. To follow.
12.1. Codepoints in Unicode Character Database (UCD) format 13.1. Codepoints in Unicode Character Database (UCD) format
To follow. To follow.
13. References 14. References
13.1. Normative References 14.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network
Interchange", RFC 5198, March 2008. Interchange", RFC 5198, March 2008.
[UNICODE] The Unicode Consortium, "The Unicode Standard, Version [UNICODE] The Unicode Consortium, "The Unicode Standard, Version
6.0", 2010, 6.0", 2010,
<http://www.unicode.org/versions/Unicode6.0.0/>. <http://www.unicode.org/versions/Unicode6.0.0/>.
13.2. Informative References 14.2. Informative References
[I18N-TERMS]
Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF",
draft-ietf-appsawg-rfc3536bis-06 (work in progress),
July 2011.
[IDENTIFIER] [IDENTIFIER]
Thaler, D., "Issues in Identifier Comparison for Security Thaler, D., "Issues in Identifier Comparison for Security
Purposes", draft-iab-identifier-comparison-00 (work in Purposes", draft-iab-identifier-comparison-00 (work in
progress), July 2011. progress), July 2011.
[PROBLEM] Blanchet, M. and A. Sullivan, "Stringprep Revision Problem [PROBLEM] Blanchet, M. and A. Sullivan, "Stringprep Revision Problem
Statement", draft-ietf-precis-problem-statement-03 (work Statement", draft-ietf-precis-problem-statement-03 (work
in progress), July 2011. in progress), July 2011.
skipping to change at page 26, line 13 skipping to change at page 25, line 21
RFC 2865, June 2000. RFC 2865, June 2000.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454, Internationalized Strings ("stringprep")", RFC 3454,
December 2002. December 2002.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)", "Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003. RFC 3490, March 2003.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, November 2003.
[RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and
Security Layer (SASL)", RFC 4422, June 2006. Security Layer (SASL)", RFC 4422, June 2006.
[RFC4510] Zeilenga, K., "Lightweight Directory Access Protocol [RFC4510] Zeilenga, K., "Lightweight Directory Access Protocol
(LDAP): Technical Specification Road Map", RFC 4510, (LDAP): Technical Specification Road Map", RFC 4510,
June 2006. June 2006.
[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
Recommendations for Internationalized Domain Names Recommendations for Internationalized Domain Names
(IDNs)", RFC 4690, September 2006. (IDNs)", RFC 4690, September 2006.
skipping to change at page 27, line 5 skipping to change at page 26, line 17
RFC 5893, August 2010. RFC 5893, August 2010.
[RFC5894] Klensin, J., "Internationalized Domain Names for [RFC5894] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Background, Explanation, and Applications (IDNA): Background, Explanation, and
Rationale", RFC 5894, August 2010. Rationale", RFC 5894, August 2010.
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
Internationalized Domain Names in Applications (IDNA) Internationalized Domain Names in Applications (IDNA)
2008", RFC 5895, September 2010. 2008", RFC 5895, September 2010.
[RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF", BCP 166, RFC 6365,
September 2011.
[UAX15] The Unicode Consortium, "Unicode Standard Annex #15: [UAX15] The Unicode Consortium, "Unicode Standard Annex #15:
Unicode Normalization Forms", September 2010, Unicode Normalization Forms", September 2010,
<http://unicode.org/reports/tr15/>. <http://unicode.org/reports/tr15/>.
[UAX9] The Unicode Consortium, "Unicode Standard Annex #9: [UAX9] The Unicode Consortium, "Unicode Standard Annex #9:
Unicode Bidirectional Algorithm", September 2010, Unicode Bidirectional Algorithm", September 2010,
<http://unicode.org/reports/tr9/>. <http://unicode.org/reports/tr9/>.
[UTR36] The Unicode Consortium, "Unicode Technical Report #36: [UTR36] The Unicode Consortium, "Unicode Technical Report #36:
Unicode Security Considerations", August 2010, Unicode Security Considerations", August 2010,
 End of changes. 37 change blocks. 
160 lines changed or deleted 120 lines changed or added
