PWG-ANNOUNCE> Character Repertoires Charter and Last Call

Jun Fujisawa fujisawa.jun at canon.co.jp
Sat May 31 19:11:35 EDT 2003


Hello Elliott,

At 5:20 PM -0400 03.5.29, ElliottBradshaw at oaktech.com wrote:
>A Charter has been reviewed within the CR group and there are no open
>issues.
>
>It is available online at
>ftp://ftp.pwg.org/pub/pwg/cr/charter/ch-cr10-20030507.html.
>
>So today I begin a 10-day Last Call for comments on this document, prior to
>a formal vote by the PWG.

I feel a little uncomfortable with the following paragraph in the Charter.

>In Unicode and W3C specifications, the term "character set" usually
>refers to a method of encoding a (possibly very large) set of characters,
>e.g. UTF-8. This tells how to encode a given character if it is present,
>but doesn't define which characters in that space are actually in use.

In the Character Model for the World Wide Web specification, the W3C
explicitly discourages using the term "character set" to refer to a
method of encoding.

<http://www.w3.org/TR/charmod/>

>[S] Specifications SHOULD avoid using the terms 'character set' and
>'charset' to refer to a character encoding, except when the latter is used
>to refer to the MIME charset parameter or its IANA-registered values.
>The terms 'character encoding', 'character encoding form' or 'character
>encoding scheme' are RECOMMENDED.

I suggest changing the wording to something like the following:

In Unicode and W3C specifications, the term "character set" usually
refers to a (possibly very large) set of characters, e.g. ISO/IEC 10646.
The term "character set", however, can be confusing in some cases,
since the similar term "charset" is used as a MIME parameter, which
refers to the combination of "coded character set" and "character
encoding scheme", not just the former.
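
As a rough illustration of the distinction above, here is a minimal Python
sketch. It is not taken from the Charter or the W3C text; the sample
character, the list of encodings, and the helper name in_repertoire are
assumptions chosen only for this example. It shows one character with a
single code point in the coded character set, several byte sequences under
different character encoding schemes, and a simple repertoire check.

```python
# Hypothetical sample character chosen for illustration: HIRAGANA LETTER A.
ch = "\u3042"

# Coded character set view: the character is assigned exactly one code point.
code_point = ord(ch)

# Character encoding scheme view: the same character becomes different bytes
# depending on the scheme named by a MIME "charset" value.
encoded = {
    name: ch.encode(name).hex()
    for name in ("utf-8", "utf-16-be", "shift_jis", "euc_jp")
}


def in_repertoire(text, charset):
    """Return True if every character in text can be encoded in charset.

    Hypothetical helper: it treats "encodable" as "in the repertoire".
    """
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(f"U+{code_point:04X}", encoded)
print(in_repertoire(ch, "ascii"), in_repertoire(ch, "utf-8"))
```

Knowing that a device accepts UTF-8 says nothing about which characters it
can actually render, which is the gap a repertoire definition would fill.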


--
Jun Fujisawa
<mailto:fujisawa.jun at canon.co.jp>


