CR> Links to Charset Maps

McDonald, Ira imcdonald at sharplabs.com
Wed Dec 11 19:05:48 EST 2002


Hi folks,

Per my action item from today's Character Repertoires
telecon, some useful links.

Cheers,
- Ira McDonald
  High North Inc

------------------------------------------------------

The best introduction and in-depth paper on UTF-8
and Unicode, from Markus Kuhn (a Unicode guru),
last updated 19 November 2002 (a living document):

http://www.cl.cam.ac.uk/~mgk25/unicode.html


IBM's open source ICU (International Components for Unicode)
has a wonderful set of charset maps. They're in
Unicode _to_ Legacy layout, the opposite of the ISO-8859
maps at the Unicode site, which are in Legacy _to_ Unicode
layout. The top-level URL:

http://www-124.ibm.com/cvs/icu/charset/data/
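
To make the two layouts concrete, here is a rough Python sketch
that reads a few lines in the Legacy _to_ Unicode layout and flips
them into a Unicode _to_ Legacy lookup. It assumes the simple
layout of the ISO-8859 tables at the Unicode site (two hex columns
separated by whitespace, then a "#" comment); the ICU files use
their own format, which this does not try to parse. The function
names and the three sample lines are only illustrative.

```python
def parse_legacy_to_unicode(text):
    """Parse lines such as '0xA0<TAB>0x00A0<TAB># NO-BREAK SPACE'.

    Returns a dict mapping legacy code -> Unicode code point.
    Assumes two whitespace-separated hex columns and '#' comments.
    """
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment (the character name) and whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # legacy code listed without a Unicode assignment
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def invert(table):
    """Turn a Legacy -> Unicode dict into a Unicode -> Legacy dict.

    If two legacy codes map to the same code point, the first one
    encountered is kept (an arbitrary choice made for this sketch).
    """
    inverse = {}
    for legacy, code_point in table.items():
        inverse.setdefault(code_point, legacy)
    return inverse


# Three sample lines in Legacy _to_ Unicode layout (ISO-8859-1 values).
SAMPLE = (
    "0x41\t0x0041\t#\tLATIN CAPITAL LETTER A\n"
    "0xA0\t0x00A0\t#\tNO-BREAK SPACE\n"
    "0xE9\t0x00E9\t#\tLATIN SMALL LETTER E WITH ACUTE\n"
)

legacy_to_unicode = parse_legacy_to_unicode(SAMPLE)
unicode_to_legacy = invert(legacy_to_unicode)
```

A converter going from Unicode out to a legacy charset wants the
second dict; one reading legacy data in wants the first.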


For official ISO and IETF language, country, and
script codes, visit Michael Everson's page (Michael
is the IETF language tag reviewer and a heavyweight in
ISO language standards and the Unicode Consortium):

http://www.evertype.com/internationalization.html


The complete up-to-date Unicode Character Database:

http://www.unicode.org/ucd/


The Unicode Unihan Asian database:

http://www.unicode.org/Public/UNIDATA/Unihan.zip (5.0 MB)

http://www.unicode.org/Public/UNIDATA/Unihan.txt (25.1 MB)


The Unicode Consortium maintained mapping tables:

http://www.unicode.org/Public/MAPPINGS/


And augmenting the mapping tables available from
the Unicode site is the CSets page:

http://crl.nmsu.edu/~mleisher/csets.html


