CR> FW: GB 18030 Information Required

McDonald, Ira imcdonald at sharplabs.com
Mon Mar 3 16:38:58 EST 2003


Hi Elliott,

Yes - GB18030 maps EVERY codepoint in Unicode (not just
the assigned ones, but all 1.1 million possible Unicode codepoints).
It is a multi-byte, variable-length encoding: each codepoint takes
one, two, or four bytes in GB18030.
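
As a rough illustration of the variable-length point, here is a minimal
sketch that assumes Python 3 and its built-in "gb18030" codec (neither is
mentioned in this thread; the sample characters and the helper name
gb18030_lengths are picked purely for illustration). It reports how many
bytes a few codepoints occupy once encoded:

```python
# Sketch only: sample characters chosen to hit the different byte widths.
SAMPLES = ["A", "中", "ß", "\U0001F600"]


def gb18030_lengths(chars):
    """Return {char: number of bytes in its GB18030 encoding}."""
    return {ch: len(ch.encode("gb18030")) for ch in chars}


lengths = gb18030_lengths(SAMPLES)
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

ASCII stays at one byte, characters inherited from GBK take two, and
everything else (including codepoints beyond the BMP) takes four.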

As Markus Scherer says, it is best thought of as a Chinese-market
UTF (Unicode Transformation Format), like UTF-8, UTF-16, and UTF-32.

I therefore agree with you that the PWG CR should view GB18030 as a
valid 'charset' (which can be tagged) but NOT as a unique
'repertoire' (because it's a different encoding of Unicode).
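
To make the charset-versus-repertoire distinction concrete, the sketch
below (again assuming Python 3 and its built-in codecs; the sample string
and the helper name code_points are illustrative only) encodes one string
under two charset labels. The byte streams differ, but decoding each with
its own label recovers the same Unicode codepoints:

```python
# Sketch only: one string, two charset labels, one underlying repertoire.
text = "GB 18030: 中文"

as_gb18030 = text.encode("gb18030")
as_utf8 = text.encode("utf-8")


def code_points(data, charset):
    """Decode bytes with the given charset label; return the codepoints."""
    return [ord(c) for c in data.decode(charset)]


same_repertoire = (
    code_points(as_gb18030, "gb18030") == code_points(as_utf8, "utf-8")
)
print("bytes differ:", as_gb18030 != as_utf8)
print("codepoints match:", same_repertoire)
```

This is why the charset tag has to travel with the data, while the
repertoire needs no separate entry.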

Cheers,
- Ira McDonald
  High North Inc


-----Original Message-----
From: ElliottBradshaw at oaktech.com [mailto:ElliottBradshaw at oaktech.com]
Sent: Monday, March 03, 2003 11:32 AM
To: McDonald, Ira
Cc: 'cr at pwg.org'; owner-cr at pwg.org
Subject: Re: CR> FW: GB 18030 Information Required



Interesting.

If I read this correctly, then 18030 is a mapping to ALL of Unicode.  This
would make it an encoding, but not a subset.

If that's right, then we would treat it as a kind of charset, but not as a
repertoire.

Your thoughts?

  E.


------------------------------------------
Elliott Bradshaw
Director, Software Engineering
Oak Technology Imaging Group
781 638-7534


From: "McDonald, Ira" <imcdonald at sharplabs.com>
Sent by: owner-cr at pwg.org
Sent: 03/03/2003 11:42 AM
To: "'cr at pwg.org'" <cr at pwg.org>
cc:
Subject: CR> FW: GB 18030 Information Required


Hi folks,

Elliott - the first two white papers (links below) look highly
useful.  Markus Scherer is a Unicode and charsets heavy at IBM.

Cheers,
- Ira McDonald
  High North Inc


-----Original Message-----
From: Markus Scherer [mailto:markus.scherer at jtcsv.com]
Sent: Monday, March 03, 2003 10:26 AM
To: vinay.aggarwal at rebus.co.in; charsets
Subject: Re: GB 18030 Information Required


vinay.aggarwal at rebus.co.in wrote:
> Could you please let me know whether the following support GB18030?
> - Any web based application
> - Browser (Internet Explorer/ Netscape) based application

Yes and no. Generally, web-based applications and browsers and related
protocols do support GB 18030 and Unicode and various other charsets.

Specifically, you need to read about
- charsets, e.g.,
http://oss.software.ibm.com/icu/docs/papers/codepages_and_unicode.html
- GB 18030, e.g., http://oss.software.ibm.com/icu/docs/papers/gb18030.html
- Unicode, e.g., http://www.unicode.org/standard/WhatIsUnicode.html

and about the particular applications (and versions of them) that you
intend to use.

markus





