IPP> ASIAN languages in IPP.

Mon Nov 3 14:59:09 EST 1997

Dear IPP members,

 I'm still concerned about which language should be used when a text attribute
contains mixed languages, such as: "%%[PrinterError:Offending command while
printing file ******.ps%%]" (please assume ****** is a Japanese kanji string,
anything you like, such as "TEMPURA", "FUJIYAMA" or "GEISHA"). Should it be
English, Japanese, or do we not have to care?

 I have another concern about using Unicode in a multi-language environment.
I know it is an IPP client/browser issue more than a protocol issue,
but it is important for Asians like me.

 We have at least three kanji codes: Chinese, Japanese, and Korean. But
in the ISO 10646-1 (UCS-4) specification, most of them were combined into
a single unified set, called the "CJK character set".
 The problem is that some kanji characters in CJK look similar but have
different "faces" (glyph shapes) depending on the language the character
"belongs to".

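 As a small illustration (a Python sketch, not part of IPP; U+76F4 is just
one commonly cited example of a unified character whose usual shape differs
between Japanese and Chinese fonts, chosen here as an assumption), the
character has one code point and one name, and nothing in the string itself
records which language it belongs to:

```python
import unicodedata

# U+76F4: an illustrative unified ideograph (assumed example, see above).
ch = "\u76f4"

# The code point and its Unicode name are the same whatever language the
# surrounding text is written in; the string carries no language tag.
print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

 Which "face" appears on screen is therefore decided by the font the client
picks, not by the character data.
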
 In extreme cases, one string can include several languages like:

The document named "Woo Hoi Chang" was printed from "Aoyama Tokyo".
                    ~~~~~~~~~~~~~                    ~~~~~~~~~~~~
                   (Chinese Kanji)                  (Japanese Kanji)

 In that case, even RFC2069 (adding language information to each string)
is not enough. Still less the current version of IPP, which can carry only
one language setting for all text attributes within a session.
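
 To make the difference concrete, here is a minimal sketch of what tagging
each part of a string would look like. The `TextRun` type and the
`languages_used` helper are hypothetical names for illustration only; nothing
like them is defined in IPP. The kanji are romanized as in the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextRun:
    """A piece of text plus the language it belongs to (hypothetical type)."""
    text: str
    lang: str  # RFC 1766 style tag, e.g. "en", "zh", "ja"


def languages_used(runs: List[TextRun]) -> List[str]:
    """Return the distinct language tags in order of first appearance."""
    seen: List[str] = []
    for run in runs:
        if run.lang not in seen:
            seen.append(run.lang)
    return seen


# The example sentence, split into runs that each carry their own language.
message = [
    TextRun('The document named "', "en"),
    TextRun("Woo Hoi Chang", "zh"),
    TextRun('" was printed from "', "en"),
    TextRun("Aoyama Tokyo", "ja"),
    TextRun('".', "en"),
]

print(languages_used(message))  # ['en', 'zh', 'ja']
```

 One string needs three tags here, while a single language per string (or
per session) can record only one of them.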

In HTML 4.0, the "LANG" attribute is defined, so we can write something like:

The document named <SPAN LANG="zh">Woo Hoi Chang</SPAN> was printed from
<SPAN LANG="ja">Aoyama Tokyo</SPAN>.

 But I don't want to use HTML as the IPP 1.0 presentation layer; it's too
heavy for clients to implement.

 Practically, we Asians can tell what a word means even if the details
are slightly different (just as you can tell that "colour" is the same word
as "color").
 And I think we will handle the CJK differences by "assuming the native
language". In the case above, all kanji will be displayed as "Japanese kanji"
in Japan, and as "Chinese kanji" in China.
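
 The "assume the native language" rule could be sketched as below. The
function name and the optional per-string tag are assumptions for
illustration; IPP defines neither.

```python
from typing import Optional


def display_language(tagged_lang: Optional[str], native_lang: str) -> str:
    """Pick the language whose kanji shapes a client would display.

    If the text carries its own language tag, use it; otherwise fall back
    to the client's native language (the implementation-dependent default).
    """
    return tagged_lang if tagged_lang else native_lang


# The same untagged string is shown with Japanese shapes in Japan and
# with Chinese shapes in China.
print(display_language(None, "ja"))  # ja
print(display_language(None, "zh"))  # zh

# A tagged string would keep its own language wherever it is displayed.
print(display_language("zh", "ja"))  # zh
```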

 But the problem still remains, especially for personal names and place
names. We have to know the EXACTLY CORRECT kanji to identify a particular
person or place, mostly for historical reasons. As in English, "Colour"
and "Color" are the same, but "Kristen" and "Cristen" are definitely
different.
 Unfortunately, we don't have a standard method for using CJK in a
multi-language environment (except HTML4). Even in a single language (e.g.
Japanese), we are still struggling to fit too many characters into the
limited capacity of Unicode CJK.

 Do you think it is okay to use the "native language" as the default language
for handling CJK characters (in other words, "depends on implementation")?
 I think we have no alternative. This will spoil exact international
interoperability in IPP, but the problem is rooted in Unicode CJK itself
and is not a matter for IPP. I hope a future version of IPP (and Unicode)
will solve this problem.

 Sorry for persisting with this issue and (I guess) confusing you.
But I'm afraid IETF people may point out that it is unclear how CJK
characters are handled in the IPP specification.

Well, it is clear. Just say, "Depends on implementation" ;-).

Sincerely,
--------
Yuji Sasaki
E-Mail:sasaki at jci.co.jp