PMP Mail Archive: PMP> Re: JMP> RE: Localization in the Job MIB [and Printer

Tom Hastings (hastings@cp10.es.xerox.com)
Fri, 11 Jul 1997 09:52:18 PDT

The info in Ira's mail is more about the Printer MIB.

The current language for the Printer MIB for the unlocalized objects
is that the data SHALL be 7-bit US-ASCII [== ISO 646:1992 IRV].
The 8th bit MUST be 0. No codes from decimal 128 to 255 SHALL be
returned in a Get.
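
As a rough illustration of the rule above, a management application could
check a returned value along these lines. This is only a sketch in Python;
the function name is_seven_bit_ascii is made up for the example and is not
part of the Printer MIB or of any SNMP library.

```python
def is_seven_bit_ascii(value: bytes) -> bool:
    """Return True if every octet has the 8th (high-order) bit clear."""
    # Octets 0..127 have the high-order bit equal to 0; 128..255 do not.
    return all(octet < 128 for octet in value)


if __name__ == "__main__":
    print(is_seven_bit_ascii(b"Tray 1"))    # only codes below 128
    print(is_seven_bit_ascii(b"Caf\xe9"))   # contains octet 233 (0xE9)
```

An agent that follows the current wording would never return a value for
which this check fails.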

Is that a problem for anyone? If it is, then this is a significant
comment on the Printer MIB. We could relax this restriction (which
was the point of Ira's fixes, which were removed at the last minute).

It is quite possible that most people did not understand the significance
of removing Ira's changes: doing so forces most Printer MIB objects
(except for the ones under control of prtLocalization) back
to 7-bit ASCII with the 8th bit = 0.

Tom

>Return-Path: <jmp-owner@pwg.org>
>Date: Fri, 11 Jul 1997 06:49:13 PDT
>From: imcdonal@eso.mc.xerox.com (Ira Mcdonald x10962)
>To: bpenteco@boi.hp.com, hastings@cp10.es.xerox.com
>Subject: Re: JMP> RE: Localization in the Job MIB [and question about PJL]
>Cc: jmp@pwg.org
>Sender: jmp-owner@pwg.org
>
>Hi Bob and Harry,
>
>Tom, thanks for clarifying the point I've been trying to make for the
>last three months, to wit, "There is no such thing as 8-bit ASCII".
>
>Harry, Xerox will use private MIB objects (probably) to determine
>usable full 'static locale' for the strict ASCII objects in the
>printer MIB. Note that since the Printer MIB only says they shall
>be in ASCII, they may NOT contain any character with the high-order
>bit set (eg, all ISO 8859-n character sets are precluded, because
>all of their 'useful' non-ASCII characters are in the high end).
>
>Following the advice of RFC 2130, we expect to clarify (in private
>MIB objects, which we'll publish, if I can convince enough of
>the 'Intellectual Property' types around Xerox) that the TES
>(Transfer Encoding Syntax) of these Printer MIB strings is
>US-ASCII, that the CES (Character Encoding Scheme) is UTF-7
>(so-called 'mail-safe' transformation of UCS-2, in RFC 2152),
>and that the underlying CCS (Coded Character Set) is UCS-2
>(per ISO 10646). In the cases where Unicode is not suitable,
>we may have to use other underlying CCSs (and, of course,
>different CESs as well). If IANA registries provide enums for all
>of these, then we'll use them; otherwise, we'll define our
>own (private) enums.
>
>My FujiXerox compatriots are not very happy with the PWG right
>now, but we still have fully internationalized products to get
>out the door, so we won't wait for Printer MIB v3 to adopt
>a fix.
>
>Scott Isaacson, be careful that your implementors do NOT put
>either straight Unicode (UCS-2) or 8-bit UTF-8 into the
>'prtChannelInformation' object as data. As the Printer MIB v2
>is written, that is strictly forbidden. To use Unicode, you
>must fold it into UTF-7 (pure US-ASCII output).
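
A small sketch of the UTF-7 folding described above, using Python's built-in
"utf-7" codec. The helper names fold_to_utf7 and unfold_from_utf7 and the
sample string are invented for illustration; nothing here comes from the
Printer MIB itself.

```python
def fold_to_utf7(text: str) -> bytes:
    """Encode text as UTF-7 so that only 7-bit octets are produced."""
    encoded = text.encode("utf-7")
    # Sanity check: UTF-7 output should never have the high-order bit set.
    if any(octet >= 128 for octet in encoded):
        raise ValueError("unexpected 8-bit octet in UTF-7 output")
    return encoded


def unfold_from_utf7(data: bytes) -> str:
    """Recover the original text from its UTF-7 form."""
    return data.decode("utf-7")


if __name__ == "__main__":
    sample = "Gr\u00f6\u00dfe A4"       # contains two non-ASCII letters
    folded = fold_to_utf7(sample)
    print(folded)                        # pure US-ASCII octets
    print(unfold_from_utf7(folded) == sample)
```

A reader that does not know the value is UTF-7 will simply see the escaped
ASCII form, which is why the encoding in use would need to be advertised
somewhere (for example in the private objects mentioned above).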
>
>Feedback from any PWG members on localization issues is much
>appreciated.
>
>Cheers,
>- Ira McDonald (outside consultant at Xerox)
> High North Inc
> PO Box 221
> Grand Marais, MI 49839
> 906-494-2434
>
>----------------------------- Tom's note ----------------------------
>>From jmp-owner@pwg.org Thu Jul 10 23:52:29 1997
>Return-Path: <jmp-owner@pwg.org>
>Received: from zombi (zombi.eso.mc.xerox.com) by snorkel.eso.mc.xerox.com
> (4.1/XeroxClient-1.1)
> id AA11143; Thu, 10 Jul 97 23:52:29 EDT
>Received: from alpha.xerox.com by zombi (4.1/SMI-4.1)
> id AA20968; Thu, 10 Jul 97 23:49:29 EDT
>Received: from lists.underscore.com ([199.125.85.31]) by alpha.xerox.com
> with SMTP id <54246(6)>; Thu, 10 Jul 1997 20:49:18 PDT
>Received: from localhost (daemon@localhost) by lists.underscore.com
> (8.7.5/8.7.3) with SMTP id UAA18594 for <imcdonal@eso.mc.xerox.com>; Thu, 10
> Jul 1997 20:27:04 -0400 (EDT)
>Received: by pwg.org (bulk_mailer v1.5); Thu, 10 Jul 1997 20:26:11 -0400
>Received: (from daemon@localhost) by lists.underscore.com (8.7.5/8.7.3) id
> UAA18485 for jmp-outgoing; Thu, 10 Jul 1997 20:25:41 -0400 (EDT)
>Message-Id: <9707110020.AA05250@zazen.cp10.es.xerox.com>
>X-Sender: hastings@zazen
>X-Mailer: Windows Eudora Pro Version 2.1.2
>Mime-Version: 1.0
>Content-Type: text/plain; charset="us-ascii"
>Date: Thu, 10 Jul 1997 17:17:49 PDT
>To: Bob Pentecost <bpenteco@boi.hp.com>
>From: Tom Hastings <hastings@cp10.es.xerox.com>
>Subject: JMP> RE: Localization in the Job MIB [and question about PJL]
>Cc: jmp@pwg.org
>Sender: jmp-owner@pwg.org
>Status: R
>
>A quick lesson in coded character sets (I was the chairman of the ANSI
>X3L2 committee responsible for ASCII, and worked on ISO 8859 and ISO 10646):
>
>1. The coded character set called ASCII is defined to be the control
>characters 0 through 31 and the printing characters 32 (SPACE) through
>126. 127 is DEL (a control) and 128 to 255 are NOT part of the ASCII
>coded character set. So ASCII is really a 7-bit set that is usually
>embedded in an 8-bit transmission, in which the high-order bit SHALL be 0.
>
>Unfortunately, people use the term "ASCII" to mean any coded character set,
>as the PJL manual does. (Don't feel bad, you are not alone).
>
>2. ISO Latin-1 (ISO 8859-1) is one of the 8-bit coded character sets which
>define printing characters only: 32 to 126 (same as ASCII fortunately),
>and 160 to 255. The characters in 160-255 are accented letters and
>special symbols, such as pound and yen, more quotes, etc. There are
>at least 11 8-bit coded character sets defined by ISO 8859-n.
>The Windows default character set is ISO Latin-1 with additional
>characters filled in by Microsoft in the unspecified space: 128-159.
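
The code ranges in points 1 and 2 above can be summarized in a short Python
sketch. The function name classify_code and the label strings are made up
for this example; the ranges themselves are the ones given in the text.

```python
def classify_code(code: int) -> str:
    """Label a single octet value (0..255) using the ranges described above."""
    if not 0 <= code <= 255:
        raise ValueError("code must be in the range 0..255")
    if code <= 31:
        return "ASCII control"
    if code <= 126:
        return "ASCII printing"          # identical in ISO 8859-1
    if code == 127:
        return "ASCII DEL (control)"
    if code <= 159:
        return "not ASCII; unspecified in ISO 8859-1"
    return "not ASCII; ISO 8859-1 printing"


if __name__ == "__main__":
    for code in (9, 32, 126, 127, 128, 159, 160, 255):
        print(code, classify_code(code))
```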
>
>HP has an 8-bit coded character set that is similar to ISO Latin-1.
>(I've forgotten the name).
>
>Presumably, if a PJL printer gets an attribute with the 8th bit set,
>and that is the value of an attribute that the Printer prints on
>the banner page, such as the user's name, then some coded character
>set is being used for the codes in the range 128-255.
>
>So what does PJL do with characters greater than 127?
>Is it ISO Latin-1?
>Is it the HP 8-bit set?
>Is it the Windows default 8-bit set?
>Unspecified so it can be any?
>The administrator can set the default set locally for the printer?
>
>Thanks,
>Tom
>
>
>At 15:33 07/09/97 PDT, Bob Pentecost wrote:
>>Harry,
>>
>>You are very close to being correct. To quote the PJL Tech Ref Manual, PJL
>>"strings consist of any combination of characters from ASCII 32 through 255,
>>plus ASCII 9 (horizontal tab), excluding ASCII 34 (quotation marks)." There
>>is no localization information provided.
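
The rule quoted from the PJL manual can be written as a simple octet filter.
This Python sketch is based only on the sentence quoted above; the function
name is_valid_pjl_string_body is hypothetical, and this is not a PJL parser.

```python
def is_valid_pjl_string_body(data: bytes) -> bool:
    """Check the octets of a PJL string against the quoted rule.

    Allowed: codes 32 through 255, plus 9 (horizontal tab),
    excluding 34 (quotation mark).
    """
    for octet in data:
        allowed = (32 <= octet <= 255 or octet == 9) and octet != 34
        if not allowed:
            return False
    return True


if __name__ == "__main__":
    print(is_valid_pjl_string_body(b"Jos\xe9\tSmith"))   # 8-bit octet and tab
    print(is_valid_pjl_string_body(b'say "hi"'))         # contains code 34
```

Note that the check accepts octets 128 through 255 without saying which
coded character set they belong to, which is exactly the gap under
discussion.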
>>
>>Bob
>>
>>
>>----------
>>From: Harry Lewis[SMTP:harryl@us.ibm.com]
>>Sent: Wednesday, July 09, 1997 3:32 PM
>>To: hastings@cp10.es.xerox.com
>>Cc: jmk@underscore.com; bpenteco@boi.hp.com; rbergma@dpc.com
>>Subject: Localization in the Job MIB
>>
>>Tom, I think it was David Perkins who wrote (about the Job MIB)...
>>
>>>For simplicity, this specification assumes that the clients, job monitoring
>>>applications, servers, and devices are all running in the same locale.
>>>However, this specification allows them to run in any locale, including
>>>locales that use two-octet coded character sets, such as ISO 10646
>>>(Unicode). Job monitoring applications are expected to understand the coded
>>>character set of the client (and job), server, or device. No special means
>>>is provided for the monitor to discover the coded character set used by jobs
>>>or by the server or device. This specification does not contain an object
>>>that indicates what locale the server or device is running in, let alone
>>>contain an object to control what locale the agent is to use to represent
>>>coded character set objects.
>>
>>While I sympathize with the localization problem - (I think I'm beginning
>>to understand Ira's arguments as they pertain to the Printer MIB), I think
>>we have a real limitation in the Job MIB in that we are ultimately limited
>>by the Job Submission protocol or language. Bob can correct me if I'm wrong,
>>but if we take PJL as an example of a pervasive submission language, the
>>attributes passed in will be limited to ASCII characters 32 to 225 plus
>>"tab". I don't think there is a way to localize these strings or for the
>>agent to determine the locale - but I could be wrong.
>>
>>If we want to do something to accommodate submission protocols which *do*
>>facilitate localization of passed in attributes, we may entertain this,
>>but only if it allows for status-quo as a default.
>>
>>Harry Lewis - IBM Printing Systems