JMP> RE: Localization in the Job MIB [and question about PJL]

Ira Mcdonald x10962 imcdonal at eso.mc.xerox.com
Fri Jul 11 09:49:13 EDT 1997


Hi Bob and Harry,


Tom, thanks for clarifying the point I've been trying to make for the
last three months, to wit, "There is no such thing as 8-bit ASCII".


Harry, Xerox will (probably) use private MIB objects to determine the
full usable 'static locale' for the strict ASCII objects in the
Printer MIB.  Note that since the Printer MIB only says they shall
be in ASCII, they may NOT contain any character with the high-order
bit set (e.g., all ISO 8859-n character sets are precluded, because
all of their 'useful' non-ASCII characters are in the high end).
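
A minimal sketch of that high-order-bit test, in Python.  The function
name is_strict_ascii is illustrative only and does not come from the
Printer MIB; the sample strings are made up.

```python
def is_strict_ascii(value: bytes) -> bool:
    """Return True only if no octet has the high-order bit set."""
    return all(octet < 0x80 for octet in value)


# A plain ASCII value passes.
print(is_strict_ascii(b"Tray 1"))                          # True

# An ISO 8859-1 value with an accented letter (0xE4) does not.
print(is_strict_ascii("Beh\u00e4lter 1".encode("iso-8859-1")))  # False
```

Any ISO 8859-n string that actually uses its non-ASCII repertoire
fails this test, which is the sense in which those sets are precluded.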


Following the advice of RFC 2130, we expect to clarify (in private
MIB objects, which we'll publish, if I can convince enough of
the 'Intellectual Property' types around Xerox) that the TES
(Transfer Encoding Syntax) of these Printer MIB strings is
US-ASCII, that the CES (Character Encoding Scheme) is UTF-7
(the so-called 'mail-safe' transformation of UCS-2, in RFC 2152),
and that the underlying CCS (Coded Character Set) is UCS-2
(per ISO 10646).  In the cases where Unicode is not suitable,
we may have to use other underlying CCSs (and, of course,
different CESs as well).  If IANA registries provide enums for all
of these, then we'll use them - otherwise, we'll define our
own (private) enums.


My FujiXerox compatriots are not very happy with the PWG right
now, but we still have fully internationalized products to get
out the door, so we won't wait for Printer MIB v3 to adopt
a fix.


Scott Isaacson, be careful that your implementors do NOT put
either straight Unicode (UCS-2) or 8-bit UTF-8 into the
'prtChannelInformation' object as data.  As the Printer MIB v2
is written, that is strictly forbidden.  To use Unicode, you
must fold it into UTF-7 (pure US-ASCII output).
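
A rough sketch of that folding step, using the UTF-7 codec that ships
with Python.  The helper names fold_to_utf7 and unfold_from_utf7 and
the sample text are hypothetical; nothing here is taken from the
Printer MIB or from any Xerox implementation.

```python
def fold_to_utf7(text: str) -> bytes:
    """Encode Unicode text as UTF-7, which yields only 7-bit octets."""
    return text.encode("utf-7")


def unfold_from_utf7(data: bytes) -> str:
    """Recover the original Unicode text from its UTF-7 form."""
    return data.decode("utf-7")


sample = "Beh\u00e4lter"            # contains one non-ASCII letter

folded = fold_to_utf7(sample)
print(folded)                        # every octet is below 0x80
print(unfold_from_utf7(folded) == sample)

# For contrast, UTF-8 puts octets with the high-order bit set on the wire.
print(any(octet >= 0x80 for octet in sample.encode("utf-8")))
```

The receiver has to know that UTF-7 is in use in order to unfold the
string, which is why the encoding needs to be advertised somewhere.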


Feedback from any PWG members on localization issues is much
appreciated.


Cheers,
- Ira McDonald (outside consultant at Xerox)
  High North Inc
  PO Box 221
  Grand Marais, MI  49839
  906-494-2434


----------------------------- Tom's note ----------------------------
Return-Path: <jmp-owner at pwg.org>
Received: from zombi (zombi.eso.mc.xerox.com) by snorkel.eso.mc.xerox.com (4.1/XeroxClient-1.1)
	id AA11143; Thu, 10 Jul 97 23:52:29 EDT
Received: from alpha.xerox.com by zombi (4.1/SMI-4.1)
	id AA20968; Thu, 10 Jul 97 23:49:29 EDT
Received: from lists.underscore.com ([199.125.85.31]) by alpha.xerox.com with SMTP id <54246(6)>; Thu, 10 Jul 1997 20:49:18 PDT
Received: from localhost (daemon at localhost) by lists.underscore.com (8.7.5/8.7.3) with SMTP id UAA18594 for <imcdonal at eso.mc.xerox.com>; Thu, 10 Jul 1997 20:27:04 -0400 (EDT)
Received: by pwg.org (bulk_mailer v1.5); Thu, 10 Jul 1997 20:26:11 -0400
Received: (from daemon at localhost) by lists.underscore.com (8.7.5/8.7.3) id UAA18485 for jmp-outgoing; Thu, 10 Jul 1997 20:25:41 -0400 (EDT)
Message-Id: <9707110020.AA05250 at zazen.cp10.es.xerox.com>
X-Sender: hastings at zazen
X-Mailer: Windows Eudora Pro Version 2.1.2
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 10 Jul 1997 17:17:49 PDT
To: Bob Pentecost <bpenteco at boi.hp.com>
From: Tom Hastings <hastings at cp10.es.xerox.com>
Subject: JMP> RE: Localization in the Job MIB [and question about PJL]
Cc: jmp at pwg.org
Sender: jmp-owner at pwg.org
Status: R


A quick lesson in coded character sets (I was the chairman of the ANSI
X3L2 committee responsible for ASCII, and worked on ISO 8859 and ISO 10646):


1. The coded character set called ASCII is defined to be the control
characters 0 through 31 and the printing characters 32 (SPACE) through
126.  127 is DEL (a control) and 128 to 255 are NOT part of the ASCII
coded character set.  So ASCII is really a 7-bit set that is usually
embedded in an 8-bit transmission, so that the high order bit SHALL be 0.


Unfortunately, people use the term "ASCII" to mean any coded character set,
as the PJL manual does. (Don't feel bad, you are not alone).


2. ISO Latin-1 (ISO 8859-1) is one of the 8-bit coded character sets, which
define printing characters only: 32 to 126 (same as ASCII fortunately),
and 160 to 255.  The characters in 160-255 are accented letters and
special symbols, such as pound and yen, more quotes, etc.  There are
at least 11 8-bit coded character sets defined by ISO 8859-n.
The Windows default character set is ISO Latin-1 plus additional
characters that Microsoft filled in in the unspecified space: 128-159.
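
The ranges in points 1 and 2 can be written down as a small lookup.
This is only a sketch in Python; the function name classify_code and
the wording of the labels are made up for illustration.

```python
def classify_code(code: int) -> str:
    """Classify an 8-bit code value using the ranges given above."""
    if not 0 <= code <= 255:
        raise ValueError("code must be in the range 0..255")
    if code <= 31:
        return "ASCII control"
    if code <= 126:
        return "ASCII printing"
    if code == 127:
        return "ASCII control (DEL)"
    if code <= 159:
        # Not ASCII, and ISO 8859-1 defines no printing character here.
        return "not ASCII; unspecified in ISO 8859-1"
    return "not ASCII; ISO 8859-1 printing"


for code in (9, 32, 126, 127, 147, 233):
    print(code, classify_code(code))
```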


HP has an 8-bit coded character set that is similar to ISO Latin-1.
(I've forgotten the name).


Presumably, if a PJL printer gets an attribute with the 8th bit set,
and that is the value of an attribute that the Printer prints on
the banner page, such as the user's name, then some coded character
set is being used for the codes in the range 128-255.


So what does PJL do with characters greater than 127?  
Is it ISO Latin-1?
Is it the HP 8-bit set?
Is it the Windows default 8-bit set?
Unspecified so it can be any?
The administrator can set the default set locally for the printer?
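
To see why the answer matters, here is a sketch in Python that decodes
the same octets under three candidate sets.  It assumes that the HP set
mentioned above is HP Roman-8 (Python codec "hp_roman8"), which this
message does not confirm, and that "cp1252" stands in for the Windows
default set.  The function name and sample octets are illustrative.

```python
from typing import Dict

# Candidate coded character sets for octets in the range 128-255.
CANDIDATE_CHARSETS = ("iso-8859-1", "cp1252", "hp_roman8")


def interpretations(octets: bytes) -> Dict[str, str]:
    """Decode the same octets under each candidate set.

    Octets a set leaves undefined come back as U+FFFD.
    """
    return {
        name: octets.decode(name, errors="replace")
        for name in CANDIDATE_CHARSETS
    }


# "Jos" followed by octet 0xE9, as might arrive in a user-name attribute.
for name, text in interpretations(b"Jos\xe9").items():
    print(name, repr(text))

# Octet 0x93 lies in 128-159, where Latin-1 and the Windows set differ.
for name, text in interpretations(b"\x93").items():
    print(name, repr(text))
```

Without a statement of which set applies, a banner page can show a
different user name than the one that was submitted.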


Thanks,
Tom




At 15:33 07/09/97 PDT, Bob Pentecost wrote:
>Harry,
>
>You are very close to being correct. To quote the PJL Tech Ref Manual, PJL
>"strings consist of any combination of characters from ASCII 32 through 255,
>plus ASCII 9 (horizontal tab), excluding ASCII 34 (quotation marks)." There
>is no localization information provided.
>
>Bob
>
>
>----------
>From:  Harry Lewis[SMTP:harryl at us.ibm.com]
>Sent:  Wednesday, July 09, 1997 3:32 PM
>To:  hastings at cp10.es.xerox.com
>Cc:  jmk at underscore.com; bpenteco at boi.hp.com; rbergma at dpc.com
>Subject:  Localization in the Job MIB
>
>Tom, I think it was David Perkins who wrote (about the Job MIB)...
>
>>For simplicity, this specification assumes that the clients, job monitoring
>>applications, servers, and devices are all running in the same locale.
>>However, this specification allows them to run in any locale, including
>>locales that use two-octet coded character sets, such as ISO 10646
>(Unicode).  Job monitoring applications are expected to understand the coded
>>character set of the client (and job), server, or device.  No special means
>>is provided for the monitor to discover the coded character set used by jobs
>>or by the server or device.  This specification does not contain an object
>>that indicates what locale the server or device is running in, let alone
>>contain an object to control what locale the agent is to use to represent
>>coded character set objects.
>
>While I sympathize with the localization problem - (I think I'm beginning
>to understand Ira's arguments as they pertain to the Printer MIB), I think
>we have a real limitation in the Job MIB in that we are ultimately limited
>by the Job Submission protocol or language. Bob can correct me if I'm wrong,
>but if we take PJL as an example of a pervasive submission language the
>attributes passed in will be limited to ASCII characters 32 to 225 plus
>"tab". I don't think there is a way to localize these strings or for the
>agent to determine the locale - but I could be wrong.
>
>If we want to do something to accommodate submission protocols which *do*
>facilitate localization of passed in attributes, we may entertain this,
>but only if it allows for status-quo as a default.
>
>Harry Lewis - IBM Printing Systems
>
>
>
>
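
The PJL string rule that Bob quotes above (codes 32 through 255, plus
9, excluding 34) can be checked mechanically.  A minimal sketch in
Python follows; the function name is_valid_pjl_string is hypothetical,
and the check covers only the character rule as quoted, not any other
part of PJL syntax.

```python
def is_valid_pjl_string(value: bytes) -> bool:
    """Check octets against the quoted rule: 32-255 plus 9, but never 34."""
    return all(
        (32 <= octet <= 255 or octet == 9) and octet != 34
        for octet in value
    )


print(is_valid_pjl_string(b"Bob\tPentecost"))   # True: tab is allowed
print(is_valid_pjl_string(b'say "hi"'))         # False: quotation mark
print(is_valid_pjl_string(b"Jos\xe9"))          # True: 0xE9 is in 32-255
```

Note that the rule accepts octets 128 through 255 without saying which
coded character set they belong to, which is the gap discussed above.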


