PMP Mail Archive: Re: PMP> ISSUE: OCTET STRING MUST be US-ASCII in 0-127; allow

Tom Hastings (hastings@cp10.es.xerox.com)
Wed, 16 Jul 1997 22:24:11 PDT

David,

Looks like I stepped into a minefield when including prtChannelInformation
in the proposal to allow other characters in code positions 128-255
(by changing its SYNTAX from DisplayString to OCTET STRING).

Taking the other parts of the proposal first (which you didn't disagree with):

1. The clarification of OCTET STRING itself:

My real objective in this clarification of the meaning of the OCTET STRING
data type was to make the Printer MIB draft spec agree with actual
implementations of RFC 1759 that use an 8-bit set, like ISO Latin1
or HP Roman8, for the large number of objects that are specified
as OCTET STRING in RFC 1759 and are NOT the subject of localization
by our prtGeneralCurrentLocalization object. I was also trying to legitimize
implementations of these objects that mix Japanese one-byte (US-ASCII)
and two-byte (Kanji = JIS X0208) characters.

I was also trying to preserve the main benefits of the localization proposals
that we were considering over the last few months, WITHOUT adding any objects,
since we agreed at the last minute not to add those objects. The proposal just
adds clarity, removes ambiguity from the spec, and makes the spec agree with
what has already been implemented for those objects that are currently
specified as OCTET STRING.

Sounds like you don't have a problem with that part of the proposal
for clarifying the meaning of the objects that are currently of
SYNTAX: OCTET STRING (which was the main part of the proposal). Correct?
I should have separated my comments into separate e-mail messages more.

2. On to the part that you do have a problem with:
The part that you are disagreeing with is changing the syntax of the
prtChannelInformation object from DisplayString (which is NVT ASCII)
to OCTET STRING, so that it would be subject to my proposed more restrictive
set of control characters (only CR, LF, and maybe HT), but could also use
code positions 128-255. (Codes 32-126 must be US-ASCII both for DisplayString
and under my clarification of OCTET STRING for the Printer MIB.)
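
To make the proposed rule concrete, here is a minimal Python sketch of the
octet check described above. It is an illustration only, not text from the
draft: the function name `octet_string_violations` and the `allow_ht` flag
are hypothetical, and HT is optional because the proposal only says
"maybe HT". The check looks at code positions only; it cannot tell whether
octets in 32-126 are really used as US-ASCII, or which 8-bit set the octets
in 128-255 belong to.

```python
from typing import List

CR, LF, HT, DEL = 0x0D, 0x0A, 0x09, 0x7F

def octet_string_violations(data: bytes, allow_ht: bool = True) -> List[int]:
    """Return offsets of octets that the proposed clarification would forbid.

    Assumed rule (from the proposal, not from any published spec):
      0-31 and 127 : forbidden, except CR, LF and (optionally) HT
      32-126       : allowed (expected to be US-ASCII)
      128-255      : allowed (Latin1, Roman8, JIS X0208 bytes, ...)
    """
    allowed_controls = {CR, LF}
    if allow_ht:
        allowed_controls.add(HT)

    bad_offsets = []
    for offset, octet in enumerate(data):
        is_control = octet < 32 or octet == DEL
        if is_control and octet not in allowed_controls:
            bad_offsets.append(offset)
    return bad_offsets

# A Latin1 string with an accented letter passes; NUL and DEL do not.
print(octet_string_violations("B\u00fcro\r\n".encode("latin-1")))
print(octet_string_violations(b"a\x00b\x7f"))
```

Under DisplayString the first string would be rejected because of the octet
above 127, which is the difference the SYNTAX change is about.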

If we want to restrict prtChannelInformation to NVT ASCII, but clarify
that UTF-7 is a way to cram ISO 10646/Unicode into codes 32-126, that may work
for Novell's NDPS, but let's hear from them. Scott?
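
As a rough illustration of why UTF-7 fits inside DisplayString while UTF-8
does not, the following Python sketch encodes a made-up printer name both
ways using Python's built-in codecs. The name "Büro-Drucker" and the helper
`fits_nvt_graphic` are hypothetical, chosen only for this example; nothing
here says how NDPS actually encodes names.

```python
def fits_nvt_graphic(data: bytes) -> bool:
    """True if every octet is in the NVT ASCII graphic range 32-126."""
    return all(32 <= octet <= 126 for octet in data)

# Hypothetical printer name containing one non-ASCII letter (u-umlaut).
name = "B\u00fcro-Drucker"

as_utf7 = name.encode("utf-7")   # 7-bit encoding of Unicode (RFC 2152)
as_utf8 = name.encode("utf-8")   # 8-bit encoding; uses octets above 127

print(as_utf7, fits_nvt_graphic(as_utf7))
print(as_utf8, fits_nvt_graphic(as_utf8))
```

The UTF-7 form stays within codes 32-126 for this name, at the cost of being
unreadable to anyone looking at the raw value; the UTF-8 form needs octets
in 128-255 and so would require OCTET STRING.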

Bottom line: ok with me to keep prtChannelInformation as NVT ASCII.
My Oct 1996 mail did recommend DisplayString (NVT ASCII), because I
was thinking keywords. Same as in IPP where we restrict keywords to
ASCII. What gets tricky is when the values of an attribute are text, not
just keywords. For text, restricting to just ASCII is hard for non-English
speakers.

3. But why do we want to restrict our new object, prtGeneralPrinterName,
to NVT ASCII? Can't system administrators define printer names in other
character sets (as long as ASCII is in 32-127)? I would suspect that
a printer that is using ISO Latin 1 or HP Roman 8 or JIS X0208 would allow the
accented letters to be part of printer names in Europe or Asia, wouldn't it?

So I see no reason for prtGeneralPrinterName to be restricted
to NVT ASCII. Do you? Ok if we change prtGeneralPrinterName from
DisplayString to OCTET STRING?

Bottom line summary of my proposal and my understanding of your reaction:

1. Clarify OCTET STRING to be more restrictive in codes 0-31 and 127 than
NVT ASCII and more permissive in code positions 128 to 255. OK with you.

2. Change SYNTAX of prtChannelInformation from DisplayString (NVT ASCII)
to OCTET STRING. NOT OK with you.

3. Change SYNTAX of prtGeneralPrinterName from DisplayString (NVT ASCII)
to OCTET STRING. OK with you?

Tom

At 11:31 07/16/97 PDT, David_Kellerman@nls.com wrote:
>Ooooh! Mess with other parts of the MIB with past-the-last-minute,
>oh-no-stop-the-presses changes. But start poking at
>prtChannelInformation, and them's fighting words.
>
>> 2. Two existing Printer MIB v2 objects, 'prtGeneralPrinterName' and
>> 'prtChannelInformation' are INCORRECTLY given a SYNTAX of 'DisplayString'
>> which forces NVT ASCII only (code positions 128 to 255 SHALL not be used)
>> instead of 'OCTET STRING' which would give the same capabilities for other
>> sets with US-ASCII as a subset as in 1 above.
>
>INCORRECTLY my fiduciary. This was extensively discussed in meetings
>and on the mailing list. We knew we were forcing NVT ASCII when we
>made this decision -- you were there, too. The use of DisplayString was
>intended to indicate this choice. Here's an excerpt from one of your
>e-mail messages on the subject:
>
> > Date: Tue, 29 Oct 1996 18:25:16 PST
> > To: pwg@pwg.org
> > From: Tom Hastings <hastings@cp10.es.xerox.com>
> ...
> > 1. Want DisplayString as data type, or at least specify in the
> > DESCRIPTION, so that the data must be the restricted form of ASCII
> > (NVT ASCII? was is the RFC for it?)
>
>> 3. On page 42, in 'PrtChannelInformationTC', in the 8 July 1997 draft of
>> the Printer MIB v2, we find:
>>
>> -- Keyword: NDSPrinter
>> -- Syntax: Text (Unicode)
>>
>> With a syntax of 'DisplayString' (as currently), this would FORCE the
>> use of UTF-7 (defined in RFC 2152) and would PRECLUDE the use of UTF-8
>> encoding (defined in ISO 10646 and summarized in RFC 2044), to convey
>> this Unicode name. Probably need to change the above from "(Unicode)"
>> to "UTF-8" as well.
>
>We need to specify UTF-7 as the recommended encoding for Unicode. (The
>prtChannelInformation specification says that different channel types
>must specify how data is encoded if it is not naturally NVT ASCII, and,
>in its current form, it gives the UTF-8 encoding of Unicode as an
>example.)
>
>Here is an excerpt from one of my e-mail messages that relates to this:
>
> > Date: Sat, 02 Nov 1996 16:12:03 PST
> > From: David_Kellerman@nls.com
> > To: pwg@pwg.org
> ...
> > Question: UTF-8 apparently uses eight-bit codes, which causes problems
> > if we're saying data values are coded with only NVT ASCII graphic
> > characters (32-126). There's an RFC 1642 that describes UTF-7, a
> > seven-bit representation of Unicode, but I have no idea of whether it is
> > generally accepted. Suggestions from anyone?
>
>As far as I can tell from the e-mail archives, this question went
>unanswered. I ended up leaving a reference to UTF-8 in the
>prtChannelInformation description. This looks like it was a mistake.
>
>:: David Kellerman Northlake Software 503-228-3383
>:: david_kellerman@nls.com Portland, Oregon fax 503-228-5662
>
>