PMP Mail Archive: Re: PMP> URGENT: Need consensus by noon PDT today on definition of OCTET STRING to allow superse

Re: PMP> URGENT: Need consensus by noon PDT today on definition of OCTET STRING to allow superse

Ira McDonald x10962
Tue, 22 Jul 1997 09:32:44 PDT

Hi Tom,

I agree completely with your proposal. Abandoning existing implementations
to non-conformance (retroactively) because there was no rigorous
reference for 'ASCII' is just foolish. Thanks to Bob Pentecost for
getting the HP5si information back to Tom quickly.

Scott Isaacson, you've got a serious stake in this one, if you're
listening. As the Printer MIB is now written, you will NOT be
putting Unicode (in ANY form) in the 'prtChannelInformation'
object for Netware NDS PServer and RPrinter devices.

Cheers,
- Ira McDonald (outside consultant at Xerox)
High North Inc
PO Box 221
Grand Marais, MI 49839

--------------------------------- Tom's note ------------------------------
Return-Path: <pmp-owner@pwg.org>
Received: from zombi (zombi.eso.mc.xerox.com) by snorkel.eso.mc.xerox.com (4.1/XeroxClient-1.1)
id AA14618; Tue, 22 Jul 97 12:01:45 EDT
Received: from alpha.xerox.com by zombi (4.1/SMI-4.1)
id AA26023; Tue, 22 Jul 97 11:58:25 EDT
Received: from lists.underscore.com ([199.125.85.31]) by alpha.xerox.com with SMTP id <52971(2)>; Tue, 22 Jul 1997 08:58:32 PDT
Received: from localhost (daemon@localhost) by lists.underscore.com (8.7.5/8.7.3) with SMTP id LAA26299 for <imcdonal@eso.mc.xerox.com>; Tue, 22 Jul 1997 11:54:35 -0400 (EDT)
Received: by pwg.org (bulk_mailer v1.5); Tue, 22 Jul 1997 11:52:38 -0400
Received: (from daemon@localhost) by lists.underscore.com (8.7.5/8.7.3) id LAA26176 for pmp-outgoing; Tue, 22 Jul 1997 11:51:02 -0400 (EDT)
Message-Id: <9707221551.AA08872@zazen.cp10.es.xerox.com>
X-Sender: hastings@zazen
X-Mailer: Windows Eudora Pro Version 2.1.2
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Priority: 1 (Highest)
Date: Tue, 22 Jul 1997 08:48:44 PDT
To: pmp@pwg.org
From: Tom Hastings <hastings@cp10.es.xerox.com>
Subject: PMP> URGENT: Need consensus by noon PDT today on definition of
OCTET STRING to allow superset of ASCII
Sender: pmp-owner@pwg.org
Status: R

I should have indicated the urgency of this in the subject line yesterday.

Please respond by today, either way, by noon PDT.

Tom

>Return-Path: <pmp-owner@pwg.org>
>X-Sender: hastings@zazen
>Date: Mon, 21 Jul 1997 18:48:35 PDT
>To: pmp@pwg.org
>From: Tom Hastings <hastings@cp10.es.xerox.com>
>Subject: FWD: PMP> Revised proposal on definition of OCTET STRING to
> allow superset of ASCII
>Sender: pmp-owner@pwg.org
>
>I just talked with Chris and she would like to have the PMP indicate
>by e-mail by noon PDT, Tuesday, 7/22, whether to make the change that
>I propose or not (attached).
>
>Could you please check with your implementations of the Printer MIB
>to see if they restrict the READ-WRITE objects to US-ASCII, i.e.,
>do they lop off the 8th bit or not? Could you also check to see if
>the read-only objects could have additional characters? We already
>have evidence about the practice of the HP 5si (see below). It would
>help to see what other practice exists. In other words,
>my proposal is just to document what implementations are really doing
>(which is my understanding of the IETF process of going from proposed to
>draft).
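
A minimal sketch of the check requested above, for illustration only. The helper names `strip_eighth_bit` and `preserves_eighth_bit` are hypothetical and not part of any MIB or SNMP library; the idea is to write a value containing an octet above 127, read it back, and compare.

```python
def strip_eighth_bit(octets: bytes) -> bytes:
    """Clear the high bit of every octet, as a 7-bit-only agent might do."""
    return bytes(b & 0x7F for b in octets)


def preserves_eighth_bit(written: bytes, read_back: bytes) -> bool:
    """True if the value read back is identical to the value written."""
    return written == read_back


# Hypothetical test value: "Cafe" with an e-acute (0xE9 in ISO 8859-1).
sample = "Caf\u00e9".encode("latin-1")

# Simulate an agent that lops off the 8th bit: 0xE9 becomes 0x69 ("i").
mangled = strip_eighth_bit(sample)
print(mangled)                                # b'Cafi'
print(preserves_eighth_bit(sample, mangled))  # False
```

In a real test, `read_back` would come from an SNMP Get following a Set on a read-write object; that exchange is omitted here.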
>
>Also allowing the Printer MIB to use UTF-8 allows an implementor to follow
>the recommendation of the IETF IAB in RFC 2130 that UTF-8 be the default
>character set. Leaving the MIB as it is forbids a conforming implementation
>from following the recommendation to use UTF-8 at all.
>
>Thanks,
>Tom
>
>>Return-Path: <pmp-owner@pwg.org>
>>X-Sender: hastings@zazen
>>Date: Mon, 21 Jul 1997 16:40:53 PDT
>>To: pmp@pwg.org
>>From: Tom Hastings <hastings@cp10.es.xerox.com>
>>Subject: PMP> Revised proposal on definition of OCTET STRING to allow
>> superset of ASCII
>>Sender: pmp-owner@pwg.org
>>
>>I have heard no objections to the main thrust of my suggestion
>>to allow additional characters in code positions 128-255
>>for objects of syntax OCTET STRING, as long as code positions
>>32-126 remained US-ASCII. The discussion has been about
>>prtChannelInformation (which I have removed from this proposal).
>>There have been no objections to changing the new object
>>prtGeneralPrinterName from DisplayString to OCTET STRING either.
>>
>>I assume that silence means acceptance on the main thrust
>>of the proposal????
>>
>>However, to be clear, I've simplified the proposal and removed
>>any mention of prtChannelInformation and am re-circulating it.
>>
>>I've talked with David Kellerman. As a result I have
>>modified my proposal to avoid mentioning prtChannelInformation (fixing
>>that description will be a separate issue). I have also changed
>>the proposal so that any object of type OCTET STRING SHALL use no control
>>codes, unless specifically specified in the DESCRIPTION (this should
>>cover prtChannelInformation, prtGeneralCurrentOperator, and
>>prtGeneralServicePerson which talk about LF).
>>
>>I have also talked with Bob Pentecost. The HP 5si allows 8-bit data to
>>be written into read-write OCTET STRING objects. We tried
>>prtGeneralCurrentOperator and it accepted 8-bit Windows characters
>>and a Windows SNMPC application correctly displayed them. Furthermore,
>>for the read-write prtInputMediaName object, the 5si will only accept
>>values that have previously been set by the 5si private MIB using 8-bit
>>characters.
>>
>>So we have significant implementation practice of RFC 1759 that is *not*
>>limiting OCTET STRING to US-ASCII (7-bits, code positions 32-126) as specified
>>on page 14, top paragraph. So we need to fix page 14 and add a REFERENCE
>>section.
>>
>>
>>
>>Briefly, the problems with the current Printer MIB draft are:
>>
>>1. There are many objects of type OCTET STRING that are restricted to ASCII.
>>But ASCII is not a clearly defined term and existing practice is in
>>conflict with the most likely of the interpretations. Existing practice
>>is to use US-ASCII (ANSI X3.4) in code positions 0-127 and some other
>>coded character set in code positions 128-255. In other words, current
>>practice is to use 8-bit coded character sets in which code positions
>>0 to 127 are US-ASCII. Examples of such sets are: ISO Latin 1, HP Roman 8,
>>UTF-8, JIS X0208-1990 Japanese two byte set in 128-255 with US-ASCII in
>>0-127, GB 2312-1980 Chinese two-byte set in 128-255 with US-ASCII in 0-127.
>>
>>
>>2. One of the new Printer MIB v2 objects, 'prtGeneralPrinterName', has
>>been given a SYNTAX of 'DisplayString', which forces NVT ASCII only
>>(code positions 128 to 255 SHALL NOT be used), instead of 'OCTET STRING',
>>which would give the same capabilities for other sets with US-ASCII as a
>>subset as in 1 above.
>>
>>
>>3. There isn't a proper Bibliography section to refer to other standards
>>that are needed in order to understand references to terms, such as "ASCII",
>>"NVT ASCII", "Unicode", UTF-8, etc.
>>
>>
>>
>>
>>Explanation of the problems with suggested solutions and text.
>>
>>1. There is a serious ambiguity in the 02 Printer MIB draft about the many
>>objects of syntax OCTET STRING that are indicated as not being localized.
>>Page 14 describes them:
>>
>> Localization is only performed on those strings in the MIB that
>> are explicitly marked as being localized. All other character
>> strings are returned in ASCII.
>>
>>There is no reference to what is meant by "ASCII".
>>
>>The number of different interpretations of this includes:
>>
>>a. ANSI X3.4, the ANSI standard in positions 0 to 127, 128 to 255 SHALL NOT be
>>used.
>>
>>b. NVT ASCII (RFC 854) in positions 0 to 127, 128 to 255 SHALL NOT be used.
>>NVT ASCII includes the following controls for virtual terminals: NUL (0),
>>LF (10), CR (13), BEL (7), BS (8), HT (9), VT (11), FF (12).
>>
>>c. Some think that it is any coded character set in which ASCII is in the
>>left hand side, i.e., values 0 to 127 decimal and any other one or two octet
>>coded character set is from values 128 to 255, such as ISO 8859-1 (ISO
>>Latin-1), the Windows default set, HP Roman8, any of the eleven ISO 8859-n
>>sets, UTF-8, JIS X0208, GB2312, etc.
>>
>>d. And some think it means any coded character set at all, including Unicode,
>>any national 7-bit set, so that ASCII doesn't even have to be in positions
>>0 to 127.
>>
>>
>>
>>Suggested solution:
>>
>>1. I propose that we clarify the Printer MIB to be interpretation c.
>>I believe that that will also correspond to actual practice of implementing
>>RFC 1759. For example, any of the ISO 8859-n sets (Latin 1, etc.) meets these
>>criteria. HP's Roman-8 also meets these criteria, as does the Windows
>>default 8-bit character set. For Asian markets, they may use either UTF-8,
>>which is a transformation of ISO 10646 (Unicode) that meets these criteria,
>>or they may use US ASCII in code points 0 to 127 and their national two byte
>>coded character sets in code points 128 to 255 according to the code structure
>>of ISO 2022 for 8 bit environments.
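
As an illustration of interpretation c, the following sketch checks whether an encoding keeps US-ASCII in code positions 0-127 and places every other character only in octets 128-255. The function name `ascii_compatible` is hypothetical, and the codec names are the ones shipped with Python's standard library ("euc_jp" is assumed here to stand in for US-ASCII plus JIS X0208 in the 8-bit ISO 2022 structure).

```python
def ascii_compatible(text: str, encoding: str) -> bool:
    """Check one string against interpretation c for the given encoding.

    ASCII characters must encode to a single octet equal to their code
    position; all other characters must use only octets 128-255.
    """
    for ch in text:
        encoded = ch.encode(encoding)
        if ord(ch) < 128:
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 128 for b in encoded):
            return False
    return True


sample = "Tray \u00e9"  # ASCII text plus one Latin-1 character (e-acute)

print(ascii_compatible(sample, "latin-1"))    # True
print(ascii_compatible(sample, "utf-8"))      # True
print(ascii_compatible(sample, "utf-16-be"))  # False: two octets per character
print(ascii_compatible(sample, "cp037"))      # False: EBCDIC
```

The check only exercises the characters in the sample string, so it can show that an encoding fails the criteria but cannot prove that it meets them for every character.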
>>
>>So replace the second sentence of the paragraph on page 14:
>>
>> All other character strings are returned in ASCII.
>>
>>with:
>>
>> The agent SHALL return all other character strings as coded
>> character sets in which code positions 0-127 (decimal) are
>> US-ASCII [US-ASCII] and the remaining values, 128-255, may be any other
>> coded character set, including multi-byte sets according to ISO 2022
>> [ISO 2022] in 8-bit environments. Examples of
>> coded character sets which meet these criteria are: US-ASCII,
>> ISO 646:1991 IRV [ISO 646], ISO 8859-1 (Latin-1) [ISO 8859],
>> any ISO 8859-n, HP Roman8, Windows Default 8-bit set, UTF-8 [UTF-8],
>> US-ASCII plus JIS X0208-1990 Japanese [JIS X0208], GB2312-1980 Chinese
>> [GB2312].
>>
>> Examples of coded character sets which do not meet these criteria are:
>> national 7-bit sets (except US ASCII), EBCDIC, and ISO 10646 (Unicode)
>> [ISO 10646]. In order to represent Unicode characters, use UTF-8.
>>
>> Control codes (code positions 0-31 and 127) SHALL NOT be used unless
>> specifically specified in the DESCRIPTION of the object.
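
The control-code rule in the proposed text can be sketched as a simple octet scan. This is only an illustration under stated assumptions: the function name `control_code_violations` and the `allowed_controls` parameter are hypothetical, with `allowed_controls` standing in for whatever controls an object's DESCRIPTION explicitly permits (for example LF, value 10).

```python
def control_code_violations(octets: bytes,
                            allowed_controls: frozenset = frozenset()) -> list:
    """Return (offset, value) pairs for control codes that are not permitted.

    Control codes are code positions 0-31 and 127. Octets 128-255 are
    never flagged, since they may belong to any other coded character set.
    """
    return [(i, b) for i, b in enumerate(octets)
            if (b < 32 or b == 127) and b not in allowed_controls]


# No controls allowed by default: the LF at offset 5 is reported.
print(control_code_violations(b"line1\nline2"))                   # [(5, 10)]

# An object whose DESCRIPTION permits LF accepts the same value.
print(control_code_violations(b"line1\nline2", frozenset({10})))  # []
```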
>>
>>
>>2. Change the syntax of the MIB object: 'prtGeneralPrinterName'
>> from 'DisplayString' which is restricted to US-ASCII to OCTET STRING,
>> so that other sets may be used in code positions 128 to 255 and so that
>> the restricted set of controls will be specified.
>>
>>
>>3. Add a proper Bibliography section so that the above references
>>can be made. I found a proper reference to US-ASCII in RFC 2044
>>(UTF-8) as:
>>
>> [US-ASCII] Coded Character Set--7-bit American Standard Code for
>> Information Interchange, ANSI X3.4-1986.
>>
>>So it is ok to refer to ANSI standards from IETF standards.
>>
>>
>>
>>So I propose that the Bibliography section be:
>>
>> [US-ASCII] Coded Character Set - 7-bit American Standard Code for
>> Information Interchange, ANSI X3.4-1986.
>>
>> [ISO 646] ISO 646:1991, "Information technology - ISO 7-bit coded
>> character set for information interchange".
>>
>> [ISO 8859] ISO 8859-1:1987, "Information technology - 8-bit single
>> byte coded graphic character sets -
>> Part 1: Latin alphabet No. 1"
>>
>> [ISO 2022] ISO 2022:1994 - "Information technology - Character code
>> structure and extension techniques"
>>
>> [ISO 10646] ISO 10646-1:1993, "Information technology - Universal
>> Multiple-Octet Coded Character Set (UCS) - Part 1:
>> Architecture and Basic Multilingual Plane"
>>
>> [UTF-7] Goldsmith, D., and M. Davis, "UTF-7", RFC1642, Taligent,
>> Inc., July 1994.
>>
>> [UTF-8] F. Yergeau, "UTF-8, a transformation format of Unicode
>> and ISO 10646", RFC 2044, October 1996.
>>
>> [NVT ASCII] J. Postel, J. Reynolds, "TELNET PROTOCOL SPECIFICATION",
>> RFC 854, May 1983.
>>
>> [JIS X0208] JIS X0208-1990, "Japanese two byte coded character set."
>>
>> [GB2312] GB 2312-1980, "Chinese People's Republic of China (PRC)
>> mixed one byte and two byte coded character set"
>>
>>
>>
>>
>>
>>For reference:
>>I've extracted all objects of type OCTET STRING from the draft 02.
>>I've put "localized" in front of the ones whose DESCRIPTIONs say they are
>>localized according to prtGeneralCurrentLocalization and "console
>>localization" in front of the ones whose DESCRIPTIONs say they are localized
>>by prtConsoleLocalization:
>>
>> prtGeneralCurrentOperator OCTET STRING,
>> prtGeneralServicePerson OCTET STRING,
>> prtGeneralSerialNumber OCTET STRING,
>>localized prtCoverDescription OCTET STRING,
>> prtCoverDescription OCTET STRING,
>> prtLocalizationLanguage OCTET STRING,
>> prtLocalizationCountry OCTET STRING,
>> prtInputMediaName OCTET STRING,
>> prtInputName OCTET STRING,
>> prtInputVendorName OCTET STRING,
>> prtInputModel OCTET STRING,
>> prtInputVersion OCTET STRING,
>> prtInputSerialNumber OCTET STRING,
>>localized prtInputDescription OCTET STRING,
>> prtInputMediaType OCTET STRING,
>> prtInputMediaColor OCTET STRING,
>> prtOutputName OCTET STRING,
>> prtOutputVendorName OCTET STRING,
>> prtOutputModel OCTET STRING,
>> prtOutputVersion OCTET STRING,
>> prtOutputSerialNumber OCTET STRING,
>>localized prtOutputDescription OCTET STRING,
>>localized prtMarkerSuppliesDescription OCTET STRING,
>> prtMarkerColorantValue OCTET STRING,
>>localized prtMediaPathDescription OCTET STRING,
>> prtChannelProtocolVersion OCTET STRING,
>> prtInterpreterLangLevel OCTET STRING,
>> prtInterpreterLangVersion OCTET STRING,
>>localized prtInterpreterDescription OCTET STRING,
>> prtInterpreterVersion OCTET STRING,
>>console localization prtConsoleDisplayBufferText OCTET STRING
>>console localization prtConsoleDescription OCTET STRING
>>localized prtAlertDescription OCTET STRING,
>>
>>
>>We want to add to the above list:
>>
>> prtGeneralPrinterName OCTET STRING
>> prtChannelInformation OCTET STRING
>>
>>
>>
>>
>
>
>