PMP Mail Archive: Re: PMP> Revised proposal on definition of OCTET STRING ...

Chris Wellens (chrisw@iwl.com)
Tue, 22 Jul 1997 10:56:28 -0700 (PDT)

Here is my second message (sorry these got out of order).

Here is an email exchange between Randy Presuhn and me in which Randy
explains in some detail the use of UTF-8 in the sysappl MIB and the
plan for SNMPv3. The reason I include this is to provide the technical
background on the reasons/advantages for this approach.

Tom Hastings' proposal, by clarifying the definitions of what is ASCII,
allows most implementations to remain unchanged, has minimal impact on
the MIB itself, but would allow other implementations to take advantage
of the things that Randy describes below.

>From rpresuhn@peer.com Thu Jul 10 17:49:12 1997
Date: Wed, 2 Jul 1997 12:13:48 -0700
From: Randy Presuhn <rpresuhn@peer.com>
To: chrisw@iwl.com
Cc: cheryl@empiretech.com
Subject: Re: Use of utf8string in sysappl mib

Hi Chris -

> Date: Wed, 2 Jul 1997 11:30:57 -0700 (PDT)
> From: Chris Wellens <chrisw@iwl.com>
> Reply-To: Chris Wellens <chrisw@iwl.com>
> To: cheryl@empiretech.com, rpresuhn@peer.com
> Subject: Use of utf8string in sysappl mib
> Message-Id: <Pine.SUN.3.93.970702111859.12472A-100000@iwl.iwl.com>
>
> Hi Cheryl and Randy,
>
> We are moments away from finishing the Printer MIB, and have one
> last difficult issue -- localization.

Cool.

> We were trying to cram this in as a last minute change, and were
> planning to modify about a dozen octet strings to make them TCs. Then
> in talking to Dave Perkins and others, it was suggested that there
> should be a more general approach that would be useful for all SNMP
> MIBs and not unique or specific to the printer MIB. I also
> noticed that RFC2130 on this subject gives examples of how
> current IETF protocols should be changed to support
> localization; however, there is no discussion of SNMP.
>
> I noticed that in the sysAppl mib you are using Utf8String and
> LongUtf8String.
>
> So, I was hoping that you had perhaps solved this problem for us
> and we could cut and paste from your work :-). However, was your
> plan to do this as phase 1, and then address the other
> protocol-oriented issues later?

I'd suggest cloning or referencing the TCs we did.

UTF-8 support is pretty painless at the agent side: bits is bits.
Likewise, on the manager side, a user interface can display and
handle input for whatever it happens to support.

This is actually not as bad as the problem of NVT ASCII with the high
bit set, since at least with UTF-8 there is no question of what a given
code point is supposed to represent.

As a practical matter, regardless of whether NVT ASCII or UTF-8 is in use,
a robust user interface needs to provide a way of entering and displaying
code points not directly supported by the input or display device.

Note that if the data being shovelled around is 7-bit ASCII,
the UTF-8 representation is identical. All that changes is the
processing of stuff with the high-order bit set on display/entry.
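
The point about 7-bit ASCII can be checked directly. The following is a minimal Python sketch; the sample strings and the helper name `has_high_bit` are chosen purely for illustration and do not come from the sysAppl or Printer MIB.

```python
def has_high_bit(data: bytes) -> bool:
    """Return True if any octet has its high-order bit set."""
    return any(b & 0x80 for b in data)


# Illustrative sample values, not taken from any MIB.
ascii_text = "Printer ready"
accented_text = "Drucker l\u00e4uft"  # contains U+00E4, outside 7-bit ASCII

ascii_octets = ascii_text.encode("utf-8")
accented_octets = accented_text.encode("utf-8")

# For pure 7-bit ASCII, the UTF-8 octets equal the ASCII octets.
print(ascii_octets == ascii_text.encode("ascii"))

# Only text outside 7-bit ASCII produces octets with the high bit set.
print(has_high_bit(ascii_octets))
print(has_high_bit(accented_octets))
```

An implementation that only ever emits 7-bit ASCII therefore already produces valid UTF-8 without any change.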

> At the moment the majority of the WG is feeling that this is a
> huge black hole and that we cannot possibly address it
> adequately at this late date. I'm inclined to agree, but feel
> it is my ethical duty to seek out all possible solutions. If
> you've got one, we should try and use it. Can you help?
...

It is possible, but not necessary, to turn it into a black hole.
(See the acap mailing list for an example.) My own perspectives:

1) support for almost every language in use. This is a HUGE win.
2) little or no impact on the agent side. The only
issue is the implementation-specific approach to mapping into
the target system's native character set; the easiest thing
to do is to just leave it in UTF-8 all the time. (IBM faced
similar issues in agentx regarding NVT/EBCDIC translation,
as did we in our MVS port; in both cases the final decision
was that trying to translate stuff causes more problems than
it's worth.)
3) little impact on the manager side. Robust manager code
already needs to cope with code points not directly supported
by input or display hardware. Compared to the effort required
to handle other internationalization/localization issues
(if a product is internationalized/localized at all), this
gets lost in the noise. If the product is stridently English
only, all it needs to do is not explode when it gets one of
these interesting code points. Since this can already occur
with NVT ASCII, this is not a new problem.
4) The acap issues of language/locale tagging, while interesting,
appear to me to be serious overkill for management applications.
There is even some argument about whether the requirement for
language tagging in RFC 2130 is overstated; the current
discussions would indicate that what is really intended is locale
rather than language information. Until this is sorted out,
I think it's best for us to not dive into that particular
debate. For the needs of management systems, the locale of
the management system should control display and data entry,
rather than information included in attribute values.
5) This is the direction the world is going.
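
Point 3 above (an English-only manager that must "not explode") could be sketched roughly as follows. This is only one possible approach, written in Python; the function name `render_for_ascii_display` and the choice of backslash escapes are assumptions made for illustration, not anything specified by the MIBs or by SNMP.

```python
def render_for_ascii_display(value: bytes) -> str:
    """Render a UTF-8 octet string on an ASCII-only display.

    Hypothetical helper: code points the display cannot show are
    written as backslash escapes instead of raising an error.
    """
    # Malformed UTF-8 becomes U+FFFD rather than raising an exception.
    text = value.decode("utf-8", errors="replace")
    # Anything outside ASCII is shown as an escape sequence.
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


# Illustrative values only.
print(render_for_ascii_display(b"Tray 1 empty"))
print(render_for_ascii_display("B\u00fcro".encode("utf-8")))
print(render_for_ascii_display(b"\xff"))
```

The agent side needs no counterpart to this: it can store and return the octets unchanged, as point 2 suggests.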

We're in the process of defining a TC to go along with the SNMPv3 work
to use UTF-8 for all the human-readable information in the MIBs
associated with the management of SNMPv3. Current status:
1) It will be UTF-8
2) Whether it can be meaningfully constrained to "printable"
is under investigation. (It looks to me like such a constraint
is more trouble than it's worth, but I'm still looking into it.)

---------------------------------------------------------------------
Randy Presuhn BMC Software, Inc. (Silicon Valley Division)
Voice: +1 408 556-0720 (Formerly PEER Networks) http://www.bmc.com
Fax: +1 408 556-0735 1190 Saratoga Avenue, Suite 130
Email: rpresuhn@bmc.com San Jose, California 95129-3433 USA
---------------------------------------------------------------------
In accordance with the BMC Communications Systems Use and Security
Policy memo dated December 10, 1996, page 2, item (g) (the first of
two), I explicitly state that although my affiliation with BMC may be
apparent, implied, or provided, my opinions are not necessarily those
of BMC Software and that all external representations on behalf of
BMC must first be cleared with a member of "the top management team."
---------------------------------------------------------------------
>From rpresuhn@peer.com Thu Jul 10 17:49:24 1997
Date: Wed, 2 Jul 1997 19:18:41 -0700
From: Randy Presuhn <rpresuhn@peer.com>
To: chrisw@iwl.com
Cc: cheryl@empiretech.com
Subject: Re: Use of utf8string in sysappl mib

Hi Chris -

> Date: Wed, 2 Jul 1997 14:43:52 -0700 (PDT)
> From: Chris Wellens <chrisw@iwl.com>
> To: Randy Presuhn <rpresuhn@peer.com>
> Cc: cheryl@empiretech.com
> Subject: Re: Use of utf8string in sysappl mib
> Message-Id: <Pine.SUN.3.93.970702144317.13280D-100000@iwl.iwl.com>
>

The important thing that folks need to understand about UTF-8 support
is that a product does NOT have to UNDERSTAND every single one of the
40,000+ code points defined in IS 10646; a product does NOT need to be
able to display Chinese to support UTF-8. Rather, it just needs to use
the correct code points to represent information from whatever languages
it DOES support. Any other code points can be treated as opaque stuff
to shovel around, just like any other mystery bytes one might receive
in a DisplayString. (Concrete example: how should 0xFF be interpreted?
In NVT, the answer is unclear. With 10646, the answer is simple.)
For the gory details, see
http://www.unicode.org/Unicode.charts/normal/U+0080.html
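
One reading of the 0xFF example, assuming it refers to code point U+00FF in the chart linked above, can be illustrated with Python's standard `unicodedata` module. The helper `is_valid_utf8` is a hypothetical name introduced for this sketch.

```python
import unicodedata

# In ISO 10646 / Unicode, code point 0xFF has exactly one meaning.
char = chr(0xFF)
name = unicodedata.name(char)
utf8_octets = char.encode("utf-8")

print(name)
print(utf8_octets)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the octets decode as well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The single raw octet 0xFF is not well-formed UTF-8, so it cannot
# be confused with the two-octet encoding of U+00FF.
print(is_valid_utf8(b"\xff"))
print(is_valid_utf8(utf8_octets))
```

A product that cannot display this character can still pass the two octets along untouched.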

Sorry if I sound like I'm lecturing. I care about this a lot.
I wish you the best of luck in getting this adopted!


-----------------------------------------------------------------------------
--==--==--==- Chris Wellens President & CEO
==--==--==--= Email: chrisw@iwl.com Web: http://www.iwl.com/
--==--==--==- InterWorking Labs, Inc. 244 Santa Cruz Ave, Aptos, CA 95003
==--==--==--= Tel: +1 408 685 3190 Fax: +1 408 662 9065
-----------------------------------------------------------------------------