PMP Mail Archive: PMP> what happens if we use UTF-8 in place of ASCII

Harry Lewis (harryl@us.ibm.com)
Fri, 25 Jul 1997 14:36:34 -0400

David, thanks for changing the name and focus of this topic. I have a
question... does using UTF-8 put any greater burden on the agent in terms
of size of character set and storage required? I've been trying to catch up
with all those URGENT mail messages, I've pulled RFC 2044 and I think I
know what UTF-8 is and why it exists... but I'm not sure what the consensus
is regarding how much of Unicode or ISO 10646 is required behind the UTF-8.

Harry Lewis - IBM Printing Systems

-------- Forwarded by Harry Lewis/Boulder/IBM on 07/25/97 12:27 PM -------

pmp-owner@pwg.org
07/24/97 06:50 PM
Please respond to pmp-owner@pwg.org @ internet

To: pmp@pwg.org @ internet
cc:
Subject: PMP> what happens if we use UTF-8 in place of ASCII

New thread; got tired of looking at that "URGENT" line. Also, I figured
that if this note had a new title, your curiosity would get the better
of you, and you might read the note instead of automatically filing it.

Thought it would be useful to summarize the implications of switching
the ASCII OCTET STRINGs to Utf8String syntax.

Agents:
1. Need to make sure any read-only strings that aren't ASCII use UTF-8
encoding. I'd guess that 99% are ASCII.
2. Remove any 7-bit checking code. Sounds like most agents already
accept 8-bit codes without checking (sounds like Osicom/DPI actually
checks -- someone knew what ASCII meant).
So most agents work unchanged, the rest with slight change. It's worth
noting that, within the SNMP domain, the agent should be able to treat
the affected strings simply as octets. It doesn't do anything, such as
displaying the strings, that would require it to "know" UTF-8.
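
A rough sketch of the difference between the two agent behaviors described
above, assuming a hypothetical agent that stores a written string value as
raw octets. The function names are invented for illustration and do not
come from any real agent implementation.

```python
def is_seven_bit(octets: bytes) -> bool:
    """Return True if every octet is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in octets)


def store_strict_ascii(octets: bytes) -> bytes:
    """Hypothetical agent that enforces ASCII: rejects any octet >= 0x80."""
    if not is_seven_bit(octets):
        raise ValueError("non-ASCII octet rejected")
    return octets


def store_opaque(octets: bytes) -> bytes:
    """Hypothetical agent that treats the value as plain octets.

    It never decodes the string, so it needs no knowledge of UTF-8.
    """
    return octets


# Example values (assumed, not taken from any MIB).
ascii_value = "Front Office Printer".encode("utf-8")
utf8_value = "B\u00fcro-Drucker".encode("utf-8")  # contains octets >= 0x80
```

With the 7-bit check removed, the agent behaves like store_opaque and
returns both values unchanged; with the check in place, only the ASCII
value is accepted.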

Agent environment:
1. If, outside the SNMP domain, the agent device displays the affected
strings, there may be a need for character set conversion.

Application environment:
1. In environments which do not use UTF-8 as the native encoding,
convert codes to/from UTF-8.
It's worth noting that existing applications that operate in an ASCII
environment aren't affected.
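
The conversion step might look something like this, assuming an
application whose native encoding is ISO 8859-1 (Latin-1). The helper
names and the choice of native encoding are assumptions made for the
example only.

```python
def to_utf8(native_octets: bytes, native_encoding: str) -> bytes:
    """Convert octets in the application's native encoding to UTF-8."""
    return native_octets.decode(native_encoding).encode("utf-8")


def from_utf8(utf8_octets: bytes, native_encoding: str) -> bytes:
    """Convert UTF-8 octets read from the agent to the native encoding."""
    return utf8_octets.decode("utf-8").encode(native_encoding)


# A Latin-1 string: the u-umlaut is the single octet 0xFC natively,
# and the two octets 0xC3 0xBC in UTF-8.
native = "B\u00fcro".encode("latin-1")
on_the_wire = to_utf8(native, "latin-1")

# A pure ASCII string is octet-for-octet identical in UTF-8.
ascii_native = b"Tray 1"
ascii_on_the_wire = to_utf8(ascii_native, "ascii")
```

The ASCII case comes out unchanged, which is why applications in an ASCII
environment need no modification.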

As you think about "what does this mean for my management application,"
here's another thing to consider. Existing applications that use
non-UTF-8 codes will also "work" as long as they don't run up against
another application that expects UTF-8 (or another encoding) -- as noted
above, the agent doesn't know whether the code is really UTF-8 or not. And
we ought to keep in mind that the installed base of applications
consists mostly of one vendor's management application talking to the
same vendor's printers.
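
Since the agent cannot tell, any check has to happen in the application.
One possible sketch, assuming the application simply tries to decode what
it reads; the function name is hypothetical. Note that this is only a
heuristic: some octet sequences written in another encoding happen to be
well-formed UTF-8 too.

```python
def looks_like_utf8(octets: bytes) -> bool:
    """Return True if the octets are well-formed UTF-8.

    This cannot prove the writer intended UTF-8; it only detects
    sequences that are impossible in UTF-8.
    """
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


written_as_utf8 = "B\u00fcro".encode("utf-8")      # b"B\xc3\xbcro"
written_as_latin1 = "B\u00fcro".encode("latin-1")  # b"B\xfcro"
written_as_ascii = b"Tray 1"
```

Here the Latin-1 value fails the check because 0xFC cannot start a UTF-8
sequence, while the UTF-8 and ASCII values pass.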

In practice, this is no worse than the situation today with applications
that have used non-ASCII character encodings in these strings. Two
applications that have opted for different codes don't interoperate. But
a non-standard application operating by itself can get away with it.

Curiously, it also is no worse than a multiple-character set solution
where two applications have chosen different character sets. With the
multiple-character set solution, we would be in effect "standardizing"
this lack of interoperability.

:: David Kellerman Northlake Software 503-228-3383
:: david_kellerman@nls.com Portland, Oregon fax 503-228-5662
