IPP> MOD - Proposal for natural language and charset

Tom Hastings hastings at cp10.es.xerox.com
Mon Oct 6 22:48:07 EDT 1997


This mail note contains a proposal that Ira and I took on at the 10/01/97
telecon to propose clarifications and additions to the 9/26/97 Model
document that are required in order to enter on the standard track with
respect to the Language and CharSet policies of the IETF that are currently
on standards track themselves.  See RFC 1766 and 2184 and
<draft-alvestrand-charset-policy-01.txt>.  RFC 2130 (IAB Character Set
Workshop, Feb 1997) provides background material.


I've posted the files in:


ftp://ftp.pwg.org/pub/pwg/ipp/new_MOD/
-rw-r--r--   1 pwg      pwg        20480 Oct  7 02:35 langchar.doc
-rw-r--r--   1 pwg      pwg        10447 Oct  7 02:35 langchar.pdf
-rw-r--r--   1 pwg      pwg         9187 Oct  7 02:35 langchar.txt


and have attached the .txt here.


I'd like to discuss this at the telecon, this Wednesday 10/8.  If there
is agreement on the points, we'll draft the spec for the attributes.


The text herein gives the specific points proposed for the
Internationalization Considerations section of the IPP Model document.


Send comments before the telecon if possible.


Thanks,
Tom


Subj:  Proposal for IPP to meet IESG Language and CharSet requirements
From: Tom Hastings and Ira McDonald
Date:  10/6/97
File:    langchar.doc


This document contains a proposal that Ira and I took on at the 10/01/97
telecon to propose clarifications and additions to the 9/26/97 Model
document that are required in order to enter on the standard track with
respect to the Language and CharSet policies of the IETF that are currently
on standards track themselves.  See RFC 1766 and 2184 and
<draft-alvestrand-charset-policy-01.txt>.  RFC 2130 (IAB Character Set
Workshop, Feb 1997) provides background material.


NOTE - For clarity of the Model and Semantics document and the mapping in
the protocol document, we have included as operational attributes in the
Model specification any attribute that is mapped to HTTP/1.1 Protocol as
headers.  Any such mapping to HTTP/1.1 is indicated as a parenthetical
remark in the Model specification.  We think that URIs should also be
brought back into the Model document as operational attributes that are
mapped to the appropriate place in the HTTP/1.1 protocol as well.




1. Summary of Changes


This proposal adds/clarifies attributes so that the client supplies the
language and charset of the request and the Printer object responds in the
requested language and charset, if supported, or in the Printer's default
language and charset which can be queried.  This functionality is provided
by two Job Template attributes:  (1) "content-natural-language" and (2)
"content-charset" which are also operational attributes in query
operations.  As with any Job Template attribute, there are also the
corresponding Printer "content-natural-language-default",
"content-charset-default", "content-natural-language-supported", and
"content-charset-supported" attributes.  There are also two operational
attributes, "document-natural-language" and "document-charset", for
independently specifying the natural language and charset of document
content for those media types that need them.  This proposal is intended
to replace existing natural
language and charset attributes in the Model and Semantics document.
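
As a rough illustration of the behavior summarized above, the following
Python sketch picks the natural language and charset for a response.  The
names PrinterI18n and choose_response_locale are hypothetical and are not
part of the Model document.  The sketch also assumes that language tags and
charset names compare case-insensitively, and that the language and the
charset fall back to their defaults independently of each other.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PrinterI18n:
    """Hypothetical holder for the four Printer attributes named above."""
    natural_language_default: str
    natural_language_supported: Tuple[str, ...]
    charset_default: str
    charset_supported: Tuple[str, ...]


def _pick(requested: str, supported: Tuple[str, ...], default: str) -> str:
    # Assumption: values are compared case-insensitively.
    wanted = requested.lower()
    if wanted in (s.lower() for s in supported):
        return wanted
    # Unsupported request: fall back to the Printer's default, no error.
    return default


def choose_response_locale(printer: PrinterI18n,
                           requested_language: str,
                           requested_charset: str) -> Tuple[str, str]:
    """Return the (language, charset) the Printer would respond in."""
    language = _pick(requested_language,
                     printer.natural_language_supported,
                     printer.natural_language_default)
    charset = _pick(requested_charset,
                    printer.charset_supported,
                    printer.charset_default)
    return language, charset


# Illustrative Printer configuration (values are made up for the example).
printer = PrinterI18n(
    natural_language_default="en",
    natural_language_supported=("en", "fr"),
    charset_default="utf-8",
    charset_supported=("utf-8", "iso-8859-1"),
)

print(choose_response_locale(printer, "fr", "ISO-8859-1"))  # ('fr', 'iso-8859-1')
print(choose_response_locale(printer, "de", "utf-16"))      # ('en', 'utf-8')
```

A client that needs to know in advance what it will get back can query the
"-default" and "-supported" Printer attributes rather than relying on the
fallback.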




2. Specific Internationalization Provisions


The following provisions are listed for inclusion in the
Internationalization Considerations section of the IPP Model specification.
They also serve as a detailed summary of the capabilities.


The IPP Model and Semantics specification provides the following
internationalization capabilities:


1. Clients SHALL supply the HTTP/1.1 Content-Type header with the value: 


   application/ipp; charset=xxx


where xxx specifies the charset used by the body of the IPP request
(independent of the document content) and SHALL be a charset registered
with IANA [REG].


The Printer object SHALL support charset=utf-8 [28].  Support of other
charsets is OPTIONAL, but all supported charsets SHALL be ones in which the
code points from decimal 20 to 127 are US-ASCII [US-ASCII].


For conforming IPP Printer objects, the utf-8 charset SHALL be restricted
to mean conformance level 2 of ISO 10646 [ISO-10646], so that accented
letters SHALL NOT be represented with non-spacing accents.


2. The term "natural language" is used to avoid confusion with "printer
language", since "printer language" is the name given by IANA to the
registration of print document formats as used in the Printer MIB, RFC 1759
[1].


3. The client SHALL supply and the Printer object SHALL support the
single-valued "content-natural-language" and "content-charset" Job Template
attributes in order to meet the IETF Policy on Character Sets and Languages
[IETF-Pol].  (The protocol SHALL convey these attributes as the HTTP/1.1
"content-language" and "content-charset" headers [23]).


Each value of an attribute in a request or a response with attribute syntax
'text' or 'name' MAY be in any language and/or charset and SHALL be tagged
using the syntax and mechanism in RFC 2184 [RFC-2184], to indicate the
language and charset of the value.


If the natural language or charset is the same as that supplied by the
client in the request, then an empty tag, consisting of only the single '
delimiter character, may be used for the language or charset.


Examples:  If the client supplies "content-natural-language" as 'en' and
the "content-charset" as 'us-ascii', the following are valid
representations of the "job-name" attribute in the request and in the
response:  


us-ascii'en-us'Monthly Report
us-ascii''Monthly Report
'en-us'Monthly Report
''Monthly Report
iso-8859-1'fr'Rapport Mensuel


4. As with any supported Job Template attribute, the Printer object SHALL
support the corresponding "content-natural-language-default",
"content-natural-language-supported", "content-charset-default", and
"content-charset-supported" Printer attributes.  However, these Job
Template attributes are MANDATORY for the Printer to support.


5. IPP Printer objects NEED NOT support HTTP/1.1 Accept headers and IPP/1.0
does not address the processing semantics for HTTP/1.1 Accept headers.


6. For the create operations, the Printer object SHALL store in the Job
Object any 'text' and 'name' Job Template attributes in the language and
charset as supplied by the client, along with the values of the
"content-natural-language" and "content-charset" attributes.  In a
subsequent query request, the Printer object NEED NOT convert any 'text' or
'name' attributes that had been supplied in the create request to the
natural language or charset of the requester.  However, any 'text' or
'name' attributes generated by the Printer SHALL be returned in the natural
language and charset specified by the requester in the
"content-natural-language" and "content-charset" operational attributes of
the query.  If the specified natural language or charset is not supported
by the Printer, the Printer SHALL respond in its default natural language
or charset, rather than returning an error.


7. Notifications SHALL always include non-empty language and charset tags
according to RFC 2184, since the recipient might not know what empty tags
mean.


8. Type 4 keywords are often names defined by a system administrator and so
may be in any language or charset.  Therefore, attributes with attribute
syntax 'type4 keyword' are allowed to have a mixture of 'keyword' tokens
defined by the standard and/or registered as keywords (in U.S. English and
US-ASCII) and 'name' tokens defined by the system administrator.  Such
names SHALL follow the syntax for names and SHALL be tagged using the
syntax and mechanism of RFC 2184, while the keywords SHALL NOT be tagged
(since keywords SHALL always be in US-English and US-ASCII).


9. No IPP Printer implementation NEED perform actual translation of natural
language text and name values.  A Printer object that supports multiple
languages often has separate catalogs of messages, one for each natural
language.


10. The natural language and charset used in certain document formats may
be specified separately and independently from that used by attributes that
are part of the IPP protocol.  Therefore, in order to indicate the natural
language and charset of these document formats, the
"document-natural-language" and "document-charset" operational attributes
are defined for use in the create, Send-Document, and Send-URI operations.
For those media types that are not defined to include a charset parameter,
such as application/octet-stream, the "document-charset" attribute MAY be
used to convey the charset.  For those media types that allow the charset
to be specified as a parameter, such as 'text/plain', the client SHALL also
supply the "document-charset" attribute with the same value.  Then the
Printer object NEED NOT parse the "document-format" media type attribute
value.


The Printer object SHALL support the "document-natural-language" and
"document-charset" operational attributes, if the Printer object supports
document formats which require them in order to be unambiguous and to
follow the IETF Character Set and Language policy which forbids an
implementation to assume a default language or charset.
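
The tagged 'text' and 'name' values described in provision 3 can be read
with a small parser.  The following Python sketch is only an illustration:
parse_tagged_value and TaggedValue are hypothetical names, and the sketch
handles just the charset'language'text form shown in the "job-name"
examples, not the full RFC 2184 syntax (for instance, it does no
percent-decoding).  Empty tags are filled in from the request's
"content-charset" and "content-natural-language" values.

```python
from typing import NamedTuple


class TaggedValue(NamedTuple):
    """Hypothetical result type: a value with its resolved tags."""
    charset: str
    language: str
    text: str


def parse_tagged_value(value: str,
                       request_charset: str,
                       request_language: str) -> TaggedValue:
    """Split charset'language'text, resolving empty tags from the request."""
    # Only the first two ' characters are delimiters; any later ones
    # belong to the text itself.
    charset, first_quote, rest = value.partition("'")
    language, second_quote, text = rest.partition("'")
    if not first_quote or not second_quote:
        raise ValueError(f"missing charset/language tags: {value!r}")
    return TaggedValue(charset or request_charset,
                       language or request_language,
                       text)


# The "job-name" examples from provision 3, with the request supplying
# "content-charset" = 'us-ascii' and "content-natural-language" = 'en'.
for raw in ("us-ascii'en-us'Monthly Report",
            "us-ascii''Monthly Report",
            "'en-us'Monthly Report",
            "''Monthly Report",
            "iso-8859-1'fr'Rapport Mensuel"):
    print(parse_tagged_value(raw, "us-ascii", "en"))
```

Per provision 7, a sender of notifications would not rely on this
resolution step and would always emit both tags explicitly.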




3. Additions to the References section


[ISO-10646] ISO/IEC 10646-1:1993, "Information technology -- Universal
Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture and Basic
Multilingual Plane, JTC1/SC2."


[REG] N. Freed, J. Postel:  IANA CharSet Registration Procedures, Work in
Progress (draft-freed-charset-reg-02.txt).


[US-ASCII] Coded Character Set - 7-bit American Standard Code for
Information Interchange, ANSI X3.4-1986.


[IETF-Pol] H. Alvestrand, "IETF Policy on Character Sets and Languages",
work in progress <draft-alvestrand-charset-policy-01.txt>, June 1997.


[RFC-2184] N. Freed, K. Moore, MIME Parameter Value and Encoded Word
Extensions: Character Sets, Languages, and Continuations, RFC 2184, August
1997.


