IPP> MOD - Separate 'document-format' and 'document-language'

Ira Mcdonald x10962 imcdonal at eso.mc.xerox.com
Tue Sep 30 09:50:29 EDT 1997


Hi folks,                                    Tuesday (30 September 1997)


After talking with Larry Masinter yesterday, I WITHDRAW my suggestion
that IPP's 'document-format' attribute be an extended form of a MIME
'media-type' (used in 'Content-Type' headers), with an added 'language'
parameter.


Larry argues that this fosters incoherence (in IETF standard protocols)
and forces an IPP Printer (i.e., server application) to sometimes PARSE
'document-format', in order to construct MIME headers for 'Content-Type'
and 'Content-Language' (thus 'document-format' would NOT be opaque to
the IPP server application - this is not good).


Instead, I suggest we have two MANDATORY attributes for job operations
(and the Job Monitoring MIB):


1)  'document-format'
    - value is 'media-type' (with 'charset' for 'text/*' types)
    - maps one-to-one to MIME 'Content-Type' header


2)  'document-language'
    - value is an RFC 1766 compliant language tag
    - maps one-to-one to MIME 'Content-Language' header
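
As a rough illustration of the one-to-one mapping above (a sketch only;
the function name 'mime_headers' and the sample values are hypothetical
and not taken from any IPP draft), the server application can copy both
attribute values into the MIME headers without parsing either one:

```python
def mime_headers(document_format, document_language):
    """Build MIME headers from the two IPP job attributes.

    Both values are copied verbatim, so they stay opaque to the
    IPP server application (no parsing of 'document-format').
    """
    return {
        "Content-Type": document_format,        # e.g. a 'media-type' with 'charset'
        "Content-Language": document_language,  # e.g. an RFC 1766 language tag
    }


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    for name, value in mime_headers("text/plain; charset=utf-8", "en-US").items():
        print(f"{name}: {value}")
```

With a single combined attribute, the same function would instead have
to split out the 'language' parameter before it could emit the second
header, which is the parsing step Larry objects to.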


There remains one apparent problem with using MIME 'media-types' (see
RFC 2046) for IPP 'document-format' - their possible limitation (see
RFC 2046, section 4.1.2 'Charset Parameter', page 7) to the use of ONLY
US-ASCII (7-bit) or ISO-8859-X (8-bit) character sets.


Support for UTF-8 (RFC 2044, the IANA registered character set type for
ISO 10646 folded into a multi-octet 8-bit superset of US-ASCII) is critical
for IPP documents.  Support for ALL of the IANA registered character set
types is highly desirable (and coherent with the revised ABNF for MIME
parameter VALUES specified in RFC 2184).
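
To make the 'superset of US-ASCII' point concrete, here is a small
sketch (the helper name 'utf8_octets' is hypothetical, used only for
illustration): US-ASCII characters keep their single 7-bit octet under
UTF-8, while other characters become several octets with the high bit
set.

```python
def utf8_octets(text):
    """Return the UTF-8 encoding of 'text' as a list of octet values."""
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    # A US-ASCII string encodes to the same octets in both charsets.
    sample = "document-format"
    print(sample.encode("utf-8") == sample.encode("ascii"))

    # A non-ASCII character (LATIN SMALL LETTER E WITH ACUTE) becomes
    # a multi-octet sequence in which every octet is >= 0x80.
    print([hex(octet) for octet in utf8_octets("\u00e9")])
```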


Larry, can you comment on character sets for 'media-types' and hopefully
clarify this for us?


Cheers,
- Ira McDonald (outside consultant at Xerox)
  High North Inc
  906-494-2434


PS - A compelling reason for language tags on all text, stated in RFC
2184 on page 2, is to facilitate text 'reader' software for blind people
(knowing the 'charset' is NOT sufficient to 'read' text aloud).  This is
an ethical consideration of the highest importance.


PPS - Note that a document which contains ONLY graphics and NO text does
not need (or benefit from) 'document-language', but that ANY document
which contains text (no matter what the 'media-type') benefits strongly
from 'document-language' (because the IPP server application need not
parse the document itself to discover embedded language tags to behave
properly).


