RE: PS> Further Revised PWG std MIME parameters ABNF (25 Oct 2002 )

From: McDonald, Ira (imcdonald@sharplabs.com)
Date: Fri Nov 01 2002 - 13:48:04 EST

    Hi Bob,

    Apologies for my slow reply. I was off the map travelling most of this
    week. My comments are inline below in your note.

    Cheers,
    - Ira McDonald
      High North Inc

    -----Original Message-----
    From: TAYLOR,BOB (HP-Vancouver,ex1) [mailto:bobt@hp.com]
    Sent: Monday, October 28, 2002 7:58 PM
    To: 'McDonald, Ira'; 'ps@pwg.org'; 'hastings@cp10.es.xerox.com'
    Cc: SIMPSON,SHELL (HP-Boise,ex1)
    Subject: RE: PS> Further Revised PWG std MIME parameters ABNF (25 Oct 2002 )

    Hi Ira, Tom, all,

    Some comments/questions:

    - We understand the rationale for the "cut and paste" vs. "structure"
      objectives, but should PSI have a structure for this? As we'll mention
      in a bit, there is probably more information we need to capture here,
      and continual extension of the MIME string doesn't seem to scale
      well/reliably with too many parameters. For reference (I think we've
      already shared this), attached is a simple xsd we're using for this.

    <ira>
    Agreed - the continual extension of the MIME type string does _not_
    work very well.
    </ira>

    - The proposal explicitly states in several places "Human-readable
      information, suitable for client UI and debug. Not suitable for use by
      automata". Given that we do need to use content type information for
      automata, is it assumed that something else (e.g., a "structure"
      definition) must be defined as well?

    <ira>
    The proposal _notes_ the current definition of the proposed information
    in Printer MIB v1 (RFC 1759, March 1995) as "human-readable".

    But PSI shouldn't solve this problem without a clear IPP binding of
    whatever the "solution" is.

    (I can now almost hear myself suggesting an IPP binding based on
    "document-format-col (1setOf collection)" - yuck! - an awful
    solution for IPP).

    A good long-term solution for IPP would be to use the proposed generic
    Resource object and (just as we've proposed for "media" as a _much_
    better solution than the "media-col" attributes) define a Document Format
    type of Resource.
    </ira>

    - To be deterministic about what the format actually is (or what formats are
      supported by a service), we think there are a few additional things
      necessary:
       language -- Human language for which the data format is defined. Some
         word processing applications defined different data formats for
         different locales--even though the version for the data format
         remained the same (e.g. early versions of MS Word).

    <ira> Here, we should use IETF RFC 3066 conformant "language-tags". </ira>

       platform -- Operating system for which the data format is defined. Some
         applications (e.g. MS Word) defined different data formats for
         different operating systems (e.g. Windows and the Mac) even though the
         version for the data format remained the same.

    <ira>
    Here, we're in serious trouble. The (now languishing) IPP device driver
    installation spec needed a better list of operating systems than the current
    IANA registry, but creation of such a registry by the PWG is unacceptable to
    most people.
    </ira>

       model -- Target device for which the data was created. PDLs (such as
         PostScript or PCL) can include printer model specific information.
         (Each printer model has its own specific language specification.)

    <ira>
    Can you suggest a deterministic, portable way to enumerate printer models
    that interoperates across software and hardware vendors? Again, a PWG
    registry is an unacceptable (and unworkable) solution, I believe.
    </ira>

       container -- Describes embedded content types, such as those embedded
         within a ZIP file.

    <ira>
    In the XML domain (not the MIME multipart domain), the best way to describe
    embedded content types that I know of is the "Manifest" that's an optional
    part of an XML Digital Signature (RFC 3275, March 2002).

    Again, needs major work for some kind of IPP binding.
    </ira>

    - I noticed that pwg-lang-res is missing from the revised proposal. We
      actually think this is the right answer (i.e., it does not belong here),
      but wanted to make sure this was intentional and not accidental.

    <ira>
    Tom Hastings convinced me to remove pwg-lang-res (resolution), because it
    "opened the barn door to inappropriate parameters".

    I will note that my own experience at Xerox and Sharp has been that client
    apps folks do read the Interpreter Table in the Printer MIB to determine the
    MAXIMUM resolution supported by a given PDL interpreter. No point in
    sending images at higher resolution, for example.
    </ira>

    thanks,

    bt

    ---------------------------------------------------
    Bob Taylor
    Senior Architect
    IPG Strategic Technology Development
    Hewlett-Packard Co.
    mailto:robertt@vcd.hp.com
    phone: 360.212.2625/T212.2625
    fax: 208.730-5111
    ---------------------------------------------------

    > -----Original Message-----
    > From: McDonald, Ira [mailto:imcdonald@sharplabs.com]
    > Sent: Friday, October 25, 2002 6:29 PM
    > To: McDonald, Ira; 'ps@pwg.org'; 'hastings@cp10.es.xerox.com'
    > Subject: PS> Further Revised PWG std MIME parameters ABNF (25 Oct 2002)
    >
    >
    > Hi folks,                                   Friday (25 October 2002)
    >
    > [Per my action item from the PWG PSI Telecon on 8 October 2002]
    >
    > Further revised ABNF for three PWG parameters for MIME document formats.
    >
    > Two of these parameters (pwg-lang-level and pwg-lang-desc) were derived
    > from the InterpreterTable defined in Printer MIB [RFC 1759]. The other
    > parameter (pwg-lang-profile) was suggested by Tom Hastings.
    >
    >
    > Problem Statement: Registered MIME types (for example, used as values
    > of the 'document-format' Job attribute in IPP/1.1 [RFC 2910, RFC 2911])
    > are imprecise. PWG PSI, CIP4 JDF, FSG PAPI, and FSG Job Ticket working
    > groups have all identified a requirement for document format metadata.
    >
    >
    > Rejected Solution: A data structure with the required metadata - works
    > well within a given interface - but incompatible with 'cut-and-paste'.
    >
    >
    > Proposed Solution: PWG standard optional MIME parameters - may be
    > appended to _any_ registered MIME type to add the required metadata -
    > compatible with 'cut-and-paste' across all applications (because the
    > parameters are actually _part of_ the same MIME type "word" in the
    > source text).
    >
    >
    > Tom Hastings and I plan to write a proposal for "IPP: PWG Standard MIME
    > Parameters for Document Formats". Watch for an announcement.
    >
    > Cheers,
    > - Ira McDonald, co-editor of Printer MIB v2
    > High North Inc
    >
    > ------------------------------------------------------------------------
    > [PWG Standard MIME Parameters]
    >
    >
    > In the printer industry, a document format MIME type is one of:
    >
    > (a) PDL - page description language (e.g., HTML or Adobe PostScript)
    > (b) JCL - job control language (e.g., HP PJL)
    > (c) JDL - job definition language (e.g., CIP4 JDF)
    > (d) text - plaintext, richtext, HTML, SGML, XML, etc.
    >
    > Any PWG standard MIME parameter MAY be appended (unordered) to any
    > document format, for example:
    >
    > application/vnd.hp-pcl;pwg-lang-res="400,400,dpi" (HP PCL 400x400 dpi)
    >
    > These PWG parameters are specified in Augmented Backus-Naur Form (ABNF,
    > RFC 2234). Every element used in one of these PWG parameter ABNF
    > productions is defined in an excerpt from ABNF (RFC 2234), MIME Part One
    > (RFC 2045), or Internet Message Format (RFC 2822) at the end of this
    > note.
    >
    > Each parameter name begins with a "pwg-" (namespace) prefix, to ensure
    > that it is safely ignored by existing MIME-enabled software and systems.
    >
    > According to RFC 2045:
    >
    > (1) Parameters MUST be ignored when unrecognized;
    > (2) Parameters MUST be ignored when comparing values of MIME types;
    > (3) MIME type names MUST be treated as case-insensitive;
    > (4) MIME parameter names MUST be treated as case-insensitive;
    > (5) MIME parameter values MUST be treated as case-sensitive.
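
    The comparison rules listed above can be sketched in a few lines. This is
    a minimal illustration in Python, not part of the proposal; the helper
    names media_type_only and same_media_type are hypothetical. It assumes
    the value is a well-formed "type/subtype" followed by optional
    ";"-separated parameters.

```python
def media_type_only(document_format):
    """Return the bare 'type/subtype', lowercased, with parameters dropped.

    Neither the type nor the subtype may contain ';', so everything after
    the first ';' is parameter text and can be discarded for comparison.
    """
    return document_format.split(";", 1)[0].strip().lower()


def same_media_type(a, b):
    """Compare two document formats, ignoring parameters and case."""
    return media_type_only(a) == media_type_only(b)


print(same_media_type('Application/PostScript;pwg-lang-level="2"',
                      "application/postscript"))
```

    Under these rules a receiver that does not know the "pwg-" parameters
    still matches the value against its list of supported document formats.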
    >
    >
    >
    >
    >
    > document-format = type "/" subtype *[parameter] *[pwg-parameter]
    > ; MIME type (plus optional parameters)
    >
    > pwg-parameter = ";" pl-level / pl-build / pl-prof / pl-desc / pl-res
    > ; PWG standard parameter with a 'quoted-string' value
    >
    > Desc: Document format
    >
    > See: Section 5.1 in MIME Part One [RFC 2045]
    > for ABNF definition of 'type', 'subtype', and 'parameter'
    > See: Section 4.1.9 in IPP/1.1 Model and Semantics [RFC 2911]
    > for definition of 'mimeMediaType' syntax
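
    As an illustration of the document-format syntax, here is a rough Python
    sketch of how a client might split a value into its media type and
    parameters. The function name parse_document_format is hypothetical. The
    sketch lowercases the type and the parameter names, keeps values
    case-sensitive, and handles quoted-string values in a simplified way
    (quoted-pairs are honoured; comments and folding white space are not).

```python
def parse_document_format(value):
    """Split 'type/subtype;name="value";...' into (media_type, params)."""
    parts, buf = [], []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            buf.append(ch)              # character protected by a backslash
            escaped = False
        elif in_quotes and ch == "\\":
            escaped = True              # start of a quoted-pair
        elif ch == '"':
            in_quotes = not in_quotes   # the quotes are not part of the value
        elif ch == ";" and not in_quotes:
            parts.append("".join(buf))  # ';' outside quotes ends a field
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))

    media_type = parts[0].strip().lower()
    params = {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if sep:
            # Names are case-insensitive; values keep their case.
            params[name.strip().lower()] = val.strip()
    return media_type, params


print(parse_document_format('application/vnd.hp-pcl;PWG-Lang-Level="5e"'))
```

    Returning a plain dictionary keyed by lowercased name makes it easy for a
    caller to skip any parameter it does not recognize.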
    >
    >
    >
    >
    >
    > pl-level = "pwg-lang-level" "=" quoted-string
    >
    > Desc: Language level and/or version (not applicable for 'text/plain').
    > Human-readable information, suitable for client UI and debug.
    > Not suitable for use by automata.
    >
    > See: Section 3.2.5 of Internet Message Format [RFC 2822]
    > for ABNF definition of 'quoted-string'
    > See: Section 19 'The Interpreter Group' in Printer MIB [RFC 1759]
    > for definition of 'prtInterpreterLangLevel'
    >
    > Examples:
    >
    > application/postscript;pwg-lang-level="2" (Adobe PostScript Level 2)
    >
    > application/vnd.hp-pcl;pwg-lang-level="5e" (HP PCL 5e)
    >
    > application/vnd.cip4-jdf+xml;pwg-lang-level="1.1" (CIP4 JDF 1.1)
    >
    >
    >
    >
    >
    > pl-prof = "pwg-lang-profile" "=" quoted-string
    >
    > Desc: Language profile or subset (not applicable for 'text/plain').
    > Human-readable information, suitable for client UI and debug.
    > Not suitable for use by automata.
    >
    > See: Section 3.2.5 of Internet Message Format [RFC 2822]
    > for ABNF definition of 'quoted-string'
    >
    > Examples:
    >
    > application/pdf;pwg-lang-profile="PDF-X3" (ISO-15930-3:2002)
    >
    >
    >
    >
    > pl-desc = "pwg-lang-desc" "=" quoted-string
    >
    > Desc: Language description.
    > Human-readable information, suitable for client UI and debug.
    > Not suitable for use by automata.
    >
    > Note: This parameter should be _last_, since embedded whitespace may
    > terminate 'cut-and-paste'.
    >
    > See: Section 3.2.5 of Internet Message Format [RFC 2822]
    > for ABNF definition of 'quoted-string'
    > See: Section 19 'The Interpreter Group' in Printer MIB [RFC 1759]
    > for definition of 'prtInterpreterDescription'
    >
    > Examples:
    >
    > application/postscript;pwg-lang-desc="Adobe PostScript Level 2"
    >
    > application/vnd.hp-pcl;pwg-lang-desc="HP PCL Level 5e - 25 Oct 2002"
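
    Going the other way, a sender has to quote each value and, per the note
    above, place pwg-lang-desc last. The following Python sketch shows one
    way to do that; quote_value and build_document_format are hypothetical
    names, and the sketch assumes the values contain only printable US-ASCII
    text.

```python
def quote_value(text):
    """Wrap text as a quoted-string, escaping backslash and double quote."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_document_format(media_type, params):
    """Append parameters to a MIME type, keeping pwg-lang-desc last.

    sorted() is stable, so every other parameter keeps the order in which
    it was given; only pwg-lang-desc (key True) moves to the end.
    """
    names = sorted(params, key=lambda n: n.lower() == "pwg-lang-desc")
    return media_type + "".join(
        ";%s=%s" % (name, quote_value(params[name])) for name in names
    )


print(build_document_format(
    "application/vnd.hp-pcl",
    {"pwg-lang-desc": "HP PCL Level 5e", "pwg-lang-level": "5e"},
))
```

    Keeping the description last means that a 'cut-and-paste' which stops at
    the first embedded space still carries the type and the other parameters.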
    >
    >
    >
    >
    > The following Interpreter attributes from the Printer MIB are omitted
    > (for the reasons noted below):
    >
    > 'prtInterpreterLangFamily'
    > - language family (i.e., simple MIME type)
    > - see 'document-format' in IPP/1.1 Model [RFC 2911]
    >
    > 'prtInterpreterLangVersion'
    > - language version
    > - ambiguous in actual usage with 'prtInterpreterLangLevel'
    >
    > 'prtInterpreterVersion'
    > - interpreter implementation build version info
    > - not applicable for MIME types
    >
    > 'prtInterpreterDefaultOrientation'
    > - default orientation (portrait or landscape)
    > - see "orientation-requested-default" in IPP/1.1 Model [RFC 2911]
    >
    > 'prtInterpreterDefaultCharSetIn'
    > - default input charset (to avoid charset 'guessing')
    > - see 'charset-configured' in IPP/1.1 Model [RFC 2911]
    >
    > 'prtInterpreterDefaultCharSetOut'
    > - default output charset,
    > - only useful for softcopy output (i.e., 'print-to-file')
    > - not supported in IPP/1.1 Model [RFC 2911]
    >
    > 'prtInterpreterTwoWay'
    > - indicates support for bidirectional print channel
    > - not supported in IPP/1.1 Model [RFC 2911]
    >
    >
    > ------------------------------------------------------------------------
    > [from "MIME Part Two: Media Types", RFC 2046]
    >
    > > Parameters are modifiers of the media subtype, and as such do not
    > > fundamentally affect the nature of the content. The set of
    > > meaningful parameters depends on the media type and subtype. Most
    > > parameters are associated with a single specific subtype. However, a
    > > given top-level media type may define parameters which are applicable
    > > to any subtype of that type. Parameters may be required by their
    > > defining media type or subtype or they may be optional. MIME
    > > implementations must also ignore any parameters whose names they do
    > > not recognize.
    >
    >
    > ------------------------------------------------------------------------
    > [from "Augmented BNF for Syntax Specifications (ABNF)", RFC 2234]
    >
    > DIGIT = %x30-39
    >
    > DQUOTE = %x22
    >
    >
    > ------------------------------------------------------------------------
    > [from "MIME Part One: Format of Internet Message Bodies", RFC 2045]
    >
    > parameter = attribute "=" value
    >
    > attribute = token
    > ; Matching of attributes
    > ; is ALWAYS case-insensitive.
    >
    > value = token / quoted-string
    >
    > token = 1*<any (US-ASCII) CHAR except SPACE, CTLs,
    > or tspecials>
    >
    > tspecials = "(" / ")" / "<" / ">" / "@" /
    > "," / ";" / ":" / "\" / <">
    > "/" / "[" / "]" / "?" / "="
    > ; Must be in quoted-string,
    > ; to use within parameter values
    >
    >
    > ------------------------------------------------------------------------
    > [from "Internet Message Format", RFC 2822]
    >
    > quoted-string = [CFWS]
    > DQUOTE *([FWS] qcontent) [FWS] DQUOTE
    > [CFWS]
    >
    > A quoted-string is treated as a unit. That is, quoted-string is
    > identical to atom, semantically. Since a quoted-string is allowed to
    > contain FWS, folding is permitted. Also note that since quoted-pair
    > is allowed in a quoted-string, the quote and backslash characters may
    > appear in a quoted-string so long as they appear as a quoted-pair.
    >
    > Semantically, neither the optional CFWS outside of the quote
    > characters nor the quote characters themselves are part of the
    > quoted-string; the quoted-string is what is contained between the two
    > quote characters. As stated earlier, the "\" in any quoted-pair and
    > the CRLF in any FWS/CFWS that appears within the quoted-string are
    > semantically "invisible" and therefore not part of the quoted-string
    > either.
    >
    > <...>
    >
    > CFWS = *([FWS] comment) (([FWS] comment) / FWS)
    >
    > FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space
    >       obs-FWS
    >
    > ctext = NO-WS-CTL / ; Non white space controls
    >         %d33-39 /   ; The rest of the US-ASCII
    >         %d42-91 /   ; characters not including "(",
    >         %d93-126    ; ")", or "\"
    >
    > ccontent = ctext / quoted-pair / comment
    >
    > comment = "(" *([FWS] ccontent) [FWS] ")"
    >
    > Throughout this standard, where FWS (the folding white space token)
    > appears, it indicates a place where header folding, as discussed in
    > section 2.2.3, may take place. Wherever header folding appears in a
    > message (that is, a header field body containing a CRLF followed by
    > any WSP), header unfolding (removal of the CRLF) is performed before
    > any further lexical analysis is performed on that header field
    > according to this standard. That is to say, any CRLF that appears in
    > FWS is semantically "invisible."
    >
    > A comment is normally used in a structured field body to provide some
    > human readable informational text. Since a comment is allowed to
    > contain FWS, folding is permitted within the comment. Also note that
    > since quoted-pair is allowed in a comment, the parentheses and
    > backslash characters may appear in a comment so long as they appear
    > as a quoted-pair. Semantically, the enclosing parentheses are not
    > part of the comment; the comment is what is contained between the two
    > parentheses. As stated earlier, the "\" in any quoted-pair and the
    > CRLF in any FWS that appears within the comment are semantically
    > "invisible" and therefore not part of the comment either.
    >
    > Runs of FWS, comment or CFWS that occur between lexical tokens in a
    > structured field header are semantically interpreted as a single
    > space character.
    >
    > ------------------------------------------------------------------------


