IFX Mail Archive: RE: IFX> PDF/is Issue.

From: Rick Seeler (rseeler@adobe.com)
Date: Thu Mar 06 2003 - 14:26:14 EST

    Rob,
    I agree.... Using ASCII85 is not desirable.

    One other option that I failed to list:

    We could leave the stream length as an indirect object reference to an object
    after the stream (as it is now) and require that the stream be decoded to
    determine its actual length. Since putting the length after the stream would
    only need to apply to streams that are encoded with CCITTFaxDecode, DCTDecode,
    or JBIG2Decode, this should work: decoding the image data determines the
    size of the stream data. A problem would occur, of course, if the Consumer did
    not understand the format of the image and wished to skip over the stream
    (though why would it do this?).

    -Rick
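
As a rough illustration of the "decode to find the length" option, here is a minimal Python sketch for the DCTDecode case only. It walks the JPEG marker structure until the EOI marker instead of searching for the `endstream` keyword. The function names and the hand-built sample bytes are hypothetical and not part of the PDF/is draft; CCITTFaxDecode and JBIG2Decode streams would need their own format-specific logic.

```python
def _skip_scan(buf: bytes, pos: int) -> int:
    """Skip entropy-coded data; return the offset of the next real marker."""
    while True:
        i = buf.find(b"\xff", pos)
        if i < 0 or i + 1 >= len(buf):
            raise ValueError("truncated JPEG scan data")
        nxt = buf[i + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos = i + 2          # stuffed 0xFF byte or restart marker: still scan data
        elif nxt == 0xFF:
            pos = i + 1          # fill byte in front of a marker
        else:
            return i             # any other marker ends the scan


def jpeg_codestream_length(buf: bytes, start: int = 0) -> int:
    """Return the byte length of the JPEG codestream beginning at `start`."""
    if buf[start:start + 2] != b"\xff\xd8":           # SOI marker
        raise ValueError("not a JPEG codestream")
    pos = start + 2
    while pos + 2 <= len(buf):
        if buf[pos] != 0xFF:
            raise ValueError("expected a marker")
        marker = buf[pos + 1]
        if marker == 0xFF:                             # fill byte
            pos += 1
            continue
        pos += 2
        if marker == 0xD9:                             # EOI: end of image
            return pos - start
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:   # TEM / RSTn have no length field
            continue
        if pos + 2 > len(buf):
            break
        pos += int.from_bytes(buf[pos:pos + 2], "big")  # skip the marker segment
        if marker == 0xDA:                             # SOS: scan data follows
            pos = _skip_scan(buf, pos)
    raise ValueError("truncated JPEG codestream")


# Hand-built marker skeleton (not a decodable image) that contains the
# bytes "\r\nendstream" both in a segment payload and in the scan data.
trap = b"\r\nendstream"
jpeg = (b"\xff\xd8"
        + b"\xff\xe0" + (2 + len(trap)).to_bytes(2, "big") + trap
        + b"\xff\xda" + (3).to_bytes(2, "big") + b"\x00"
        + b"\x12\xff\x00" + trap + b"\x34\xff\xd0\x56"
        + b"\xff\xd9")
data = jpeg + b"\r\nendstream\r\n12 0 obj\r\n"
length = jpeg_codestream_length(data)
```

In this sketch the embedded `endstream` bytes are ignored because the walk follows JPEG segment lengths and scan-data rules, so `length` covers exactly the codestream. It also shows the drawback noted above: a Consumer that does not understand the image format has no way to skip the stream.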

    > -----Original Message-----
    > From: Buckley, Robert R [mailto:RBuckley@crt.xerox.com]
    > Sent: Thursday, March 06, 2003 7:21 AM
    > To: 'Rick Seeler'
    > Cc: ifx@pwg.org
    > Subject: RE: IFX> PDF/is Issue.
    > Importance: High
    >
    >
    > Rick et al.,
    >
    > I would go with #2. In fact in our prototyping, we include
    > the codestream length explicitly in the image object, rather
    > than as an indirect object reference, to address the problem
    > you describe. I would not support #1, simply because it would
    > increase file size.
    >
    > Rob
    >
    > -----Original Message-----
    > From: Rick Seeler [mailto:rseeler@adobe.com]
    > Sent: Tuesday, March 04, 2003 1:29 PM
    > To: ifx@pwg.org
    > Subject: IFX> PDF/is Issue.
    >
    >
    > During prototyping of PDF/is the following problem arose:
    >
    > How does the Consumer know when the end of a data stream (See
    > section 3.2.7 of [pdf]) is reached? Normally, in a PDF, the
    > Consumer would consult the stream length field. The problem
    > here is where to put the length field. If the length were
    > placed before the stream, the Consumer would know how long
    > the stream is. This requires the Producer to know the
    > stream's length before writing it to the Consumer. If,
    > instead, the length were written at the end of the stream,
    > this would solve the Producer's problem but the Consumer
    > would not know how to find the length since they can't
    > identify, 100% of the time, where the stream ends and where
    > the length object is.
    >
    > An example will illustrate:
    > First, the normal case...
    >
    > stream
    > sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary
    > data here).... 84trhdvfyu7wgf4.nbdrgur4uaru4gb
    > endstream
    > 12 0 obj
    > 3456 <- the length of the previous stream.
    > endobj
    >
    > But, what if the data looked like this...
    >
    >
    > stream
    > sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary
    > data here)....
    > endstream <- the binary data could contain a string of bytes that looks like this.
    > 84trhdvfyu7wgf4.nbdrgur4uaru4gb
    > endstream
    > 12 0 obj
    > 4567 <- the length of the previous stream.
    > endobj
    >
    > Of course, you could look at the bytes that follow the
    > word 'endstream' to see if this is really the end of the
    > stream, but there is always some stream that would
    > match your parsing algorithm's expectations (although with
    > decreasing probability).
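
To make the failure concrete, here is a minimal Python sketch of a Consumer that simply searches for the `endstream` keyword. The helper name `naive_stream_end` and the sample bytes are made up for illustration; they are not taken from the PDF/is draft.

```python
EOD = b"\r\nendstream"


def naive_stream_end(buf: bytes, data_start: int) -> int:
    """Return the offset where a keyword-scanning Consumer thinks the data ends."""
    return buf.find(EOD, data_start)


# Binary payload that happens to contain the end-of-stream byte sequence.
payload = b"sdljfiwe" + EOD + b"84trhdvf"

# The stream, followed by the object that carries the real length.
header = b"stream\r\n"
obj = (header + payload + EOD
       + b"\r\n12 0 obj\r\n" + str(len(payload)).encode() + b"\r\nendobj\r\n")

guessed_length = naive_stream_end(obj, len(header)) - len(header)
actual_length = len(payload)
```

Here `guessed_length` stops at the embedded keyword and comes out shorter than `actual_length`, so the Consumer would then try to parse the rest of the binary data as PDF objects.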
    >
    > Possible solutions:
    > 1) Write all data using ASCII85 encoding (See Section 3.3.2
    > of [pdf]). This will increase stream lengths by 25%.
    > ASCII85 has a stream delimiter which would solve this problem
    > -- the end of the stream can be known for certain and the
    > length field can be placed after the stream.
    > 2) Require the Producer to write the stream length before any
    > stream (the streams would stay binary). The Producer can use
    > banding to break up large images into small enough chunks so
    > the Producer can cache the stream before sending.
    > 3) Offer a combination of 1 & 2. The Producer would cache
    > streams if possible, but may use ASCII85, if necessary.
    > 4) The Producer must make certain that no stream contains the
    > series of bytes "\0D\0Aendstream" in the stream data. This
    > is how the spec is defined currently -- but this may be too
    > onerous for the Producer.
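
For a feel of what solution 1 costs, here is a small Python sketch using the standard library's `base64.a85encode`. It is only an approximation of the PDF ASCII85Decode filter: the `~>` end-of-data marker is appended by hand, and the helper names are hypothetical.

```python
import base64


def ascii85_stream(raw: bytes) -> bytes:
    """Encode raw bytes as ASCII85 and terminate them with the '~>' marker."""
    return base64.a85encode(raw) + b"~>"


def ascii85_end(encoded: bytes) -> int:
    """Offset of the end-of-data marker; '~' never occurs in ASCII85 data."""
    return encoded.index(b"~>")


# Worst case for a keyword-scanning Consumer: the payload is full of
# "\r\nendstream" sequences (1100 bytes, none of them zero).
raw = b"\r\nendstream" * 100
encoded = ascii85_stream(raw)

end = ascii85_end(encoded)
overhead = end / len(raw)                      # encoded size / raw size
round_trip = base64.a85decode(encoded[:end])
```

Every 4 input bytes become 5 output characters, which is where the 25% figure comes from (runs of zero bytes can be shortened to `z`, so real data may do slightly better). Because `~` is outside the encoding alphabet, the Consumer can find the end of the data unambiguously and then read a length object that follows it.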
    >
    > Any other ideas? I'm personally leaning toward solution #3.
    >
    >
    > -Rick
    >
    >
    >
    >


