IFX Mail Archive: IFX> PDF/is Issue.

IFX> PDF/is Issue.

From: Rick Seeler (rseeler@adobe.com)
Date: Tue Mar 04 2003 - 13:28:38 EST


    During prototyping of PDF/is the following problem arose:
     
    How does the Consumer know when the end of a data stream (See section 3.2.7 of
    [pdf]) is reached? Normally, in a PDF, the Consumer would consult the stream
    length field. The problem here is where to put the length field. If the length
    were placed before the stream, the Consumer would know how long the stream is.
    This requires the Producer to know the stream's length before writing it to the
    Consumer. If, instead, the length were written at the end of the stream, this
    would solve the Producer's problem, but the Consumer could not reliably find
    the length, because it cannot always tell where the stream ends and where the
    length object begins.
     
    An example will illustrate:
    First, the normal case...
     
    stream
    sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary data here)....
    84trhdvfyu7wgf4.nbdrgur4uaru4gb
    endstream
    12 0 obj
    3456 <- the length of the previous stream.
    endobj
     
    But, what if the data looked like this...
     
    stream
    sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary data here)....
    endstream <- the binary data could contain a string of bytes that looks like this.
    84trhdvfyu7wgf4.nbdrgur4uaru4gb
    endstream
    12 0 obj
    4567 <- the length of the previous stream.
    endobj
     
    Of course, you could examine the bytes that follow the word 'endstream' to
    check whether this is really the end of the stream, but there will always be
    some stream whose data matches whatever your parsing algorithm expects
    (although with decreasing probability as the check gets stricter).
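
    To make the failure concrete, here is a minimal Python sketch of a Consumer
    that stops at the first end marker it sees. The framing and the names
    build_stream and naive_extract are made up for illustration and are not
    taken from the PDF/is draft.

```python
EOL = b"\r\n"
MARKER = EOL + b"endstream"


def build_stream(payload: bytes) -> bytes:
    """Hypothetical framing: stream data first, length object afterwards."""
    length_obj = b"12 0 obj" + EOL + str(len(payload)).encode("ascii") + EOL + b"endobj" + EOL
    return b"stream" + EOL + payload + MARKER + EOL + length_obj


def naive_extract(wire: bytes) -> bytes:
    """Return what a Consumer recovers if it stops at the first end marker."""
    start = len(b"stream" + EOL)
    end = wire.find(MARKER, start)
    return wire[start:end]


safe = b"\x01\x02binary\x03"
tricky = b"\x01\x02" + MARKER + b"\x03\x04"  # payload happens to contain the marker

print(naive_extract(build_stream(safe)) == safe)      # the normal case round-trips
print(naive_extract(build_stream(tricky)) == tricky)  # the tricky case is cut short
```

    In the second case the Consumer recovers only the bytes before the embedded
    marker, and the length object that would have exposed the error comes too
    late to locate.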
     
    Possible solutions:
    1) Write all data using ASCII85 encoding (See Section 3.3.2 of [pdf]). This
    will increase stream lengths by about 25% (five ASCII characters for every
    four bytes). ASCII85 has an end-of-data marker ("~>") that cannot occur in
    the encoded data, which would solve this problem -- the end of the stream can
    be known for certain and the length field can be placed after the stream.
    2) Require the Producer to write the stream length before any stream (the
    streams would stay binary). The Producer can use banding to break up large
    images into small enough chunks so the Producer can cache the stream before
    sending.
    3) Offer a combination of 1 & 2. The Producer would cache streams if possible,
    but may use ASCII85, if necessary.
    4) Require the Producer to make certain that no stream contains the byte
    sequence "\0D\0Aendstream" in its data. This is how the spec is defined
    currently -- but this may be too onerous for the Producer.
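
    As a rough illustration of solution 3, the Python sketch below buffers a
    stream up to a limit and writes the length first if everything fits;
    otherwise it falls back to ASCII85 and relies on the "~>" marker. The
    function name emit_stream, the cache_limit parameter, and the exact
    dictionary framing are assumptions made for this example, not PDF/is
    syntax.

```python
import base64
from typing import Iterable, Iterator


def emit_stream(chunks: Iterable[bytes], cache_limit: int) -> Iterator[bytes]:
    """Yield the bytes a Producer would send for one stream (illustrative only)."""
    buffered = bytearray()
    it = iter(chunks)
    for chunk in it:
        buffered += chunk
        if len(buffered) > cache_limit:
            break  # too big to cache; fall through to ASCII85
    else:
        # Solution 2: the whole stream fit in the cache, so the length is known
        # up front and the data can stay binary.
        yield b"<< /Length %d >>\r\nstream\r\n" % len(buffered)
        yield bytes(buffered)
        yield b"\r\nendstream\r\n"
        return

    # Solution 1: encode as ASCII85, flushing only whole 4-byte groups so that
    # the concatenated output equals the encoding of the complete data.
    yield b"<< /Filter /ASCII85Decode >>\r\nstream\r\n"
    pending = buffered
    for chunk in it:
        whole = len(pending) - len(pending) % 4
        yield base64.a85encode(bytes(pending[:whole]))
        pending = pending[whole:]
        pending += chunk
    # '~' never occurs in ASCII85 output, so "~>" marks the end unambiguously.
    yield base64.a85encode(bytes(pending)) + b"~>\r\nendstream\r\n"
```

    A small stream such as emit_stream([b"abc", b"de"], 1000) takes the
    length-first path, while a stream larger than cache_limit is sent encoded
    without the Producer ever holding all of it in memory.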
     
    Any other ideas? I'm personally leaning toward solution #3.
     

    -Rick

     


