<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>Message</TITLE>
<META content="MSHTML 6.00.2715.400" name=GENERATOR></HEAD>
<BODY>
<DIV><SPAN class=280495411-05032003><FONT face=Arial color=#0000ff
size=2>Rick,</FONT></SPAN></DIV>
<DIV><SPAN class=280495411-05032003><FONT face=Arial color=#0000ff size=2>Why 
not just increase the size of the length field signature? Could this be 
done by adding data or comments to the length object, or by adding 
another object? I don't know PDF very well. I don't think we need a 0% 
probability of confusion, just a statistically insignificant 
chance.</FONT></SPAN></DIV>
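<DIV><SPAN class=280495411-05032003><FONT face=Arial color=#0000ff size=2>A 
rough way to put a number on "statistically insignificant": assuming, purely 
for illustration, that stream bytes are independent and uniformly random 
(real image data is not), the expected number of accidental occurrences of an 
m-byte signature in an n-byte stream is about (n - m + 1) / 256^m. The 
function name below is hypothetical; this is a back-of-the-envelope sketch, 
not part of any PDF/is proposal.</FONT></SPAN></DIV>

```python
def expected_false_matches(stream_len: int, marker_len: int) -> float:
    """Expected number of chance occurrences of one fixed marker.

    Assumption: every byte of the stream is independent and uniformly
    random, which real compressed or image data only approximates.
    """
    # Number of offsets at which the marker could start.
    positions = max(stream_len - marker_len + 1, 0)
    # Each offset matches with probability 256 ** -marker_len.
    return positions / 256.0 ** marker_len


if __name__ == "__main__":
    one_mib = 1 << 20
    # 11 bytes is CR LF followed by "endstream"; 16 and 24 are longer signatures.
    for marker_len in (11, 16, 24):
        print(marker_len, expected_false_matches(one_mib, marker_len))
```

<DIV><SPAN class=280495411-05032003><FONT face=Arial color=#0000ff size=2>Under 
that assumption, a 1 MiB stream and the 11-byte sequence CR LF "endstream" 
already give an expectation of roughly 3e-21, and each extra signature byte 
divides it by 256. The estimate says nothing about data that is structured 
or deliberately crafted.</FONT></SPAN></DIV>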
<DIV><SPAN class=280495411-05032003><FONT face=Arial color=#0000ff
size=2>Pete</FONT></SPAN></DIV>
<DIV> </DIV>
<UL>
<UL>
<UL>
<UL>
<P><FONT face=Impact>Peter Zehler</FONT> <BR><FONT face=XeroxPeopleNet
color=#ff0000>XEROX</FONT> <BR><FONT face=Tahoma size=2>Xerox
Architecture Center</FONT> <BR><FONT face=Arial size=2>Email:
PZehler@crt.xerox.com</FONT> <BR><FONT face=Arial color=#000000
size=2>Voice: (585) 265-8755</FONT> <BR><FONT
face=Arial color=#000000 size=2>FAX: (585)
265-8871</FONT><FONT face=Arial size=2> </FONT><BR><FONT face=Arial
size=2>US Mail: Peter Zehler</FONT>
<UL>
<P><FONT face=Arial size=2>
Xerox Corp.</FONT> <BR><FONT face=Arial
size=2> 800 Phillips
Rd.</FONT> <BR><FONT face=Arial
size=2> M/S 128-30E</FONT>
<BR><FONT face=Arial size=2>
Webster NY, 14580-9701</FONT> </P></UL></UL></UL></UL></UL>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader dir=ltr align=left><FONT face=Tahoma
size=2>-----Original Message-----<BR><B>From:</B> Rick Seeler
[mailto:rseeler@adobe.com]<BR><B>Sent:</B> Tuesday, March 04, 2003 1:29
PM<BR><B>To:</B> ifx@pwg.org<BR><B>Subject:</B> IFX> PDF/is
Issue.<BR><BR></FONT></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>During prototyping
of PDF/is the following problem arose:</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>How does the
Consumer know when the end of a data stream (See section 3.2.7 of
[pdf]) is reached? Normally, in a PDF, the Consumer would consult
the stream length field. The problem here is where to put the length 
field. If the length were placed before the stream, the Consumer would 
know how long the stream is, but the Producer would have to know the 
stream's length before writing it to the Consumer. If, instead, the 
length were written after the stream, this would solve the Producer's 
problem, but the Consumer could not reliably find the length, since it 
cannot identify with certainty where the stream ends and where the length 
object begins.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>An example will
illustrate:</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>First, the normal
case...</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>stream</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary data
here)....</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>84trhdvfyu7wgf4.nbdrgur4uaru4gb</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>endstream</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>12 0
obj</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial 
size=2>3456 &lt;- the length of the previous 
stream.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>endobj</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>But, what if the
data looked like this...</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>stream</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre (lots of binary data
here)....</FONT></SPAN></DIV>
<DIV><SPAN 
class=358045117-04032003>endstream 
&lt;- the binary data could contain a sequence of bytes that looks like 
this.</SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>84trhdvfyu7wgf4.nbdrgur4uaru4gb</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>endstream</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>12 0
obj</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial 
size=2>4567 &lt;- the length of the previous 
stream.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2>endobj</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003>Of course, you could look at the bytes after 
the appearance of the word 'endstream' to see if this is really the end of the 
stream; but there will always be some stream that happens to match your parsing 
algorithm's expectations (although with decreasing probability as more bytes 
are checked).</SPAN></DIV></FONT></SPAN></DIV>
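<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>The failure case 
above can be reproduced with a few lines of Python. This is only a sketch: 
the wire layout is a simplified stand-in for the example (stream keyword, 
data, end marker, trailing length object), and the function name is 
hypothetical.</FONT></SPAN></DIV>

```python
MARKER = b"\r\nendstream"


def naive_stream_end(buf: bytes, start: int = 0) -> int:
    """Return the offset of the first end marker at or after start, or -1."""
    return buf.find(MARKER, start)


# Binary payload that happens to contain the marker in the middle.
payload = b"\x00\x01binary" + MARKER + b"more binary\xff"

# Simplified wire image: the length object follows the stream.
header = b"stream\r\n"
trailer = b"\r\n12 0 obj\r\n" + str(len(payload)).encode("ascii") + b"\r\nendobj\r\n"
wire = header + payload + MARKER + trailer

# A Consumer that stops at the first marker truncates the stream.
found = naive_stream_end(wire, len(header))
guessed = wire[len(header):found]

if __name__ == "__main__":
    print("real length:   ", len(payload))
    print("guessed length:", len(guessed))
```

<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>The Consumer only 
learns the true length after it has already chosen a stopping point, so the 
length object cannot help it choose one.</FONT></SPAN></DIV>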
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>Possible
solutions:</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>1) Write all data 
using ASCII85 encoding (See Section 3.3.2 of [pdf]). This will increase 
stream lengths by about 25%. ASCII85 has an end-of-data delimiter (~&gt;) that 
cannot occur in the encoded data, which would solve this problem: the end of 
the stream can be known for certain and the length 
field can be placed after the stream.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>2) Require the
Producer to write the stream length before any stream (the streams would stay
binary). The Producer can use banding to break up large images into
small enough chunks so the Producer can cache the stream before
sending.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>3) Offer a 
combination of 1 and 2. The Producer would cache streams where possible, 
but could fall back to ASCII85 when necessary.</FONT></SPAN></DIV>
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>4) Require the 
Producer to make certain that no stream contains the byte sequence 
"\0D\0Aendstream" in its data. This is how the spec is currently 
defined, but it may be too onerous for the Producer.</FONT></SPAN></DIV>
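<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>To illustrate 
solution 1, here is a minimal sketch using Python's standard base64 module. 
It assumes the PDF convention of terminating ASCII85 data with the two 
characters "~" and "&gt;"; the helper names are hypothetical and nothing here 
is taken from the PDF/is draft.</FONT></SPAN></DIV>

```python
import base64

EOD = b"~>"  # end-of-data marker; "~" is outside the ASCII85 alphabet


def encode_stream(raw: bytes) -> bytes:
    """ASCII85-encode raw and append the end-of-data marker."""
    return base64.a85encode(raw) + EOD


def decode_stream(wire: bytes) -> tuple:
    """Split wire at the first EOD marker.

    Returns (decoded stream bytes, everything that followed the marker).
    """
    end = wire.index(EOD)
    return base64.a85decode(wire[:end]), wire[end + len(EOD):]


if __name__ == "__main__":
    # Data that would defeat a plain scan for the end marker.
    raw = b"\r\nendstream" + bytes(range(256))
    wire = encode_stream(raw) + b"\r\nendstream"
    decoded, rest = decode_stream(wire)
    print(decoded == raw, rest)
    # Four input bytes become five output characters.
    print(len(base64.a85encode(b"\x01" * 400)))
```

<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>Because the 
encoded text never contains "~", the Consumer can stop at the first marker 
without knowing the length in advance. The cost is the 4-to-5 expansion 
noted in solution 1.</FONT></SPAN></DIV>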
<DIV><SPAN class=358045117-04032003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
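<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>Solution 2 could 
look roughly like the following sketch. The framing is invented for 
illustration (an ASCII decimal length line before each band, and a zero 
length to finish) and is not PDF or PDF/is syntax; the function names and 
band size are likewise hypothetical.</FONT></SPAN></DIV>

```python
import io


def write_bands(out, image: bytes, band_size: int) -> None:
    """Send image as length-prefixed bands so only one band is cached."""
    for off in range(0, len(image), band_size):
        band = image[off:off + band_size]
        out.write(b"%d\n" % len(band))  # the length goes first
        out.write(band)                 # then the raw binary data
    out.write(b"0\n")                   # zero length: no more bands


def read_bands(inp) -> bytes:
    """Reassemble the bands without ever scanning for an end marker."""
    parts = []
    while True:
        n = int(inp.readline())
        if n == 0:
            return b"".join(parts)
        parts.append(inp.read(n))


if __name__ == "__main__":
    image = b"\r\nendstream" * 100 + bytes(range(256))
    buf = io.BytesIO()
    write_bands(buf, image, band_size=64)
    buf.seek(0)
    print(read_bands(buf) == image)
```

<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>The data stays 
binary and may contain any byte sequence, because the Consumer reads exactly 
as many bytes as each length announces. The Producer's buffer requirement is 
bounded by the band size rather than by the image size.</FONT></SPAN></DIV>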
<DIV><SPAN class=358045117-04032003><FONT face=Arial size=2>Any other
ideas? I'm personally leaning toward solution #3.</FONT></SPAN></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV><!-- Converted from text/plain format -->
<P><FONT size=2>-Rick<BR></FONT></P>
<DIV><FONT face=Arial size=2></FONT> </DIV></BLOCKQUOTE></BODY></HTML>