<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 10 (filtered)">
<title>Message</title>
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Impact;
        panose-1:2 11 8 6 3 9 2 5 2 4;}
@font-face
        {font-family:sans-serif;
        panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline;}
p
        {margin-right:0in;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman";}
span.EmailStyle18
        {font-family:Arial;
        color:navy;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
        {page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>But one is going to have to modify the
existing writers in order to be PDF/is compliant, so at least this argument
shouldn’t be an issue.</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal style='margin-left:.5in'><font size=2 face=Tahoma><span
style='font-size:10.0pt;font-family:Tahoma'>-----Original Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> Poysa, Kari
[mailto:Kari.Poysa@usa.xerox.com] <br>
<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, March 12, 2003
6:04 AM<br>
<b><span style='font-weight:bold'>To:</span></b> Hastings, Tom N; 'Rick
Seeler'; 'Carl Kugler'<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Tom, the Length being
discussed here is actually the byte count of the stream of each Image
XObject that belongs to the Page. So if the Page is composed of
more than one image (a.k.a. banding), then the sender does not need to cache
even a full page's worth of compressed data in order to write each
Image XObject's stream length in its stream dictionary.</span></font></p>
</div>
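<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>A minimal sketch of the
banding point, assuming Flate (zlib) compression and 8-bit gray scanlines. The
function name write_banded_image, the band_rows parameter and the object
numbering are hypothetical, chosen only for illustration. It shows that the
writer buffers one band of compressed data at a time and can still emit a
direct Length for every Image XObject.</span></font></p>
</div>

```python
import io
import zlib

def write_banded_image(out, rows, band_rows, first_obj_num=10):
    """Write each band as its own Image XObject with a direct /Length.

    `rows` is a list of equal-length bytes objects, one per scanline.
    Only one band of compressed data is buffered at a time.
    Returns the largest compressed band size (the peak buffer needed).
    """
    peak = 0
    width = len(rows[0])
    for i, start in enumerate(range(0, len(rows), band_rows)):
        band = rows[start:start + band_rows]
        data = zlib.compress(b"".join(band))   # buffer just this band
        peak = max(peak, len(data))
        header = (
            f"{first_obj_num + i} 0 obj\n"
            f"<< /Type /XObject /Subtype /Image /Width {width} "
            f"/Height {len(band)} /ColorSpace /DeviceGray "
            f"/BitsPerComponent 8 /Filter /FlateDecode "
            f"/Length {len(data)} >>\nstream\n"
        ).encode("ascii")
        out.write(header + data + b"\nendstream\nendobj\n")
    return peak

# Illustrative data: 100 scanlines of 64 pixels, written as 4 bands.
rows = [bytes((x * y) % 256 for x in range(64)) for y in range(100)]
out = io.BytesIO()
peak = write_banded_image(out, rows, band_rows=25)
```

<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Smaller bands lower the
peak buffer at the cost of more objects per page.</span></font></p>
</div>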
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Full PDF allows the
writer to enter an indirect object reference into the required Length entry.
This makes it easy to implement writers because the separate object for the
length can be written after all of the image data has been written. The PDF
files are then read in the reverse order starting from the end of the file.
This works well if one has a file system to store the complete PDF file.
So requiring the Length to be a direct value in the stream dictionary would most
likely force existing writer software to be modified. One could
not keep writing the same kind of files and claim that they are PDF/is compliant.</span></font></p>
</div>
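<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>The two ways of writing
the Length entry can be sketched as follows. This is an illustration only: the
helper names write_stream_direct and write_stream_indirect and the object
numbers are assumptions, and the dictionaries omit every entry except Length.</span></font></p>
</div>

```python
import io

def write_stream_direct(out, obj_num, data):
    """Length is a direct value: the whole stream must be known up front."""
    out.write(f"{obj_num} 0 obj\n<< /Length {len(data)} >>\nstream\n".encode("ascii"))
    out.write(data)
    out.write(b"\nendstream\nendobj\n")

def write_stream_indirect(out, obj_num, len_obj_num, chunks):
    """Length is an indirect reference: data is streamed chunk by chunk and
    the length object is written once the byte count is known."""
    out.write(f"{obj_num} 0 obj\n<< /Length {len_obj_num} 0 R >>\nstream\n".encode("ascii"))
    total = 0
    for chunk in chunks:
        out.write(chunk)            # nothing is cached by the writer
        total += len(chunk)
    out.write(b"\nendstream\nendobj\n")
    out.write(f"{len_obj_num} 0 obj\n{total}\nendobj\n".encode("ascii"))

direct, indirect = io.BytesIO(), io.BytesIO()
write_stream_direct(direct, 11, b"abcdef")
write_stream_indirect(indirect, 11, 12, [b"abc", b"def"])
```

<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>The indirect form never
holds more than one chunk in memory, which is why existing writers favor it.</span></font></p>
</div>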
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'> ---
Kari ---</span></font></p>
</div>
<blockquote style='margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> Hastings, Tom N <br>
<b><span style='font-weight:bold'>Sent:</span></b> Tuesday, March 11, 2003 5:49
PM<br>
<b><span style='font-weight:bold'>To:</span></b> Poysa, Kari; 'Rick Seeler';
'Carl Kugler'<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Kari,</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>I think you summed up the
tradeoff between the Sender and the Receiver simply when you
said:</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>"If we require the
reader to be able to cache a page's worth of uncompressed data, surely we can
require the writer to cache a page's worth of compressed data [in order to
determine the length and send that length in the stream]."</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>I assume that PDF has the
notion of a length for each page, right? So we require that the Sender
put in a length field for each page of data at the front of each page of
data. Can that length field be sent with the data in some manner, so that
the Sender doesn't have to know the lengths of all of the pages before sending
any?</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Tom</span></font></p>
</div>
<blockquote style='margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> Poysa, Kari
[mailto:Kari.Poysa@usa.xerox.com]<br>
<b><span style='font-weight:bold'>Sent:</span></b> Friday, March 07, 2003 15:04<br>
<b><span style='font-weight:bold'>To:</span></b> 'Rick Seeler'; 'Carl Kugler'<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Rick, I bet this solution
can be implemented, but it does have some problems for the reader that
unfortunately I did not see earlier. The difficulty really is whether we want
to make life easy for the streaming writer or the reader. </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>If the length follows the
image stream, the reader must scan the filtered stream to find the end of the
stream. This can make the reader implementation both cumbersome and slow,
especially if the stream has to be fully decoded during the PDF file parsing,
instead of simply extracting the correct amount of binary data and passing it
to a separate decompression module. The PDF file parser would have to know
details of the compressed streams which should really be of no interest to the
PDF file parser module and makes creating applications from 3rd party
components harder.</span></font></p>
</div>
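<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>For contrast, a sketch of
the simple case where the Length precedes the stream. The function
read_stream_with_length is a hypothetical, deliberately minimal parser that
assumes the dictionary sits on one line; it is not a general PDF parser. The
point is that the payload is extracted by byte count alone, without inspecting
or decoding it.</span></font></p>
</div>

```python
import io
import re

def read_stream_with_length(f):
    """Read one stream whose dictionary carries a direct /Length.

    Assumes a one-line dictionary followed by a 'stream' line.
    """
    dict_line = f.readline()
    m = re.search(rb"/Length\s+(\d+)(\s+\d+\s+R)?", dict_line)
    if m is None or m.group(2):
        raise ValueError("no direct /Length in stream dictionary")
    length = int(m.group(1))
    if f.readline().strip() != b"stream":
        raise ValueError("expected 'stream' keyword")
    return f.read(length)           # no scanning, no decoding

payload = b"abc\r\nendstream\r\nxyz!!"      # contains a misleading 'endstream'
obj = b"<< /Length %d >>\nstream\n" % len(payload) + payload + b"\nendstream\n"
extracted = read_stream_with_length(io.BytesIO(obj))
```

<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>The extracted bytes can
then be handed unchanged to a separate decompression module.</span></font></p>
</div>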
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>In addition, if the
reader attempts to decode the stream, how much data should be cached and
decoded at a time? If the end of stream is not found at first attempt, one has
to pass additional data to the decoder and continue decoding from where
previous data ended. This can delay achieving robust implementations. The
alternative, searching for the "endstream" text, is not 100% reliable
(although very close) and is a wasted step since no decompression is achieved
yet.</span></font></p>
</div>
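<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>A sketch of what
decoding to find the end of the stream involves, assuming the stream uses
Flate compression and using Python's zlib module. The function name
find_flate_stream_end and the 16-byte chunk size are illustrative assumptions.
Other filters would each need their own end-detection logic.</span></font></p>
</div>

```python
import zlib

def find_flate_stream_end(chunks):
    """Feed chunks to a zlib decoder until it reports end-of-stream.

    Returns (decoded_bytes, compressed_length). The parser has to run
    the decompressor just to learn where the stream ends.
    """
    decoder = zlib.decompressobj()
    decoded = bytearray()
    consumed = 0
    for chunk in chunks:
        decoded += decoder.decompress(chunk)
        consumed += len(chunk)
        if decoder.eof:
            consumed -= len(decoder.unused_data)  # bytes past the stream end
            return bytes(decoded), consumed
    raise ValueError("ran out of data before the end of the Flate stream")

raw = b"band data " * 500
compressed = zlib.compress(raw)
wire = compressed + b"\nendstream\nendobj\n12 0 obj\n"
chunks = [wire[i:i + 16] for i in range(0, len(wire), 16)]
decoded, length = find_flate_stream_end(chunks)
```

<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Here the file parser is
coupled to the decompressor, which is the dependency the message above argues
against.</span></font></p>
</div>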
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>This issue is really at
the heart of what "streamable" means, and also has a big impact on
what kind of low resource applications PDF/is can be used for. I think we
should consider it a "MUST" for the writer to prefix the stream with
its length, since the goal is to make the file format streamable especially at
a low resource reader. If we require the reader to be able to cache a page's
worth of uncompressed data, surely we can require the writer to cache a page's
worth of compressed data.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>I do understand Ira
McDonald's note about streaming writers (see separate email). Possibly the
question of whether to prefix or postfix image streams with their lengths should be a
negotiable capability between the sender and receiver?</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'> ---
Kari ---</span></font></p>
</div>
<blockquote style='margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> Rick Seeler
[mailto:rseeler@adobe.com]<br>
<b><span style='font-weight:bold'>Sent:</span></b> Thursday, March 06, 2003
2:37 PM<br>
<b><span style='font-weight:bold'>To:</span></b> 'Poysa, Kari'; 'Carl Kugler'<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Kari,</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Yes, the stream length
should precede the stream, if possible (this is allowed). But in the
case where the stream may be long, this may not be possible for the
Producer. In that case, the Length should be an indirect object reference
to a length object that comes immediately after the stream.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>As for your idea of
scanning for "endstream" followed by the size object:
this still has the same problem as scanning for "endstream" alone; it just
has more data to match and a smaller likelihood of a false match.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Given that, and what I
discussed in my previous e-mail on this subject (to Rob Buckley), I think the
best approach might be the following:</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>1) The Producer MUST
always write the stream length of all 'Content Streams' and 'ICC Profile'
streams immediately in the object dictionary (before the stream).</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>2) When writing
image streams, the Producer MAY either write the stream length before
or after the stream, as they prefer.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>3) When an image stream's
length follows the stream (as an indirect object), the Consumer SHOULD decode the image stream
to determine the stream length, when possible. But the
Consumer MAY (at their peril) scan for the 'endstream' marker.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>How does this sound as a
solution?</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<!-- Converted from text/plain format -->
<p style='margin-left:.5in'><font size=2 face="Times New Roman"><span
style='font-size:10.0pt'>-Rick</span></font></p>
<blockquote style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt;
margin-left:3.75pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> owner-ifx@pwg.org
[mailto:owner-ifx@pwg.org] <b><span style='font-weight:bold'>On Behalf Of </span></b>Poysa,
Kari<br>
<b><span style='font-weight:bold'>Sent:</span></b> Thursday, March 06, 2003
7:15 AM<br>
<b><span style='font-weight:bold'>To:</span></b> 'Carl Kugler'<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>In my opinion the goal
should be to write the stream length immediately to the stream dictionary. </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>Also, the likelihood of
"endstream" appearing in the data is small. We could also
require that if a low resource streaming writer is not able to add the length
directly into the stream dictionary, then the PDF object for the length MUST
immediately follow the stream object. This way, the reader can scan for
"endstream" (but of course only if the length was not in the stream
dictionary) and make sure that it is the correct "endstream" by
verifying that it is immediately followed by something that looks like a length
object. Could reader implementers comment on this?</span></font></p>
</div>
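<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>A sketch of the proposed
check, using the PDF keyword 'endstream' and a regular expression for
"something that looks like a length object". The pattern TRAILER and the
function find_stream_end are hypothetical. The extra test that the declared
length equals the number of bytes before the candidate marker is an added
assumption, not part of the proposal as stated.</span></font></p>
</div>

```python
import re

# 'endstream', 'endobj', then an object that holds a single integer.
TRAILER = re.compile(
    rb"\r?\nendstream\s+endobj\s+(\d+)\s+(\d+)\s+obj\s+(\d+)\s+endobj"
)

def find_stream_end(buf):
    """Return (payload, declared_length) using the heuristic, or None.

    `buf` starts at the first byte of stream data. A candidate is accepted
    only when the integer in the trailing object equals the number of
    bytes that precede the candidate 'endstream'.
    """
    for m in TRAILER.finditer(buf):
        declared = int(m.group(3))
        if declared == m.start():
            return buf[:m.start()], declared
    return None

payload = b"abc\r\nendstream\r\nxyz"        # contains a misleading 'endstream'
buf = (payload + b"\nendstream\nendobj\n12 0 obj\n"
       + str(len(payload)).encode("ascii") + b"\nendobj\n")
result = find_stream_end(buf)
```

<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>This lowers the chance of
a false match but does not remove it, since stream data could still contain a
byte sequence that satisfies the whole pattern.</span></font></p>
</div>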
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'>I think introducing an
additional filter like ASCII85 just for spotting the end of stream adds
unnecessary complexity to both writer and reader, increases file sizes and also
requires more memory and processing as the stream cannot be passed directly to
a decompressor.</span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</div>
<div>
<p class=MsoNormal style='margin-left:.5in'><font size=2 color=blue face=Arial><span
style='font-size:10.0pt;font-family:Arial;color:blue'> ---
Kari ---</span></font></p>
</div>
<blockquote style='margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=2 face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> Carl Kugler [mailto:kugler@us.ibm.com]<br>
<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, March 05, 2003
10:50 AM<br>
<b><span style='font-weight:bold'>Cc:</span></b> ifx@pwg.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> RE: IFX> PDF/is Issue.</span></font></p>
<p class=MsoNormal style='margin-right:0in;margin-bottom:12.0pt;margin-left:
.5in'><font size=3 face="Times New Roman"><span style='font-size:12.0pt'><br>
</span></font><font size=2 face=sans-serif><span style='font-size:10.0pt;
font-family:sans-serif'>I like the chunking approach. It is efficient,
reliable, and has low overhead for reasonably sized chunks. It also fits
well in a typical implementation that writes a chunk of data at a time.</span></font>
<br>
<br>
<font size=2 face=sans-serif><span style='font-size:10.0pt;font-family:sans-serif'>
-Carl</span></font> <br>
<br>
<br>
</p>
<table class=MsoNormalTable border=0 cellpadding=0 width="100%"
style='width:100.0%;margin-left:.5in'>
<tr>
<td valign=top style='padding:.75pt .75pt .75pt .75pt'>
<p class=MsoNormal><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'> </span></font></p>
</td>
<td valign=top style='padding:.75pt .75pt .75pt .75pt'>
<p class=MsoNormal><b><font size=1 face=sans-serif><span style='font-size:
7.5pt;font-family:sans-serif;font-weight:bold'>"Zehler, Peter"
&lt;PZehler@crt.xerox.com&gt;</span></font></b> <br>
<font size=1 face=sans-serif><span style='font-size:7.5pt;font-family:sans-serif'>Sent
by: owner-ifx@pwg.org</span></font> </p>
<p><font size=1 face=sans-serif><span style='font-size:7.5pt;font-family:
sans-serif'>03/05/2003 05:00 AM</span></font> </p>
</td>
<td valign=top style='padding:.75pt .75pt .75pt .75pt'>
<p class=MsoNormal><font size=1 face=Arial><span style='font-size:7.5pt;
font-family:Arial'> </span></font><br>
<font size=1 face=sans-serif><span style='font-size:7.5pt;font-family:sans-serif'>
To: "'Rick Seeler'"
&lt;rseeler@adobe.com&gt;, ifx@pwg.org</span></font> <br>
<font size=1 face=sans-serif><span style='font-size:7.5pt;font-family:sans-serif'>
cc: </span></font> <br>
<font size=1 face=sans-serif><span style='font-size:7.5pt;font-family:sans-serif'>
Subject: RE: IFX> PDF/is
Issue.</span></font> </p>
</td>
</tr>
</table>
<p class=MsoNormal style='margin-left:.5in'><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'><br>
<br>
<br>
</span></font><font size=2 color=blue face=Arial><span style='font-size:10.0pt;
font-family:Arial;color:blue'>Rick,</span></font> <br>
<font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:
Arial;color:blue'>Why not just increase the size of the length field signature?
Could this be done by the addition of data or comments in the length
object or by adding another object? I don't know PDF very well. I
don't think we need 0% probability of confusion, just a statistically
insignificant chance.</span></font> <br>
<font size=2 color=blue face=Arial><span style='font-size:10.0pt;font-family:
Arial;color:blue'>Pete</span></font> <br>
</p>
<p style='margin-left:.5in'><font size=3 face=Impact><span style='font-size:
12.0pt;font-family:Impact'>Peter Zehler</span></font> <font color=red><span
style='color:red'><br>
XEROX</span></font> <font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma'><br>
Xerox Architecture Center</span></font> <font size=2 face=Arial><span
style='font-size:10.0pt;font-family:Arial'><br>
Email: PZehler@crt.xerox.com</span></font> <font size=2 face=Arial><span
style='font-size:10.0pt;font-family:Arial'><br>
Voice: (585) 265-8755</span></font> <font size=2 face=Arial><span
style='font-size:10.0pt;font-family:Arial'><br>
FAX: (585) 265-8871 <br>
US Mail: Peter Zehler</span></font> </p>
<p style='margin-left:.5in'><font size=2 face=Arial><span style='font-size:
10.0pt;font-family:Arial'> Xerox Corp.</span></font>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'><br>
800 Phillips Rd.</span></font> <font size=2
face=Arial><span style='font-size:10.0pt;font-family:Arial'><br>
M/S 128-30E</span></font> <font size=2 face=Arial><span
style='font-size:10.0pt;font-family:Arial'><br>
Webster NY, 14580-9701</span></font> </p>
<p style='margin-left:.5in'><font size=2 face=Tahoma><span style='font-size:
10.0pt;font-family:Tahoma'>-----Original Message-----<b><span style='font-weight:
bold'><br>
From:</span></b> Rick Seeler [mailto:rseeler@adobe.com]<b><span
style='font-weight:bold'><br>
Sent:</span></b> Tuesday, March 04, 2003 1:29 PM<b><span style='font-weight:
bold'><br>
To:</span></b> ifx@pwg.org<b><span style='font-weight:bold'><br>
Subject:</span></b> IFX> PDF/is Issue.<br>
</span></font><br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>During
prototyping of PDF/is the following problem arose:</span></font> <br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>How
does the Consumer know when the end of a data stream (See section 3.2.7 of
[pdf]) is reached? Normally, in a PDF, the Consumer would consult the
stream length field. The problem here is where to put the length field.
If the length were placed before the stream, the Consumer would know how
long the stream is. This requires the Producer to know the stream's length
before writing it to the Consumer. If, instead, the length were written
at the end of the stream, this would solve the Producer's problem but the
Consumer would not know how to find the length since they can't identify, 100%
of the time, where the stream ends and where the length object is.</span></font>
<br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>An
example will illustrate:</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>First,
the normal case...</span></font> <br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>stream</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre
(lots of binary data here)....</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>84trhdvfyu7wgf4.nbdrgur4uaru4gb</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>endstream</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>12 0
obj</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>3456
<- the length of the previous stream.</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>endobj</span></font>
<br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>But,
what if the data looked like this...</span></font> <br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>stream</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>sdljfiwefnwfubrevurewliysnhr;hgawebfz;h;uwre
(lots of binary data here)....</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>endstream
<- the binary data could have a
string of bytes that looked like this.</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>84trhdvfyu7wgf4.nbdrgur4uaru4gb</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>endstream</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>12 0
obj</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>4567
<- the length of the previous stream.</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>endobj</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'> </span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>Of
course, you could look at the bytes after the appearance of the word 'endstream' to
see if this is really the end of the stream; but one can always come up with a
stream that matches your parsing algorithm's expectations (although with a
decreasing probability of occurrence).</span></font> <br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>Possible
solutions:</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>1)
Write all data using ASCII85 encoding (See Section 3.3.2 of [pdf]). This
will increase stream lengths by 25%. ASCII85 has a stream delimiter which
would solve this problem -- the end of the stream can be known for certain and
the length field can be placed after the stream.</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>2)
Require the Producer to write the stream length before any stream (the streams
would stay binary). The Producer can use banding to break up large images
into small enough chunks so the Producer can cache the stream before sending.</span></font>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>3)
Offer a combination of 1 & 2. The Producer would cache streams if
possible, but may use ASCII85, if necessary.</span></font> <br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>4)
The Producer must make certain that no stream contains the byte sequence
"\0D\0Aendstream" in its stream data. This is how the spec is
currently defined -- but this may be too onerous for the Producer.</span></font>
<br>
<br>
<font size=2 face=Arial><span style='font-size:10.0pt;font-family:Arial'>Any
other ideas? I'm personally leaning toward solution #3.</span></font> <br>
</p>
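<p style='margin-left:.5in'><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>The 25% figure and the delimiter property of solution 1 can
be checked with a short sketch using Python's base64.a85encode. The sample data
and variable names are illustrative assumptions. The data avoids all-zero
4-byte groups, which ASCII85 abbreviates to a single 'z' and which would make
the expansion smaller.</span></font></p>

```python
import base64

data = bytes(range(1, 201)) * 5             # 1000 bytes, no all-zero groups
encoded = base64.a85encode(data) + b"~>"    # '~>' is the ASCII85 end-of-data marker
wire = encoded + b"\nendstream\nendobj\n"

end = wire.index(b"~>")                     # '~' never occurs inside ASCII85 data
recovered = base64.a85decode(wire[:end])
overhead = (end - len(data)) / len(data)    # 5 output chars per 4 input bytes
```

<p style='margin-left:.5in'><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Because the end-of-data marker cannot appear inside the
encoded data, the Consumer can locate the end of the stream with certainty, at
the cost of the larger stream.</span></font></p>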
<p style='margin-left:.5in'><font size=2 face="Times New Roman"><span
style='font-size:10.0pt'>-Rick</span></font> </p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</div>
</body>
</html>