IPP Mail Archive: Re: IPP>PRO: protocol problem, WAS: sorry, but binary is better

Re: IPP>PRO: protocol problem, WAS: sorry, but binary is better

Tom Hastings (hastings@cp10.es.xerox.com)
Wed, 25 Jun 1997 01:06:06 PDT

It seems to me that parsing the data directly out of a buffer is too
unreliable, and/or forces us to add CRLF (with all its variations, and/or
transformations that might invalidate the length in length-prefixed
strings), to be considered a robust way to program IPP. The number of
attributes and the speed requirements are not sufficient to justify
attempting such a risky optimization. And the performance improvement
couldn't possibly be measured with respect to a single POST, let alone the
document data. Printers can't print jobs any faster than they can print
pages. Even a 240 page per minute printer can't be accepting jobs
any faster than 4 per second! And the faster the printer, the longer the
jobs.
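
For what it's worth, the arithmetic behind the 4-per-second figure can be
spelled out as below. This is only an illustration: the function name and
the pages_per_job parameter are made up here, and the bound assumes a
printer accepts jobs no faster than it can print them.

```python
def max_jobs_per_second(pages_per_minute, pages_per_job=1):
    """Upper bound on job acceptance rate for a printer of a given speed.

    Assumption (for illustration only): every job has at least
    `pages_per_job` pages, so the printer cannot finish jobs faster
    than it prints pages.
    """
    pages_per_second = pages_per_minute / 60
    return pages_per_second / pages_per_job


# A 240 page-per-minute printer fed nothing but one-page jobs.
print(max_jobs_per_second(240))
```

With longer jobs the bound only drops, which is the point about faster
printers tending to get longer jobs.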

We've probably spent more human time on this issue than will be saved
by all the implementations of IPP.

Simply copy characters from the buffer and form proper tokens. If a token
crosses a buffer boundary, advance to the next buffer and continue copying
characters to form the token.
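
A minimal sketch of that copying loop, in Python for brevity. The function
name, the choice of space/CR/LF as delimiters, and the use of an iterable
of byte buffers are all assumptions made for illustration, not anything
from the IPP drafts.

```python
def tokens_from_buffers(buffers, delimiters=b" \r\n"):
    """Yield complete tokens from an iterable of byte buffers.

    Bytes are copied one at a time into a pending token. A token that is
    cut by a buffer boundary simply continues in the next buffer, so the
    caller never sees a partial token.
    """
    token = bytearray()
    for buf in buffers:
        for byte in buf:
            if byte in delimiters:
                if token:               # a delimiter closes the pending token
                    yield bytes(token)
                    token.clear()
            else:
                token.append(byte)      # copy the character into the token
    if token:                           # last token had no trailing delimiter
        yield bytes(token)


# The first token is split across two buffers, the last across two more.
for tok in tokens_from_buffers([b"job-na", b"me foo\r\nba", b"r"]):
    print(tok)
```

The cost is one copy per character, which at a few jobs per second is not
worth worrying about.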

The advantages of text tokens have been expressed by Larry (promoting private
to public attributes, and debugging), and the advantage of length-prefixed
strings is extensibility to more complex data types, such as sets of
attributes, without impacting the V1.0 spec and V1.0 implementations.

Let's keep text and length-prefixed strings, and not attempt to parse
buffers.

Tom

At 14:37 06/23/97 PDT, Robert Herriot wrote:
>After thinking about the partial buffer problem with respect to the
>three possible encodings, the ONLY ENCODING THAT SEEMS TO HAVE A PROBLEM
>IS THE ONE WE CHOSE. That is why we need to continue the discussion.
>
>In my last email I described the algorithm for the HTTP-like version
>which fills a buffer up to CRLF, thus avoiding buffer overlap issues.
>
>With the binary encoding, I would expect that it would fread the two
>byte length (call it n) and then fread n bytes of name/value in order
>to avoid messy buffer overlaps. It would do this twice for each
>parameter. So the reading of bytes would be intimately associated with
>the parsing. But Sylvan's GrabAttr seems to assume a buffer that
>already has the name and the attribute. How does Sylvan do this
>without parsing the two lengths both at the high level and in GrabAttr?
>
>Sylvan, could you give more details about how you handle the buffer
>overlap problem with binary encoding? I have suggested a method
>above, but you seem to have some other way.
>
>Bob Herriot
>
>> From SBUTLER@hpbs2024.boi.hp.com Mon Jun 23 09:04:34 1997
>> From: "Sylvan Butler" <SBUTLER@hpbs2024.boi.hp.com>
>> X-Real-Sender: SYLVAN
>> Organization: Hewlett-Packard, Boise
>> To: Robert.Herriot@Eng (Robert Herriot)
>> Date: Mon, 23 Jun 1997 10:02:48 -0700
>> Subject: Re: IPP>PRO: sorry, but binary is better
>> CC: ipp@pwg.org
>> Priority: normal
>> X-mailer: Pegasus Mail v3.31
>> Content-Length: 2098
>> X-Lines: 53
>>
>> >This buffer problem exists for both the current IPP proposal and for
>> >the previous binary proposal that you prefer. The two byte binary
>> >integer could span buffers and a parameter name or value buffer could
>>
>> The upper layer can trivially ensure that the buffer contains X bytes
>> of data, where X is large enough to hold at least a two byte attribute
>> name and a two byte value length. Once the lengths are known then
>> the proper amount for values is easy to collect.
>>
>> With an ASCII encoding the upper layer has to guess, or check for
>> proper terminators.
>>
>> >buffers seems rather messy to me because the break can occur during any
>> >element. It seems like a good place to generate lots of occasional bugs.
>>
>> Yes, except with binary it is not messy at all.
>>
>> >putting a " " (space character) just after the last character in the
>> >buffer to stop the scan-for-space algorithm.
>>
>> That would work, if you must scan.
>>
>> >You also observed that with CRLF and a maximum line length, it is easy
>> >to ensure that a buffer has a full unit because functions, such as fgets
>> >read up to the next CRLF. Perhaps we need to revisit the solution that
>>
>> Unless you have to interoperate with implementations that visually
>> checked (instead of looking at a sniffer hex dump) for "lines". Like
>> with HTTP servers today, even though the spec reads CRLF you still
>> find CR (from mac?), LF (from unix?) and the desired CRLF (I wouldn't
>> be surprised to find some LFCR's out there).
>>
>> > CR in a value is represented by =0D
>> > LF in a value is represented by =0A
>> > = in a value is represented by =3D
>>
>> You will probably need to escape anything less than %x20 and perhaps
>> anything greater than %x7E.
>>
>> In fact, instead of developing our own escape rules we should, as you
>> probably did, just pick one.
>>
>> This is just so ugly, considering the purpose is for two computers to
>> talk to each other over a link that is 8-bit clean.
>>
>> >Perhaps this pure textual solution is easier to program once you consider
>> >the buffer problem.
>>
>> I don't find it so.
>>
>> sdb
>>
>> | Sylvan Butler | sbutler@boi.hp.com | AreaCode 208 Phone/TelNet 396-2282 |
>>
>
>
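
The read pattern Bob describes for the binary encoding (read a two-byte
length n, then read n bytes, twice per parameter) might look roughly like
the sketch below. The function names are hypothetical, big-endian byte
order for the length is an assumption, and the stream is assumed to be a
blocking one; none of this is taken from Sylvan's GrabAttr.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes, looping over short reads; raise on early EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_field(stream):
    """Read one length-prefixed field: a two-byte length, then that many bytes."""
    (n,) = struct.unpack("!H", read_exact(stream, 2))  # assumed big-endian
    return read_exact(stream, n)


def read_parameter(stream):
    """Read a name field followed by a value field, as in Bob's description."""
    name = read_field(stream)
    value = read_field(stream)
    return name, value


# Usage with an in-memory stream standing in for the connection.
wire = struct.pack("!H", 8) + b"job-name" + struct.pack("!H", 3) + b"foo"
print(read_parameter(io.BytesIO(wire)))
```

Because every read asks for a known byte count, a field that spans two
network buffers is handled inside read_exact and never reaches the parser.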