IPP Mail Archive: Re: IPP>PRO: sorry, but binary is better

Robert Herriot (Robert.Herriot@Eng.Sun.COM)
Thu, 19 Jun 1997 18:38:55 -0700

Here is my version of GrabAttr for the protocol we chose on June 17th.
I think that it is competitive with the function you defined for the
binary protocol. Also, my ANSI C book states that atoi assumes decimal, but
recommends using strtol as I have done below.

int GrabAttr(ATTR *pAttr)
{
    char * pNext;
    int length;

    /* the name runs up to the first space */
    pNext = strchr(pBuf,' ');
    pAttr->nNameLength = pNext-pBuf;
    pAttr->pName = pBuf;
    pBuf = pNext + 1;

    /* decimal length, one space, then the value itself */
    length = strtol(pBuf,&pNext,10); /* ANSI C function */
    pAttr->nValLen = length;
    pNext++;
    pAttr->pVal = pNext;
    pBuf = pNext + length;
    return ERROR_OK;
}
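
To try the function on real input, a few things this message does not show have to be filled in. The sketch below assumes an ATTR layout, a global pBuf, a value for ERROR_OK, and a NUL-terminated text buffer (strchr and strtol both need the terminator). GrabAttrChecked and ERROR_BAD are hypothetical names, not part of the June 17th proposal. The function adds only the checks needed to reject a missing separator or a length that runs past the end of the buffer.

```c
#include <stdlib.h>
#include <string.h>

#define ERROR_OK  0            /* assumed value */
#define ERROR_BAD 1            /* hypothetical error code */

typedef struct {               /* assumed layout of ATTR */
    char *pName;               /* points into the buffer, not NUL-terminated */
    int   nNameLength;
    char *pVal;                /* points into the buffer, not NUL-terminated */
    int   nValLen;
} ATTR;

char *pBuf;                    /* current parse position (assumed global) */

/* Parses "name SP length SP value" at pBuf.
 * pBuf is advanced past the value only when the whole field is valid. */
int GrabAttrChecked(ATTR *pAttr)
{
    char *p = pBuf;
    char *pNext;
    long  length;

    pNext = strchr(p, ' ');
    if (pNext == NULL || pNext == p)
        return ERROR_BAD;                      /* no separator or empty name */
    pAttr->pName = p;
    pAttr->nNameLength = (int)(pNext - p);
    p = pNext + 1;

    length = strtol(p, &pNext, 10);            /* base 10: "010" is ten */
    if (pNext == p || *pNext != ' ' || length < 0)
        return ERROR_BAD;                      /* no digits or no separator */
    pNext++;
    if ((size_t)length > strlen(pNext))
        return ERROR_BAD;                      /* value runs off the buffer */

    pAttr->nValLen = (int)length;
    pAttr->pVal = pNext;
    pBuf = pNext + length;
    return ERROR_OK;
}
```

Calling it twice on "job-name 4 Specjob-originator 6 Sylvan" should yield the two attributes from the quoted examples and leave pBuf at the terminating NUL.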

As for wanting integer tokens for internal processing of keywords, I
would expect that an implementation might have a keywordToInt function
which would map keywords to integers that in turn could be used in
switch statements. So the strncmp/hashing issues would be kept in
the keywordToInt function.
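
As a rough illustration of what such a keywordToInt function could look like, here is a table-driven sketch. The enum values, the table contents, and the (pointer, length) signature are assumptions chosen to match the pName/nNameLength fields above. A linear scan stands in for whatever hashing a real implementation would prefer.

```c
#include <string.h>

enum Keyword {                     /* hypothetical integer tokens */
    KW_UNKNOWN = 0,
    KW_JOB_NAME,
    KW_JOB_ORIGINATOR
};

struct KeywordEntry {
    const char  *name;
    enum Keyword id;
};

static const struct KeywordEntry keywordTable[] = {
    { "job-name",       KW_JOB_NAME },
    { "job-originator", KW_JOB_ORIGINATOR }
};

/* Maps a keyword that is not NUL-terminated (pointer plus length)
 * to its integer token, or KW_UNKNOWN if it is not in the table. */
int keywordToInt(const char *pName, int nNameLength)
{
    size_t i;

    if (nNameLength < 0)
        return KW_UNKNOWN;
    for (i = 0; i < sizeof keywordTable / sizeof keywordTable[0]; i++) {
        const char *name = keywordTable[i].name;
        /* compare lengths first so a prefix never matches by accident */
        if (strlen(name) == (size_t)nNameLength &&
            memcmp(name, pName, (size_t)nNameLength) == 0)
            return keywordTable[i].id;
    }
    return KW_UNKNOWN;
}
```

The caller can then write switch (keywordToInt(attr.pName, attr.nNameLength)) and keep all string comparison in this one place.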

Comments?

Bob Herriot

> From SBUTLER@hpbs2024.boi.hp.com Thu Jun 19 18:06:36 1997
>
> I just sent this to the IPP reflector, but forgot to put the CC in to
> you folk. It appears customary to address changes to the most
> recent active participants, and I thought you deserved a heads-up.
>
> I believe I gathered correct e-mail addresses for all attendees on
> June 17th; please forward this on to anyone who should see it directly
> rather than the reflector copy.
>
> ------- Forwarded Message Follows -------
>
> I'm sorry for the length of this missive (about 150 lines plus
> headers...) but it seems necessary.
>
> Last night I revised Paul's document to indicate what we had concluded
> on the 17th, and this morning I woke up way too early... Or perhaps it
> was way too late, depending on your perspective.
>
> A binary encoding is MUCH simpler. Even if limited to just the lengths.
> For example, with 16-bit binary lengths (FAIRLY COMPLETE CODE):
>
> // assumes enough incoming buffer to hold entire name and ValLen fields
> // requires external help to deal with long or multiple values
> int GrabAttr(ATTR *pAttr)
> {
> nLength=ntohs((unsigned short)*(U16 *)pBuf);
> pBuf+=2;
> pAttr->nNameLength=nLength;
> pAttr->pName=pBuf;
> pBuf+=nLength;
> nLength=ntohs((unsigned short)*(U16 *)pBuf);
> pBuf+=2;
> pAttr->nValLen=nLength;
> pAttr->pVal=pBuf;
> pBuf+=nLength;
> return ERROR_OK;
> }
>
> vs. an n-digit ASCII length (NOT CODE, not even pseudo-c):
>
> // assumes enough incoming buffer to hold entire name and ValLen fields
> // requires external help to deal with long or multiple values
> int GrabAttr(ATTR *pAttr)
> {
> // str functions don't work because we aren't null-terminated
> so we normalize all input buffers? (null terminate ... ?)
> // or just write our own strtok?
> strtok(ON A COPY of the string)
> // so we can remember the terminating char, either <SP> or <CR>...
> pAttr->pName=pToken;
> pBuf+=strlen(pToken)+1;
> strtok()
> pBuf+=strlen(pToken)+1;
> nLength=private_atoi (because leading 0's would assume octal)
> pAttr->nValLen=nLength;
> pAttr->pVal=pBuf;
> pBuf+=nLength;
> return pBuf;
> }
> int private_atoi(char *)
> {
> do it
> }
> void normalize() // or private_strtok
> {
> do it
> }
>
> Both of these examples will require a bunch of strncmp's in order to
> actually do anything, because what I've drawn up isn't even as binary
> as Paul and I had in the SWP documents (even June 6). If you reduce
> Operation and Attribute names to an enum'd set then all those strcmp's
> go away and become a simple '==' or even a switch (YES!).
>
> I think we need to face reality here... Binary requires significantly
> less code to deal with, which means fewer bugs and less testing, which
> means more solid implementations sooner. (Anybody ever run into a web
> site that used atoi and interpreted numbers as octal? I was just reading
> about one last week...). What do we get with ASCII? My list of "pros"
> from the meeting is really short. The biggest I've been able to
> identify is vendor extensibility.
>
> For extensibility we could reserve the upper 0x8000 values (or maybe fewer).
> In fact, for attributes we could assign the "all ones" enum and have the
> vendor use the first 4+ bytes of the value as their unique attribute
> name/ID in order to minimize collisions.
>
> Unless someone has parsing for ASCII all worked out and can illustrate
> that it really isn't much more code, I have to be in the binary camp.
> IBM's triplets anyone? (Roger?)
>
> The following three examples show an encoding of a
> Print-Job operation with attributes
> job-name=="Spec"
> job-originator=="Sylvan", and
> (a multi-value vendor specific hypothetical attribute)
> vendorHWP-BLD-ID=="Alpha",32766
>
> In all examples the bytes on the wire are specified in hex (two
> characters) or ASCII (one character). The spaces between bytes and the
> line wrapping are not transmitted. {print data} is the actual data for
> the job and the literal phrase is not transmitted.
>
> -----------
> Example 1, ASCII (June17):
>
> 0 1 0 0 P r i n t - J o b 0d 0a
> j o b - n a m e 20 4 20 S p e c j o b - o r i g i n a
> t o r 20 6 20 S y l v a n H W P - B L D - I D 20 5 20 A
> l p h a 20 5 20 3 2 7 6 6 0d 0a
> {print data}
>
> -----------
> Example 2, ASCII with binary (version is fixed) lengths:
>
> 0 1 0 0 00 09 P r i n t - J o b 00 08
> j o b - n a m e 00 04 S p e c 00 0e j o b - o r i g i n
> a t o r 00 06 S y l v a n 00 0a H W P - B L D - I D 00 05
> A l p h a 00 00 00 05 3 2 7 6 6
> {print data}
>
> -----------
> Example 3, binary (SWP, IBM's triplets...)
> (extra spaces, line wraps, and comments added for clarity):
>
> 00 01 00 00 ; version 01.00
> 00 01 ; Print-Job
> 00 01 ; job-name
> 00 04 S p e c ; length and value
> 00 02 ; job-originator
> 00 06 S y l v a n ; length and value
> ff ff ; vendor--see 4+ bytes of value
> 00 0a H W P - B L D - I D ; length and value1
> 00 00 ; additional value for prev attribute
> 00 05 A l p h a ; length and value2
> 00 00 ; additional value
> 00 02 7f fe ; length and value3
> {print data}
>
> -----------
>
> Going from example1 to example2 eliminates scanning and tokenizing.
>
> Going from example2 to example3 eliminates strcmp's to match names
> of attributes and operations.
>
> I vote to go all binary, e.g. #3. (Note though, that I encoded the HP
> attribute name in ASCII whereas in real life I'd probably pick more like:
> H W P xx yy zz
> where hex yyzz would increment but the rest would remain static for all
> the attributes that I create. I probably wouldn't do the multi-valued
> attribute either, rather I'd just append onto the first value since the
> prefix length is fixed.)
>
> I apologize for having to do this, but evidently my implementor hat
> wasn't fitting very well on June 17th.
>
> Thanks for your attention,
>
> sdb
>
> | Sylvan Butler | sbutler@boi.hp.com | AreaCode 208 Phone/TelNet 396-2282 |
>
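
One portability note on the quoted binary GrabAttr: the cast *(U16 *)pBuf reads through a pointer that may not be 2-byte aligned, which some processors do not allow. A byte-at-a-time read avoids both that and the need for ntohs. The sketch below is only an illustration. ReadU16BE and GrabField are hypothetical names, and the explicit end pointer is an assumption added so that a truncated message is detected instead of overrun.

```c
#include <stddef.h>

/* Reads a big-endian (network order) 16-bit value one byte at a time,
 * so the result does not depend on alignment or host byte order. */
unsigned ReadU16BE(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | (unsigned)p[1];
}

/* Reads one "16-bit length, value" field starting at *ppBuf.
 * On success stores the value pointer and length, advances *ppBuf past
 * the value, and returns 0. Returns -1 if the field would run past pEnd. */
int GrabField(const unsigned char **ppBuf, const unsigned char *pEnd,
              const unsigned char **ppVal, unsigned *pnLen)
{
    const unsigned char *p = *ppBuf;
    unsigned n;

    if ((size_t)(pEnd - p) < 2)
        return -1;                 /* not enough bytes for the length */
    n = ReadU16BE(p);
    p += 2;
    if ((size_t)(pEnd - p) < n)
        return -1;                 /* value is truncated */

    *ppVal = p;
    *pnLen = n;
    *ppBuf = p + n;
    return 0;
}
```

Run over the bytes 00 04 S p e c 00 02 7f fe from Example 3, the first call should yield the 4-byte value "Spec", and the second a 2-byte value that ReadU16BE decodes as 32766.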