IPP> PRO - A value-type field for the Protocol Spec proposa

Tue Jun 17 02:47:59 EDT 1997

This mail summarizes the advantages of adding a data type code to our
encoding.  This proposal assumes that we are still using UTF8 for
representing all attribute values, even if we have a type code.

Each advantage is a small gain, but putting all the advantages together
makes a significant enough gain to make it worth including in IPP.

Advantages:

1. Makes it easier to separate the protocol parser from the code that
takes action on each attribute received.

2. True for a client parsing a response, as well as a server parsing a request.

3. The parser need only know about the data types, not about which
attributes are supported.

4. Much easier to make an adaptable client that queries a new Printer and
displays the user's choices for attributes that the client has never seen
before (only the data type code need be recognized).

5. The parser should be faster and smaller, since it only has to know
about the data types, not all the attributes.  The parser then provides
a "framework" into which an implementor can plug in an action routine for
each attribute.

6. Instead of having several attributes that differ only in data type, a
single attribute can be defined that takes several data types.  For example,
there could be a dateAndTime data type and just a Time data type.  Only one
attribute need be defined that supports both data types.  Another example
would be inches and millimeters.  Or the IPP "printer-speed" attribute which
has five built-in units.  Or an attribute that can either take a value or
be a URI to the value.  The actual data type code in the attribute instance
would indicate whether the data type was, say, text or URI.

7. One data type can be a "setOfAttributes". An implementation that doesn't
implement this data type could still skip over the entire value.  Then we
don't need to keep inventing ad hoc data types for attributes whose values
are structures.  The "setOfAttributes" data type will suffice for all
such structures.  IPP 2.0 and future registrations will need such a grouping
mechanism for some attributes.

8. Another data type can be an "indirect pointer" to the value of another
attribute when it is important that two attributes keep the same value.
This feature is probably more important for IPP V2.0 for administrative
functions, where the administrator wants several attributes to share the
same values.
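
To make advantages 1 through 5 concrete, here is a minimal Python sketch of
a parser that knows only data types and leaves attribute handling to
plugged-in action routines.  Everything in it is an assumption made for
illustration: the field layout (two-octet big-endian name-length, one-octet
value-type, two-octet value-length), the helper names, and the choice of
type codes 2 and 11, which follow the registry suggested in the quoted
message below.

```python
import struct

# Hypothetical one-octet type codes (values follow the registry suggested
# later in this thread: 2 = text, 11 = integer).
TYPE_TEXT = 2
TYPE_INTEGER = 11

# One decoder per data type; the parser never looks at attribute names.
DECODERS = {
    TYPE_TEXT: lambda raw: raw.decode("utf-8"),
    # Values stay in UTF8, so an integer arrives as decimal digits.
    TYPE_INTEGER: lambda raw: int(raw.decode("utf-8")),
}


def encode_attribute(name, type_code, raw):
    """Assumed layout: name-length(2) name value-type(1) value-length(2) value."""
    name_bytes = name.encode("utf-8")
    return (struct.pack(">H", len(name_bytes)) + name_bytes
            + struct.pack(">BH", type_code, len(raw)) + raw)


def parse_attributes(data):
    """Yield (name, type_code, value) for each attribute in data."""
    offset = 0
    while offset < len(data):
        (name_len,) = struct.unpack_from(">H", data, offset)
        offset += 2
        name = data[offset:offset + name_len].decode("utf-8")
        offset += name_len
        type_code, value_len = struct.unpack_from(">BH", data, offset)
        offset += 3
        raw = data[offset:offset + value_len]
        offset += value_len
        decoder = DECODERS.get(type_code)
        # An unrecognized type code is passed through as raw octets.
        value = decoder(raw) if decoder else raw
        yield name, type_code, value


def dispatch(data, actions):
    """Run the action routine plugged in for each attribute name, if any."""
    results = {}
    for name, _type_code, value in parse_attributes(data):
        action = actions.get(name)
        if action is not None:
            results[name] = action(value)
    return results
```

An implementor would extend only the actions table when a new attribute is
supported; parse_attributes itself changes only when a new data type is
registered.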

Tom

At 10:54 06/06/97 PDT, Tom Hastings wrote:
>I agree with Steve's suggestion to add a data type code to the encoding
>of attributes in IPP and Brian's feedback.
>
>We have concluded an internal review of IPP at Xerox, and adding a
>data type code was the most important issue raised.  We have had
>a year and a half of experience with an ASCII encoded attribute representation
>that includes the data type.  In our implementation, the data type
>was a keyword too, but a data type code is sufficient for IPP, where
>each of the data type keywords is assigned a one-octet code.
>
>I don't think we need to assign new data type codes for binary integer
>though.  We need to keep the data types as simple and small in number
>as possible.
>
>The main reason that we included the data type is that it makes the
>implementation of the parser on the recipient much easier while allowing
>the parser to unmarshall the protocol into the internal representation
>of the machine and present an internal API that uses the binary 
>machine and programming-language dependent data types, rather than keeping
>everything in string form.  Such a parser does not need to keep a list of 
>attribute names and their corresponding data types in order to know how
>to parse the value.
>
>These ideas have been included in the OMG Print Facility submission
>by Xerox and others dated June 2, 1997.  We have included the IPP
>job attributes in the submission, generalized by changing "printer"
>to "device" in the attribute names, since the OMG Print Facility is
>intended to be extended to other types of services, such as scan, FAX,
>and document distribution.  The URL for the document:
>
>    ftp://www.omg.org/pub/docs/cf/97-06-03.doc
>
>
>As Steve and Brian point out, one of the side benefits to adding a
>data type code, is that one of the types can be "set-of-attributes"
>meaning that the value is a set of attributes.  The length of the
>top level attribute is the length of the entire attribute set, as usual.
>Inside that attribute value is the usual syntax for a set of attributes:
>attribute name data-type value, including multiple values for any
>individual attribute using the zero-length of the attribute-name field.
>
>Then the shipping-address attribute example can be handled using this
>new data type, and the value is a set of attributes for the fields in 
>the address.
>
>Such a data type greatly reduces the need to introduce other
>ad hoc explicit data types or structures that contain fields.
>
>IPP V2.0 will need the ability to define attributes whose values are
>sets of attributes to facilitate the administrative specification of the 
>Printer.  One use of the mechanism is expressing constraints between
>attributes, such as can't staple transparencies or can't staple more
>than 50 sheets.
>
>Implementations that don't support the attributes that have values
>that are sets of attributes ignore the attribute entirely as currently
>and are unaffected by the addition of this one new data type, since
>the length of the top level value is the entire length and the implementation
>just ignores the entire (top level) attribute.
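
The skipping behavior described above can be sketched in Python as follows.
This is an illustration under assumptions, not the encoding from any draft:
the two-octet big-endian lengths, the one-octet type code, the helper names,
and the "city" and "country" sub-attribute names are all hypothetical; the
codes 2 (text) and 17 (attributeSet) follow the registry suggested further
down in this message.

```python
import struct

TYPE_TEXT = 2            # assumed code for text
TYPE_ATTRIBUTE_SET = 17  # assumed code for the new attributeSet type


def encode_attribute(name, type_code, raw):
    """Assumed layout: name-length(2) name value-type(1) value-length(2) value."""
    name_bytes = name.encode("utf-8")
    return (struct.pack(">H", len(name_bytes)) + name_bytes
            + struct.pack(">BH", type_code, len(raw)) + raw)


def read_attribute(data, offset):
    """Return (name, type_code, raw_value, next_offset) for one attribute."""
    (name_len,) = struct.unpack_from(">H", data, offset)
    offset += 2
    name = data[offset:offset + name_len].decode("utf-8")
    offset += name_len
    type_code, value_len = struct.unpack_from(">BH", data, offset)
    offset += 3
    raw = data[offset:offset + value_len]
    return name, type_code, raw, offset + value_len


def parse(data, supports_sets=True):
    """Parse attributes; optionally behave like a recipient without set support."""
    attributes = []
    offset = 0
    while offset < len(data):
        name, type_code, raw, offset = read_attribute(data, offset)
        if type_code == TYPE_ATTRIBUTE_SET:
            if not supports_sets:
                # The top-level length already moved offset past the whole
                # nested value, so nothing inside it needs to be understood.
                continue
            # The value is itself a sequence of attributes in the same syntax.
            attributes.append((name, parse(raw)))
        else:
            attributes.append((name, raw.decode("utf-8")))
    return attributes
```

A recipient called with supports_sets=False sees only the flat attributes,
while one that supports sets gets the nested address fields as a list.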
>
>See explicit comments on Steve's proposal about the data type codes
>and registry at the end of this message.
>
>Thanks,
>Tom
>
>
>At 17:30 06/05/97 PDT, Brian Grimshaw wrote:
>>Stephen,
>>
>>To test my understanding of your concern, I summarize the two problems 
>>you address in the form of objectives.
>>
>>* The first is to distinguish two attributes that have the same name -- 
>>the example you give is "address".
>>
>>* The second is a representation of attributes that consist of multiple 
>>pieces of information (sub-attributes) that can be easily distinguished.
>>
>>I think the first goal is best achieved by having meaningful names for 
>>attributes.  The human readable form of attributes is already a reason 
>>for meaningful names and all attributes are specified as having a type.  
>>For example, there are attributes for "job-originating-host" and 
>>"job-originating-user" which are both 'names'.  It is unlikely that both 
>>of these (or either of these) would be called "name" and, similarly, it 
>>is unlikely that shipping address would be called "address".  In this 
>>example, "address" is best considered the attribute type and "ship-to" 
>>could be an attribute of type 'address'.
>>
>>The second goal seems to be (almost) handled by the sentence in the HTTP 
>>1.1 Transport Mapping document that says "An attribute whose value is a 
>>set of n values shall be represented as a sequence of n attributes, where 
>>all but the first attribute have a name of zero length."  This specific 
>>representation could be debated, but if simplified to not have the 
>>restriction of zero length attribute names for all but the first 
>>attribute, then this would permit an arbitrary hierarchy of attributes.  
>>This syntax would be represented as:
>>
>>attribute = name-length name value-length value
>>...
>>value = octet-string | attribute
>>
>>The implicit type of an attribute is sufficient to know if it has a value 
>>of multiple attributes.
>>
>>
>>There IS a goal I can think of (not to suggest this should be a goal) 
>>that would justify your proposed solution.  If you want to be able to 
>>programmatically process (perhaps in a UI application) a set of 
>>attributes, you would need the type information to be explicitly 
>>represented -- but you would not need to know the attribute names.  The 
>>meaning of an attribute value would still require a priori knowledge of 
>>the attribute name, but much can be done without that.
>>
>>If this is not intended, I see the value-type as redundant.  If this is 
>>intended, I suggest that all types (as specified in IPP) be represented.
>>
>>Brian Grimshaw
>>Apple Computer, Inc.
>>brian at apple.com
>>
>>
>>>The Problem Statement
>>>
>>>To avoid vague generalizations, let's consider an example that I believe
>>>is likely to arise in the near future. One attribute one might want to
>>>add is an "address". For example, one might have an address to which the
>>>final output is to be mailed or shipped. One might also have an address
>>>to which the bill for reproduction is to be sent. This brings our first
>>>problem because, quite often, these two addresses are not the same.
>>>That means that we need two different address attributes.
>>>
>>>The second problem with addresses is that an address is a structured
>>>entity. It has things like a mailstop, a street number, a street name, a city
>>>sub-region identifier, a city identifier, a state and/or country
>>>identifier and, typically, some kind of ZIP code. These are assembled
>>>into an address, but they are assembled in different ways in different
>>>cultures and countries. This means that treating the address as a single
>>>character string makes it very difficult to accurately recover the
>>>information in the address. It makes more sense to store the address as
>>>a structured object with attributes and values for each of the
>>>component parts. But, the address (as a whole) is the value of the
>>>shipping address or billing address attribute identified above. The
>>>current proposal for IPP does not allow a value to be structured.
>>
>>...
>>
>>>A Proposed Solution
>>>
>>>The proposed solution to the extensibility problem is to add a type byte
>>>(or half word) to the value portion of the
>>>attribute-value pair. This would change the syntax in Randy's recent
>>>draft as follows:
>>>
>>>attribute = name-length name value-type value-length value
>>>...
>>>value-type = one-byte integer ; a registered type value
>>>value-length = three-byte integer ; number of octets in value
>>>value = octet-string
>>>
>>>Note that the length was increased to three bytes to allow for larger
>>>structured values and was (arbitrarily) made three bytes so that the
>>>combination of value-type and value-length takes four bytes.
>
>I don't think that we need to increase the length beyond two octets.
>That is 64K of octets!
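
To compare the two length proposals, here is a small Python sketch of both
headers.  The function names are hypothetical and big-endian byte order is
an assumption; neither message above states it.

```python
import struct

MAX_TWO_OCTET_LENGTH = (1 << 16) - 1    # 65535 octets, the "64K" limit
MAX_THREE_OCTET_LENGTH = (1 << 24) - 1  # 16777215 octets


def pack_header_three_octet(value_type, value_length):
    """value-type(1) + value-length(3): four octets in total."""
    if not 0 <= value_length <= MAX_THREE_OCTET_LENGTH:
        raise ValueError("length does not fit in three octets")
    return bytes([value_type]) + value_length.to_bytes(3, "big")


def pack_header_two_octet(value_type, value_length):
    """value-type(1) + value-length(2): three octets in total."""
    if not 0 <= value_length <= MAX_TWO_OCTET_LENGTH:
        raise ValueError("length does not fit in two octets")
    return struct.pack(">BH", value_type, value_length)
```

The three-octet form costs one extra octet per value and raises the limit
from 65535 to 16777215 octets.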
>
>>>
>>>It is proposed that there be a registry of value types. The first two
>>>entries in that registry would be (zero is reserved)
>>>
>>>1: Unicode string in UTF8 encoding (as specified in the draft)
>>>2: list of values (here the length of the value field determines how
>>>   many values are present. The length, however, is not the number of
>>>   values, but the number of bytes consumed by the values).
>>
>
>I would suggest that instead the registry would be the list of current
>data type keywords (page 31-33 of the line numbered I-D) with one new type:
>attributeSet:
>
>1: other
>2: text
>3: name
>4: fileName
>5: keyword
>6: uri
>7: uriScheme
>8: locale
>9: octetString
>10: boolean
>11: integer
>12: dateTime
>13: seconds
>14: milliseconds
>15: integerUnits
>16: rangeOfInteger
>17: attributeSet
>
>The two constructed data types of:
>1setOf X
>rangeOf X
>need some discussion.
>
>I don't think that we need a type code for 1setOf X, since we use the
>zero-length attribute-name to represent multiple values in a set of
>values.
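
As a sketch of why no separate type code is needed for 1setOf X, the
following Python function regroups already-parsed (name, value) pairs,
treating a zero-length name as an additional value of the preceding
attribute.  The function name and the pair representation are assumptions
made for illustration.

```python
def group_values(pairs):
    """Collect values that follow an attribute under a zero-length name.

    pairs is an iterable of (name, value) in wire order; the result is a
    list of (name, [values]) with one entry per named attribute.
    """
    grouped = []
    for name, value in pairs:
        if name == "":
            if not grouped:
                raise ValueError("zero-length name with no preceding attribute")
            # Additional value for the most recent named attribute.
            grouped[-1][1].append(value)
        else:
            grouped.append((name, [value]))
    return grouped
```

A single-valued attribute simply comes back with a one-element list, so the
recipient handles both cases the same way.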
>
>Currently the number of different X that use rangeOf X from the table
>on page 36 is only rangeOf int, so I put that one in explicitly.
>
>If there are other ranges needed, we can add them explicitly or use the
>attribute set approach.
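
For reference, the registry suggested above could be written down in Python
as an enumeration.  The class and function names are hypothetical; the
numeric codes are the ones listed in this message, with zero left
unassigned as reserved.

```python
from enum import IntEnum


class ValueType(IntEnum):
    """One-octet data type codes as suggested in this message."""
    OTHER = 1
    TEXT = 2
    NAME = 3
    FILE_NAME = 4
    KEYWORD = 5
    URI = 6
    URI_SCHEME = 7
    LOCALE = 8
    OCTET_STRING = 9
    BOOLEAN = 10
    INTEGER = 11
    DATE_TIME = 12
    SECONDS = 13
    MILLISECONDS = 14
    INTEGER_UNITS = 15
    RANGE_OF_INTEGER = 16
    ATTRIBUTE_SET = 17


def type_name(code):
    """Return the registered name for a code, or None if it is unregistered."""
    try:
        return ValueType(code).name
    except ValueError:
        return None
```

A parser can use type_name to decide whether to decode a value or to skip
it by length.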
>
>Tom