IPP Mail Archive: Re: IPP> PRO - A value-type field for the Protocol Spec proposa

Re: IPP> PRO - A value-type field for the Protocol Spec proposa

Tom Hastings (hastings@cp10.es.xerox.com)
Fri, 6 Jun 1997 10:54:07 PDT

I agree with Steve's suggestion to add a data type code to the encoding
of attributes in IPP, and with Brian's feedback.

We have concluded an internal review of IPP at Xerox, and adding a
data type code was the most important issue raised. We have had
a year and a half of experience with an ASCII-encoded attribute
representation that includes the data type. In our implementation, the
data type
was a keyword too, but a data type code is sufficient for IPP, where
each of the data type keywords is assigned a one-octet code.

I don't think we need to assign new data type codes for binary integer
though. We need to keep the data types as simple and small in number
as possible.

The main reason that we included the data type is that it makes the
implementation of the parser on the recipient much easier. The parser
can unmarshal the protocol into the internal representation of the
machine and present an internal API that uses the binary, machine- and
programming-language-dependent data types, rather than keeping
everything in string form. Such a parser does not need to keep a list of
attribute names and their corresponding data types in order to know how
to parse the value.
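
A minimal sketch of such a parser, in Python. It is not taken from any
draft: the two-octet big-endian lengths, the three type codes, and the
ASCII form of integer and boolean values are all assumptions made for
illustration, as are the function names. The point is only that the
type code alone selects the native representation, with no table that
maps attribute names to types.

```python
import struct

# Assumed type codes, for illustration only.
TYPE_TEXT, TYPE_BOOLEAN, TYPE_INTEGER = 2, 10, 11


def decode_value(type_code, raw):
    """Convert the octets of one value into a native Python object."""
    if type_code == TYPE_INTEGER:
        return int(raw.decode("ascii"))   # assumed: ASCII decimal digits
    if type_code == TYPE_BOOLEAN:
        return raw == b"true"             # assumed: "true" / "false"
    if type_code == TYPE_TEXT:
        return raw.decode("utf-8")
    return raw                            # unknown type: keep the octets


def encode_attribute(name, type_code, raw):
    """name-length name value-type value-length value (assumed layout)."""
    n = name.encode("ascii")
    return (struct.pack(">H", len(n)) + n
            + struct.pack(">BH", type_code, len(raw)) + raw)


def parse_attribute(buf, offset=0):
    """Return (name, native value, offset of the next attribute)."""
    (name_len,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    name = buf[offset:offset + name_len].decode("ascii")
    offset += name_len
    type_code, value_len = struct.unpack_from(">BH", buf, offset)
    offset += 3
    raw = buf[offset:offset + value_len]
    return name, decode_value(type_code, raw), offset + value_len
```

With this layout, parsing the encoding of ("copies", integer, "3")
yields the Python integer 3 without the parser knowing anything about
the attribute named "copies".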

These ideas have been included in the OMG Print Facility submission
by Xerox and others dated June 2, 1997. We have included the IPP
job attributes in the submission, generalized by changing "printer"
to "device" in the attribute names, since the OMG Print Facility is
intended to be extended to other types of services, such as scan, FAX,
and document distribution. The URL for the document:

ftp://www.omg.org/pub/docs/cf/97-06-03.doc

As Steve and Brian point out, one of the side benefits of adding a
data type code is that one of the types can be "set-of-attributes",
meaning that the value is a set of attributes. The length of the
top level attribute is the length of the entire attribute set, as usual.
Inside that attribute value is the usual syntax for a set of attributes:
attribute-name, data-type, value, including multiple values for any
individual attribute using the zero-length attribute-name field.

Then the shipping-address attribute example can be handled using this
new data type, and the value is a set of attributes for the fields in
the address.
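
A sketch of how the shipping-address case might look on the wire, in
Python. The layout (two-octet big-endian lengths, one-octet type code),
the codes 2 and 17, and the field names "street" and "city" are
assumptions for illustration only. Multiple values with zero-length
names are left out to keep the example short.

```python
import struct

# Assumed type codes, for illustration only.
TYPE_TEXT = 2
TYPE_ATTRIBUTE_SET = 17


def encode_attribute(name, type_code, raw):
    """name-length name value-type value-length value (assumed layout)."""
    n = name.encode("ascii")
    return (struct.pack(">H", len(n)) + n
            + struct.pack(">BH", type_code, len(raw)) + raw)


def parse_attributes(buf):
    """Parse a run of attributes, descending into attribute-set values."""
    result = {}
    offset = 0
    while offset < len(buf):
        (name_len,) = struct.unpack_from(">H", buf, offset)
        offset += 2
        name = buf[offset:offset + name_len].decode("ascii")
        offset += name_len
        type_code, value_len = struct.unpack_from(">BH", buf, offset)
        offset += 3
        raw = buf[offset:offset + value_len]
        offset += value_len
        if type_code == TYPE_ATTRIBUTE_SET:
            # The value is itself a sequence of attributes.
            result[name] = parse_attributes(raw)
        else:
            result[name] = raw.decode("utf-8")
    return result


# The inner attributes are encoded first; their total length becomes the
# value-length of the top level attribute.
address = (encode_attribute("street", TYPE_TEXT, b"1 Main St")
           + encode_attribute("city", TYPE_TEXT, b"Springfield"))
wire = encode_attribute("shipping-address", TYPE_ATTRIBUTE_SET, address)
parsed = parse_attributes(wire)
```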

Such a data type greatly reduces the need to introduce other
ad hoc explicit data types or structures that contain fields.

IPP V2.0 will need the ability to define attributes whose values are
sets of attributes to facilitate the administrative specification of the
Printer. One use of the mechanism is expressing constraints between
attributes, such as "can't staple transparencies" or "can't staple more
than 50 sheets".

Implementations that don't support attributes whose values are sets of
attributes ignore such an attribute entirely, as they do today with any
attribute they don't support, and are unaffected by the addition of this
one new data type. The length of the top level value covers the entire
attribute set, so the implementation just skips the entire (top level)
attribute.
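
A sketch of that skipping behaviour, in Python. The layout (two-octet
big-endian lengths, one-octet type code), the type codes used in the
comments, and the function names are assumptions for illustration. The
reader never looks inside a value it does not support; it only advances
by the top level value-length.

```python
import struct


def pack_attribute(name, type_code, raw):
    """name-length name value-type value-length value (assumed layout)."""
    n = name.encode("ascii")
    return (struct.pack(">H", len(n)) + n
            + struct.pack(">BH", type_code, len(raw)) + raw)


def iter_supported(buf, supported_names):
    """Yield (name, type_code, raw) only for attributes in supported_names."""
    offset = 0
    while offset < len(buf):
        (name_len,) = struct.unpack_from(">H", buf, offset)
        offset += 2
        name = buf[offset:offset + name_len].decode("ascii")
        offset += name_len
        type_code, value_len = struct.unpack_from(">BH", buf, offset)
        offset += 3
        raw = buf[offset:offset + value_len]
        # Always advance by the top level length, whatever the value holds.
        offset += value_len
        if name in supported_names:
            yield name, type_code, raw
```

An implementation that only knows "copies" would step over a
"shipping-address" attribute set in one move, nested attributes and all.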

See explicit comments on Steve's proposal about the data type codes
and registry at the end of this message.

Thanks,
Tom

At 17:30 06/05/97 PDT, Brian Grimshaw wrote:
>Stephen,
>
>To test my understanding of your concern, I summarize the two problems
>you address in the form of objectives.
>
>* The first is to distinguish two attributes that have the same name --
>the example you give is "address".
>
>* The second is a representation of attributes that consist of multiple
>pieces of information (sub-attributes) that can be easily distinguished.
>
>I think the first goal is best achieved by having meaningful names for
>attributes. The human readable form of attributes is already a reason
>for meaningful names and all attributes are specified as having a type.
>For example, there are attributes for "job-originating-host" and
>"job-originating-user" which are both 'names'. It is unlikely that both
>of these (or either of these) would be called "name" and, similarly, it
>is unlikely that shipping address would be called "address". In this
>example, "address" is best considered the attribute type and "ship-to"
>could be an attribute of type 'address'.
>
>The second goal seems to be (almost) handled by the sentence in the HTTP
>1.1 Transport Mapping document that says "An attribute whose value is a
>set of n values shall be represented as a sequence of n attributes, where
>all but the first attribute have a name of zero length." This specific
>representation could be debated, but if simplified to not have the
>restriction of zero length attribute names for all but the first
>attribute, then this would permit an arbitrary hierarchy of attributes.
>This syntax would be represented as:
>
>attribute = name-length name value-length value
>...
>value = octet-string | attribute
>
>The implicit type of an attribute is sufficient to know if it has a value
>of multiple attributes.
>
>
>There IS a goal I can think of (not to suggest this should be a goal)
>that would justify your proposed solution. If you want to be able to
>programmatically process (perhaps in a UI application) a set of
>attributes, you would need the type information to be explicitly
>represented -- but you would not need to know the attribute names. The
>meaning of an attribute value would still require a priori knowledge of
>the attribute name, but much can be done without that.
>
>If this is not intended, I see the value-type as redundant. If this is
>intended, I suggest that all types (as specified in IPP) be represented.
>
>Brian Grimshaw
>Apple Computer, Inc.
>brian@apple.com
>
>
>>The Problem Statement
>>
>>To avoid vague generalizations, let's consider an example that I believe
>>is likely to arise in the near future. One attribute one might want to
>>add is an "address". For example, one might have an address to which the
>>final output is to be mailed or shipped. One might also have an address
>>to which the bill for reproduction is to be sent. This brings our first
>>problem because, quite often, these two addresses are not the same.
>>That means that we need two different address attributes.
>>
>>The second problem with addresses is that an address is a structured
>>entity. It has things like a mailstop, a street number, a street name, a city
>>sub-region identifier, a city identifier, a state and/or country
>>identifier and, typically, some kind of ZIP code. These are assembled
>>into an address, but they are assembled in different ways in different
>>cultures and countries. This means that treating the address as a single
>>character string makes it very difficult to accurately recover the
>>information in the address. It makes more sense to store the address as
>>a structured object with attributes and values for each of the
>>component parts. But, the address (as a whole) is the value of the
>>shipping address or billing address attribute identified above. The
>>current proposal for IPP does not allow a value to be structured.
>
>...
>
>>A Proposed Solution
>>
>>The proposed solution to the extensibility problem is to add a type byte
>>(or half word) to the value portion of the attribute-value pair. This
>>would change the syntax in Randy's recent draft as follows:
>>
>>attribute = name-length name value-type value-length value
>>...
>>value-type = one-byte integer ; a registered type value
>>value-length = three-byte integer ; number of octets in value
>>value = octet-string
>>
>>Note that the length was increased to three bytes to allow for larger
>>structured values and was (arbitrarily) made three bytes so that the
>>combination of value-type and value-length takes four bytes.

I don't think that we need to increase the length beyond two octets.
That is 64K of octets!

>>
>>It is proposed that there be a registry of value types. The first two
>>entries in that registry would be (zero is reserved)
>>
>>1: Unicode string in UTF8 encoding (as specified in the draft)
>>2: list of values (here the length of the value field determines how
>> many values are present. The length, however, is not the number of
>> values, but the number of bytes consumed by the values.)
>

I would suggest instead that the registry be the list of current
data type keywords (pages 31-33 of the line-numbered I-D) with one new
type, attributeSet:

1: other
2: text
3: name
4: fileName
5: keyword
6: uri
7: uriScheme
8: locale
9: octetString
10: boolean
11: integer
12: dateTime
13: seconds
14: milliseconds
15: integerUnits
16: rangeOfInteger
17: attributeSet
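
For illustration, the suggested registry could be held as a simple
lookup table. This Python sketch only restates the list above; the
variable names are hypothetical, and the code assignments are the
suggestion in this message, not an agreed registry.

```python
# Suggested one-octet type codes, keyed by code.
KEYWORD_BY_CODE = {
    1: "other",
    2: "text",
    3: "name",
    4: "fileName",
    5: "keyword",
    6: "uri",
    7: "uriScheme",
    8: "locale",
    9: "octetString",
    10: "boolean",
    11: "integer",
    12: "dateTime",
    13: "seconds",
    14: "milliseconds",
    15: "integerUnits",
    16: "rangeOfInteger",
    17: "attributeSet",
}

# Reverse mapping, for a sender that starts from the keyword.
CODE_BY_KEYWORD = {keyword: code for code, keyword in KEYWORD_BY_CODE.items()}
```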

The two constructed data types of:
1setOf X
rangeOf X
need some discussion.

I don't think that we need a type code for 1setOf X, since we use the
zero-length attribute-name to represent multiple values in a set of
values.
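
A sketch of how a recipient might fold those zero-length names back
into a set of values, in Python. The function name and the example
attribute names are hypothetical; the input is assumed to be the
(name, value) pairs already produced by a parser, in wire order.

```python
def group_values(pairs):
    """Collect values under the most recent non-empty attribute name."""
    grouped = {}
    current = None
    for name, value in pairs:
        if name:
            # A named attribute starts (or continues) a value list.
            current = name
            grouped.setdefault(name, []).append(value)
        elif current is None:
            raise ValueError("zero-length name with no preceding attribute")
        else:
            # A zero-length name means one more value for the previous
            # attribute.
            grouped[current].append(value)
    return grouped
```

Every attribute comes back as a list, so a single-valued attribute and
a 1setOf X with one member look the same to the caller, which is why no
separate type code is needed.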

Currently the only X that uses rangeOf X in the table on page 36 is
integer (rangeOf int), so I put that one in explicitly.

If there are other ranges needed, we can add them explicitly or use the
attribute set approach.

Tom