IPP>MOD add another issue [encoding of CompoundValue]

Tue Nov 25 21:54:45 EST 1997

I will try to explain the issue by giving more detail.

The compoundValue has an integer value which specifies the number of
following values that compose the compound value.  There are two
obvious ways to implement compoundValue in a general way:

   1) recurse looking for additional values until the correct number
      is found or until a non-null attribute name is found or a delimiter
      tag is found. The latter two conditions are errors. This method
      works, but is tricky because the "nested" values are really at
      the same level as other values in the protocol.

   2) continue picking up values, but make a note that a compoundValue
      is being built.  In this case, there must be a check, whenever a
      non-null name or a delimiter tag is encountered, that a
      compoundValue is not still being built; if one is, that is an
      error.  At first glance, this seems simpler, but it is easy to
      forget the checks mentioned above.
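
A minimal sketch of method 2, in Python, might look like the following.
It assumes the wire bytes have already been split into (kind, name,
value) records; that record shape, the names group_values and
CompoundError, and the refusal to handle nested compounds are all
illustrative assumptions, not part of the protocol.

```python
class CompoundError(ValueError):
    """Raised when a compoundValue is cut short or nested."""


def group_values(records):
    """Group a flat record stream into attributes (method 2).

    Each record is a hypothetical (kind, name, value) tuple:
      kind  -- "delimiter", "compound", or "value"
      name  -- attribute name; "" marks an additional value
      value -- for "compound", the number of member values that follow
    """
    attributes = []   # list of (name, [values])
    pending = None    # members of the compoundValue being built
    remaining = 0     # members still expected

    for kind, name, value in records:
        # The two checks that are easy to forget: neither a delimiter
        # nor a non-null name may arrive while a compound is open.
        if remaining and (kind == "delimiter" or name):
            raise CompoundError(
                f"compoundValue is missing {remaining} value(s)")
        if kind == "delimiter":
            continue
        if name:
            attributes.append((name, []))
        elif not attributes:
            raise ValueError("additional value without an attribute name")

        if remaining:
            if kind == "compound":
                raise CompoundError("nested compoundValue not handled here")
            pending.append(value)
            remaining -= 1
        elif kind == "compound":
            pending = []
            remaining = value
            attributes[-1][1].append(pending)
        else:
            attributes[-1][1].append(value)

    # A third check: the stream must not end inside a compound.
    if remaining:
        raise CompoundError(f"compoundValue is missing {remaining} value(s)")
    return attributes
```

Even in this small form the "still being built" state has to be
tested in three separate places, which is the source of the concern.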

Although compoundValue can be made to work, its complexity will lead to
bugs.  Also its type is determined by looking at all of the tags of
values that it contains.  This suggests that we should look for a
simpler-to-implement option.

The most obvious solution is to add two new types, text-language and
name-language, which are the language-constrained versions of text and
name. Attributes with text and name values could also have a value of
type text-language or name-language.  Tom and others have suggested
that language and text/name be separated by a single-quote character.
It would work, but is not in the spirit of the current protocol, which
uses lengths instead of delimiting characters. So I suggest the value
be <language length> <language string> <text/name string>.  The length
of the text/name string is the length of the value minus
(language-length + 2), where the 2 is the size in bytes of the
language length field itself.
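
To make the length arithmetic concrete, here is a rough Python sketch
of packing and unpacking such a value.  It assumes a 2-byte big-endian
language length (implied by the "+ 2" above), an ASCII language tag,
and UTF-8 text; the function names are hypothetical.

```python
import struct


def encode_with_language(language, text):
    """Build <language length> <language string> <text/name string>."""
    lang = language.encode("ascii")
    body = text.encode("utf-8")   # charset is an assumption of this sketch
    return struct.pack(">H", len(lang)) + lang + body


def decode_with_language(value):
    """Split one value back into (language, text)."""
    if len(value) < 2:
        raise ValueError("value too short to hold a language length")
    (lang_len,) = struct.unpack_from(">H", value, 0)
    if lang_len + 2 > len(value):
        raise ValueError("language length exceeds the value length")
    language = value[2:2 + lang_len].decode("ascii")
    # The text/name length is implicit: len(value) - (lang_len + 2).
    text = value[2 + lang_len:].decode("utf-8")
    return language, text
```

For example, encode_with_language("en-us", "Hello") yields a 12-byte
value, and the text occupies 12 - (5 + 2) = 5 of those bytes.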

This solution is easier to parse because the components are contained
within a single value.

If we believe that tags are in short supply and that we don't want to
allocate two tag values for such obscure types, we could create a tag
type of "typed-octets" where the first byte of the value contains the
sub-tag value, which in our case would be either text-language or
name-language. We could also have 2 bytes for the sub-tag rather than
1.
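
As a sketch only, the 1-byte sub-tag variant could be handled as
follows in Python.  The sub-tag numbers and function names are made up
for illustration; no such values have been assigned.

```python
# Hypothetical sub-tag numbers, chosen only for this example.
SUBTAG_TEXT_LANGUAGE = 0x01
SUBTAG_NAME_LANGUAGE = 0x02


def wrap_typed_octets(sub_tag, payload):
    """Prefix the payload with a 1-byte sub-tag."""
    if not 0 <= sub_tag <= 0xFF:
        raise ValueError("sub-tag must fit in one byte")
    return bytes([sub_tag]) + payload


def unwrap_typed_octets(value):
    """Return (sub_tag, payload) from a typed-octets value."""
    if not value:
        raise ValueError("typed-octets value must hold a sub-tag")
    return value[0], value[1:]
```

A receiver that does not recognize the sub-tag can still skip the
whole value, because the outer value length covers it.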

> From hastings at cp10.es.xerox.com Mon Nov 24 10:46:48 1997
> 
> As long as you've re-opened this issue, I'd like to add several
> other alternatives into the mix.  (A committee is better able to
> pick between alternatives, than to design one on the fly).
> 
> On the other hand, it may be better to live with the current scheme
> than to try to pick a new one.
> 
> At 19:48 11/21/1997 PST, Robert Herriot wrote:
> >
> >As I am implementing the CompoundValue, I am finding problems that make
> >me think it should be changed. It requires too much special-casing and
> >in its current form it will introduce bugs where the value of the
> >CompoundValue exceeds the number of remaining attributes for the
> >attribute name or attribute group.  To avoid those bugs, checks have to
> >be made in several places.
> 
> Please explain this problem more.
> 
> >
> >I suggest we reexamine the other possible solutions, one simple with 
> >no room for extension, the other with room for extension.
> >
> >  a) add two new value types: text-language and name-language each of which
> >     is a single value in the protocol but which consists of 4 subfields:
> >     a text/name length field, a text/name field, a language length field, 
> >     and a language field.
> >
> >  b) add a single new type: compound-value which consists of a single value
> >     in the protocol but which consists of a value-tag field followed by 
> >     any number of groups-of-three subfields. Each group-of-three 
> >     consists of a value tag, a value length and a value. The text-language 
> >     solution of a) is represented by a text-language tag, a text tag, a 
> >     text length, a text value, a natural-language-tag, a natural-language
> >     length and a natural-language value.
> >
> >I prefer b) because it offers room for extension and an implementation can
> >determine if it supports the compound value by examining the initial
> >tag in the compoundValue.
> 
> Here are my additional alternatives:
> 
>  
> c) Amplify the 'text' and 'name' attribute syntaxes to allow a second
> natural language override value to precede the actual value which indicates 
> the language of the immediately following value.  The attribute syntax of 
> the first value, when present, is: 'naturalLanguage' as defined in the
> current spec.
> 
> Advantages:  simple
> 
> Disadvantages:  a single-valued attribute sometimes has two values, making
> the validation of single-valued attributes more ad hoc.  Also counting
> attribute values is more ad hoc.
> 
> 
> d) have two data types for each of 'text' and 'name': 
>    'text' (same as current) and 'taggedText'
>    'name' (same as current) and 'taggedName'
> 
> The 'taggedText' and 'taggedName' data types use the RFC 2184 tagging
> in the beginning of the data (but for language only, not charset)
> to indicate natural language override:
> 
>    en'...
>    en-us'...
> 
> to indicate English and U.S. English
> 
> Those attributes which currently have 'text' and 'name' would
> be changed to require support of both 'text' and 'taggedText'
> and 'name' and 'taggedName'
> 
> For example:
> 
>   job-name (name | taggedName)
> 
> Advantages:  most request/response instances would not need to use the 
> taggedText and taggedName in most interchanges.
> 
> Disadvantages:  clients and IPP objects would still have to support both
> forms.
> 
> 
> e) Change the spec for 'text' and 'name' to always require the RFC 2184
> natural language prefix (but not charset).
> 
> Advantages:  simple, natural language tag is always stored with the data.
> Only one protocol value for each attribute value.
> 
> Disadvantages:  tag has to be skipped over when processing or displaying
> the data.
> 
> 
> f) Same as e) except include the charset tag as well, so in full compliance
> with RFC 2184 (same as we had in the Model document after the Atlanta 
> meeting).  Example:
> 
>   us-ascii'en'...
>   utf-8'en-us'...
> 
> Advantages:  simple, charset and natural language tag is always stored 
> with the data.  Only one protocol value for each attribute value.
> IPP object doesn't need to charset convert data to a single charset.
> 
> Disadvantage:  tags have to be skipped over when processing or displaying
> the data.
> 
> 
> g) Add the dictionary attribute syntax that we postponed.
> 
> Advantages:  It is even more general than your alternative b) and is
> something we have agreed is something we want.  I'd hate to see us
> put in something that is half a dictionary.  I think that the dictionary
> also fixes the length checking problem that the current CompoundValue
> has, correct?
> 
> Disadvantages:  None.
> 
> Tom
> 
> 
> 
>