CR> W3C Character Model and Early Uniform Normalization

McDonald, Ira imcdonald at sharplabs.com
Wed Sep 24 10:21:49 EDT 2003


Hi Jim,

To reduce the implementation burden, I suggest that XHTML-Print
state that a conforming Printer SHOULD normalize the document
data to NFC (citing UAX-15 as the authoritative source).

Since W3C Charmod is still a working draft, XHTML-Print should
NOT have a Normative reference to W3C Charmod (which would
prevent publication of XHTML-Print as a PWG Candidate Standard).

Because normalization is a fairly costly activity on large
volumes of data (I wrote the normalization library for the
forthcoming CUPS 1.2 release), I suggest that the XHTML-Print
conformance be SHOULD rather than MUST.
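
As an illustration of the cost concern, here is a minimal Python sketch
that avoids full normalization when it is clearly unnecessary. It is not
taken from the CUPS library; the function name to_nfc is hypothetical,
and it relies only on Python's standard unicodedata module. Pure ASCII
text is already in NFC, so it can be passed through unchanged.

```python
import unicodedata


def to_nfc(text):
    """Return text in NFC, skipping the work for pure-ASCII input.

    Hypothetical helper for illustration; a real printer would apply
    the same idea to its own buffers.
    """
    # ASCII contains no combining or decomposable characters,
    # so ASCII-only text is already in NFC.
    if text.isascii():
        return text
    # Otherwise fall back to the full (more expensive) normalization.
    return unicodedata.normalize("NFC", text)
```

For example, to_nfc("cafe\u0301") composes the 'e' and the combining
acute accent into the single character U+00E9, while plain ASCII input
is returned untouched.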

Cheers,
- Ira McDonald
  High North Inc


-----Original Message-----
From: BIGELOW,JIM (HP-Boise,ex1) [mailto:jim.bigelow at hp.com]
Sent: Monday, September 22, 2003 6:52 PM
To: 'cr at pwg.org'
Subject: RE: CR> W3C Character Model and Early Uniform Normalization


Ira wrote:
> 
> (2) [answering Jim] 
>     No - a printer should _never_ throw away any document data 
>     that happens not to be normalized ...

I agree.  However, the XHTML-Print specs [1, 2, 3] state in their Printer
Conformance sections that a printer may "flush or otherwise reject a
non-conforming XHTML-Print document."  This is the source of my worry that a
printer could reject a document that is not normalized.
> 
> (3) [answering Jim]
>     No - a printer should _never_ trust the sender/generator 
>     to have properly normalized Unicode data.

If a very low cost printer assumed that an XHTML-Print document's content is
normalized and it is not, the very worst that could happen is that word
breaks occur in the wrong place, e.g., between a letter and its non-spacing
mark, or class/id selectors don't match the value of the class/id attribute,
causing the misapplication of style sheet rules.
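
The selector mismatch can be sketched in a few lines of Python. This is
only an illustration under assumed values: the class name "café" and the
helper names matches_raw and matches_nfc are hypothetical, and the
comparison stands in for whatever matching a real style sheet engine
performs.

```python
import unicodedata

# Hypothetical example: the same class name "café" encoded two ways.
selector_class = "caf\u00e9"      # precomposed U+00E9 (already NFC)
attribute_class = "cafe\u0301"    # 'e' + U+0301 COMBINING ACUTE ACCENT


def matches_raw(a, b):
    """Compare code points directly, as a printer that trusts its input would."""
    return a == b


def matches_nfc(a, b):
    """Compare after normalizing both sides to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


raw_result = matches_raw(selector_class, attribute_class)
nfc_result = matches_nfc(selector_class, attribute_class)
```

Here raw_result is False, so the style rule would not be applied, while
nfc_result is True once both strings have been normalized.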

I think that a printer should normalize and therefore correctly handle
combining characters. I am just wondering if other printer people think such
normalization should be mandated for all printers.

Jim

[1] ftp://ftp.pwg.org/pub/pwg/xhtml-print/drafts/xhtml-print-draft-095.pdf
[2] http://www.pwg.org/xhtml-print/HTML-Version/XHTML-Print.html
[3]


