Hi,
My two cents:
(1) [answering Elliott]
Unicode normalization has no impact at all on the CR specs --
they merely refer to character repertoires (often including
both composed and uncomposed characters) which are defined
(in _all_ cases) by some other standards body (Unicode, ISO,
IANA, etc.).
(2) [answering Jim]
No - a printer should _never_ throw away any document data
that happens not to be normalized (it is actually very
difficult to determine if that data is already in Unicode
NFC or NFKC, except by doing the whole normalization and
then doing a binary compare of the result with the original).
(3) [answering Jim]
No - a printer should _never_ trust the sender/generator
to have properly normalized Unicode data.
(4) [my own comment]
Early Uniform Normalization is important and useful for
_very_ small pieces of data and _narrow_ fields of
application (such as IETF's I18N Domain Names standards).
The day will never come that receivers need not check
for (or simply perform) normalization, if needed. Some
rendering algorithms happen to require that Unicode data
be pre-normalized, but that's an implementation nit.
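To make points (2) and (4) concrete, here is a minimal sketch in
Python of the "normalize, then compare" check and of a receiver that
simply normalizes whatever it gets. The function names is_nfc and
receive are my own, purely for illustration; unicodedata.normalize
is the standard library call. This is not meant as what any
particular printer does.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Check for NFC the way point (2) describes: run the whole
    normalization, then compare the result with the original."""
    return unicodedata.normalize("NFC", text) == text


def receive(text: str) -> str:
    """Hypothetical receiver-side step: never discard the data and
    never trust the sender; just normalize on arrival."""
    return unicodedata.normalize("NFC", text)


composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

print(is_nfc(composed))                  # True
print(is_nfc(decomposed))                # False
print(receive(decomposed) == composed)   # True
```

Note that the check costs as much as the normalization itself, so a
receiver that needs normalized text may as well just call the
normalizer unconditionally.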
Cheers,
- Ira McDonald
High North Inc
-----Original Message-----
From: elliott.bradshaw at zoran.com [mailto:elliott.bradshaw at zoran.com]
Sent: Friday, September 19, 2003 10:23 AM
To: BIGELOW,JIM (HP-Boise,ex1)
Cc: 'cr at pwg.org'; owner-cr at pwg.org
Subject: Re: CR> W3C Character Model and Early Uniform Normalization
What are the XHTML-Print operations that are affected by normalization?
This discussion is useful for string processing (match, substring, sort)
but I don't see how that affects printing. One possible area is CSS class
names; are they restricted to ASCII?
Also, I don't see how a new report can change the definition of an existing
spec (XHTML). Isn't this a separate set of rules that might be folded into
future revisions?
I would rather see a use-case that makes sense for XHTML-Print before
adding this in.
E.
P.S. Does it have any effect on current CR documents? I don't think so.
There is no discussion of combining in there at all.
----------------------------------------------------------------------------
Elliott Bradshaw
Director, Software Engineering
Zoran Imaging Group (formerly Oak Technology Imaging Group)
781 638-7534
From: "BIGELOW,JIM (HP-Boise,ex1)" <jim.bigelow at hp.com>
Sent by: owner-cr at pwg.org
Date: 09/18/2003 08:01 PM
To: "'cr at pwg.org'" <cr at pwg.org>
cc:
Subject: CR> W3C Character Model and Early Uniform Normalization
Hello,
I've been reading the W3C Working Draft, Character Model for the World Wide
Web [1], which deals with requirements on internet applications such as
producers and consumers of XHTML-Print.
This report [1] indicates that XHTML-Print, as a derivative of XHTML, is
bound by it. Therefore, by extension, all XHTML-Print producing and consuming
applications are bound by this report, although this is never explicitly
stated in any version of the XHTML-Print specification [2,3].
One of the interesting parts of [1] is the requirement that applications
that produce XHTML-Print should produce fully-normalized text [4] meaning,
among other things, that it is in Unicode Normalized Form C [5], which
favors the canonical composite forms of Unicode characters.
From the printer's perspective, as a receiver of XHTML-Print documents,
this makes its job easier since it can always assume that text is
fully-normalized and it doesn't have to do so itself.
My question to you is, do you think that the XHTML-Print specification
should be amended to cite the requirement that a conforming XHTML-Print
document be fully-normalized? Furthermore, should a printer be required to
check an XHTML-Print document to see that it is fully-normalized or should
it assume so? Lastly, should a printer normalize text that is not
fully-normalized or discard it?
Jim
--
Jim Bigelow,
Editor: XHTML-Print & CSS Print Profile
Member: W3C HTML and CSS Working Groups
Hewlett-Packard
208-396-2068
jim.bigelow at hp.com
[1] http://www.w3.org/TR/charmod/
[2] http://www.pwg.org/xhtml-print/HTML-Version/XHTML-Print.html
[3] http://www.w3.org/TR/xhtml-print/
[4] http://www.w3.org/TR/2003/WD-charmod-20030822/#sec-FullyNormalized
[5] http://www.unicode.org/unicode/reports/tr15/#Specification