PWG-ANNOUNCE> FW: file system conversion tool [8-bit to UTF-8]

McDonald, Ira imcdonald at sharplabs.com
Sat Dec 6 13:58:57 EST 2003


Hi folks,

[from the Open Linux UTF-8 (internationalization) list]

The tool described below would be very helpful to anyone
with a Linux, MacOS, or UNIX file system that they want
to convert fully to UTF-8 (filenames are converted, file
contents are NOT converted).

Cheers,
- Ira

Ira McDonald (Musician / Software Architect)
Blue Roof Music / High North Inc
PO Box 221  Grand Marais, MI  49839
phone: +1-906-494-2434
email: imcdonald at sharplabs.com

-----Original Message-----
From: Jungshik Shin [mailto:jshin at mailaps.org]
Sent: Friday, December 05, 2003 6:43 PM
To: Linux UTF8 list
Subject: file system conversion tool [8-bit to UTF-8]



Hi,

I thought some of you might be interested in 'convmv', a file system
encoding conversion utility I just came across. Most of you on this list
are likely to have switched over to UTF-8 and written a script or two for
the job.  Nonetheless, it may be handy to have tools like this nearby
so that you can help other 'skeptics' around you to 'convert' to UTF-8.

http://osx.freshmeat.net/releases/144059/

convmv converts filenames (not file content), directories, and even
whole filesystems to a different encoding. This comes in very handy if,
for example, one switches from an 8-bit locale to a UTF-8 locale. It
has some smart features: it automagically recognises if a file is
already UTF-8 encoded (thus partly converted filesystems can be fully
moved to UTF-8) and it also takes care of symlinks. Additionally, it is
able to convert from normalization form C (UTF-8 NFC) to NFD and
vice-versa. This is important for interoperability with Mac OS X, for
example, which uses NFD, while Linux and most other Unixes use NFC.
Though it is primarily written to convert from/to UTF-8, it can also be used
with almost any other charset encoding. Note that this is a command line
tool which requires at least Perl version 5.8.0.
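
To make the two ideas in that description concrete, here is a rough
Python sketch of (a) leaving a name alone when its bytes already decode
as UTF-8 and (b) switching a UTF-8 name between NFC and NFD. This is
not convmv's code (convmv is a Perl tool); the function names and the
ISO-8859-1 default are assumptions made only for illustration.

```python
import unicodedata


def convert_name(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode a filename to UTF-8 unless it already decodes as UTF-8.

    Hypothetical helper: the 'already UTF-8' check is a heuristic, since
    some 8-bit byte sequences also happen to be valid UTF-8.
    """
    try:
        raw.decode("utf-8")  # succeeds for pure ASCII and for valid UTF-8
        return raw           # treat the name as already converted
    except UnicodeDecodeError:
        # Not valid UTF-8: interpret in the assumed legacy encoding.
        return raw.decode(source_encoding).encode("utf-8")


def normalize_name(raw_utf8: bytes, form: str = "NFC") -> bytes:
    """Return a UTF-8 filename in the requested normalization form."""
    text = raw_utf8.decode("utf-8")
    return unicodedata.normalize(form, text).encode("utf-8")


latin1_name = b"caf\xe9"                      # "café" in ISO-8859-1
utf8_name = convert_name(latin1_name)         # re-encoded to UTF-8
unchanged = convert_name(utf8_name)           # second pass is a no-op
nfd_name = normalize_name(utf8_name, "NFD")   # 'e' + combining acute
nfc_name = normalize_name(nfd_name, "NFC")    # back to precomposed form
```

An actual rename pass would have to walk the tree with byte paths and
handle symlinks and name collisions, which this sketch leaves out.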


Jungshik
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/


