[from the Open Linux UTF-8 (internationalization) list]
The tool described below would be very helpful to anyone
with a Linux, MacOS, or UNIX file system that they want
to convert fully to UTF-8 (filenames are converted, file
contents are NOT converted).
Ira McDonald (Musician / Software Architect)
Blue Roof Music / High North Inc
PO Box 221 Grand Marais, MI 49839
email: imcdonald at sharplabs.com
From: Jungshik Shin [mailto:jshin at mailaps.org]
Sent: Friday, December 05, 2003 6:43 PM
To: Linux UTF8 list
Subject: file system conversion tool [8-bit to UTF-8]
I thought some of you might be interested in 'convmv', a file system
encoding conversion utility I just came across. Most of you on this list
are likely to have switched over to UTF-8 and written a script or two for
the job. Nonetheless, it may be handy to have tools like this nearby
so that you can help other 'skeptics' around you to 'convert' to UTF-8.
convmv converts filenames (not file content), directories, and even
whole filesystems to a different encoding. This comes in very handy if,
for example, one switches from an 8-bit locale to a UTF-8 locale. It
has some smart features: it automagically recognises if a file is
already UTF-8 encoded (thus partly converted filesystems can be fully
moved to UTF-8) and it also takes care of symlinks. Additionally, it is
able to convert from normalization form C (UTF-8 NFC) to NFD and
vice-versa. This is important for interoperability with Mac OS X, for
example, which uses NFD, while Linux and most other Unixes use NFC.
Though it is primarily written to convert from/to UTF-8, it can also be used
with almost any other charset encoding. Note that this is a command line
tool which requires at least Perl version 5.8.0.
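To make the two ideas above concrete (leaving names alone when they are
already UTF-8, and switching between NFC and NFD), here is a minimal Python
sketch. It is not convmv's code (convmv is a Perl script) and it does not
touch symlinks. The function names, the Latin-1 default source encoding and
the dry-run walker are assumptions made for illustration only.

```python
import os
import unicodedata


def to_utf8_name(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode a filename as UTF-8 unless it already decodes as UTF-8.

    Assumption: any name that is valid UTF-8 is treated as already
    converted. A legacy 8-bit name can, rarely, also be valid UTF-8 by
    coincidence, so this check is a heuristic.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8 (pure ASCII included): leave as is
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


def normalize_name(raw_utf8: bytes, form: str = "NFC") -> bytes:
    """Return a UTF-8 filename in the requested normalization form."""
    text = raw_utf8.decode("utf-8")
    return unicodedata.normalize(form, text).encode("utf-8")


def plan_renames(root: bytes, source_encoding: str = "latin-1"):
    """Yield (old_path, new_path) pairs without renaming anything.

    Walking bottom-up lists the entries inside a directory before the
    directory itself, so the old paths stay valid if the pairs are
    renamed in the order given.
    """
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = to_utf8_name(name, source_encoding)
            if new_name != name:
                yield os.path.join(dirpath, name), os.path.join(dirpath, new_name)
```

Passing a bytes path such as plan_renames(b".") makes os.walk return raw
byte names, which matters here because the names being examined may not be
decodable in the current locale. Reviewing the planned pairs before calling
os.rename on them keeps the operation a dry run until you are satisfied.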
Linux-UTF8: i18n of Linux on all levels