Re: [greenstone-devel] bug in import of ascended filenames?

From John R. McPherson
Date Fri, 27 Jun 2003 17:44:25 +1200
Subject Re: [greenstone-devel] bug in import of ascended filenames?
In-Reply-To (17553-47962-21758-1293867209-1056553422-seznam-cz)
roman chyla wrote:
> Hi,
>
> yesterday I made some interesting test. I saved 4 word files; the
> first four had ascended characters in filename ; the second group had
> only ascii7 chars in filename. The content of both groups were in
> non-ascii encoding.
>
> all 8 files were imported
>
> the first 4 files (with ascended chars) were rejected during building

Hi,
I don't think we've ever checked the filenames, so the code is assuming
that it gets UTF-8.

I'm not sure if we have a way to reliably check the encoding of the
filename. Which operating system and filesystem type are you using?

I tried this with gsdl-2.39. Filenames whose characters all have a code
below (decimal) 128 are OK, but any character with a code above 127
causes the resulting file to be badly-formed XML.
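
To illustrate why bytes above 127 cause trouble, here is a rough sketch
in Python (not Greenstone code). The function name filename_to_utf8 and
the Latin-1 fallback are assumptions made up for this example. It only
tests whether the raw filename bytes happen to be valid UTF-8; it cannot
tell which encoding the filesystem really used, so the fallback is a
guess.

```python
def filename_to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return the filename as UTF-8 bytes.

    If `raw` already decodes as UTF-8 it is returned unchanged.
    Otherwise it is assumed (hypothetically) to be in `fallback`
    and is re-encoded as UTF-8.
    """
    try:
        raw.decode("utf-8")  # pure 7-bit ASCII always passes this check
        return raw
    except UnicodeDecodeError:
        # A lone byte above 127 (e.g. 0xE9 for "e acute" in Latin-1)
        # is not a valid UTF-8 sequence, so we land here.
        return raw.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    for name in (b"plain.doc", b"caf\xe9.doc"):
        print(name, "->", filename_to_utf8(name))
```

A 7-bit ASCII name passes through untouched, which matches what I saw
for character codes below 128.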

Thanks for pointing this out. Sorry I can't suggest a fix or a
work-around...


John