Re: [greenstone-users] encoding problem under linux

From Rene Schrama
Date Mon, 03 May 2004 09:18:36 +0200
Subject Re: [greenstone-users] encoding problem under linux

I had a similar problem with my CD-ROM collection. I solved it by
changing the XML header as follows:

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>

I also used "-input_encoding iso_8859_1" on all plugins. This tells
Greenstone that your source files are encoded in ISO-8859-1.
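As a rough illustration of why the header change above can help, here is a minimal Python sketch (not part of Greenstone; the element name and sample text are made up for the example). It assumes the metadata text is really ISO-8859-1 and shows that an XML parser rejects it when the declaration says UTF-8 but accepts it when the declaration matches the actual bytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata snippet; the element name and text are illustrative only.
BODY = '<Metadata name="Title">Caf\u00e9 M\u00fcller</Metadata>'


def build_xml(body: str, declared: str, actual: str) -> bytes:
    """Return XML bytes whose declaration says `declared` while the
    payload is really encoded as `actual`."""
    header = f'<?xml version="1.0" encoding="{declared}" standalone="no"?>\n'
    return header.encode("ascii") + body.encode(actual)


def is_well_formed(data: bytes) -> bool:
    """True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Declaration and real encoding disagree: the ISO-8859-1 bytes are not valid UTF-8.
mismatched = build_xml(BODY, declared="UTF-8", actual="iso-8859-1")

# Declaration matches the real encoding, as in the header shown above.
matched = build_xml(BODY, declared="iso-8859-1", actual="iso-8859-1")

print("mismatched well-formed:", is_well_formed(mismatched))
print("matched well-formed:   ", is_well_formed(matched))
```

Whether Greenstone itself accepts a non-UTF-8 metadata.xml may depend on the version, so treat this only as a check of the XML side of the problem.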


>>> jens wille 03-05-2004 08:35:04 >>>
John R. McPherson wrote:
> On Sun, May 02, 2004 at 03:23:15PM +0200, jens wille wrote:
>>i have to build a collection from plain text files which contain
>>non-ascii characters - originally they are encoded in ISO-8859-1
>>(windows ansi).
>>the problem now is that i use these files to create a metadata.xml
>>by extracting text and inserting it into meta tags. as a consequence
>>this yields a "not well formed" metadata.xml!
> Hi,
> in Greenstone the metadata.xml files must be encoded using UTF-8.
well, if i convert the metadata.xml to utf-8 after creating it, the
collection builds, but for almost every doc.xml i get "no plugin
could handle this file". i suppose that the doc.xml's are not
properly encoded ('file -i doc.xml' yields charset "unknown").

thank you anyway, but i'm afraid it's not as easy as that :-(


jens wille
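
For the conversion step discussed in the thread (turning ISO-8859-1 text into UTF-8 before building), a minimal Python sketch might look like the following. The function names are hypothetical helpers, and the sketch assumes the input really is ISO-8859-1; it also includes a simple check of whether some bytes already decode as UTF-8, which is roughly the question behind the "unknown" charset report.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# Sample text with non-ASCII characters, encoded the way the source files are.
sample = "Gr\u00fc\u00dfe".encode("iso-8859-1")
converted = latin1_to_utf8(sample)

print("original is UTF-8: ", is_valid_utf8(sample))
print("converted is UTF-8:", is_valid_utf8(converted))
```

Note that running the conversion on data that is already UTF-8 would double-encode it, so checking with `is_valid_utf8` first is a reasonable precaution.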
