Date: Mon, 03 May 2004 08:35:04 +0200
Subject: Re: [greenstone-users] encoding problem under linux
John R. McPherson wrote:
> On Sun, May 02, 2004 at 03:23:15PM +0200, jens wille wrote:
>>i have to build a collection from plain text files which contain
>>non-ascii characters - originally they are encoded in ISO-8859-1.
>>the problem now is that i use these files to create a metadata.xml
>>by extracting text and inserting it into meta tags. as a consequence
>>this yields a "not well formed" metadata.xml!
> in Greenstone the metadata.xml files must be encoded using UTF-8.
well, if i convert the metadata.xml to utf-8 after creating it, the
collection builds, but for almost every doc.xml i get "no plugin
could handle this file". i suppose that the doc.xml's are not
properly encoded ('file -i doc.xml' yields charset "unknown").
thank you anyway, but i'm afraid it's not as easy as that :-(
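
for reference, a minimal sketch of the conversion step discussed above: decode the ISO-8859-1 text, escape the XML special characters, and only then write it out as UTF-8. the helper names (to_xml_text, is_valid_utf8), the sample string and the Metadata element used for illustration are assumptions, not anything taken from greenstone itself, and this does not address the "no plugin could handle this file" message.

```python
from xml.sax.saxutils import escape


def to_xml_text(raw: bytes, source_encoding: str = "iso-8859-1") -> str:
    """Decode bytes from the source encoding and escape &, < and > for XML."""
    return escape(raw.decode(source_encoding))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample text, as it would be read from an ISO-8859-1 file.
raw = "Müller & Söhne".encode("iso-8859-1")

# Illustrative element only; the real metadata.xml layout is assumed here.
fragment = '<Metadata name="Title">' + to_xml_text(raw) + "</Metadata>"

# Encode once, at write time, so the file on disk is UTF-8.
encoded = fragment.encode("utf-8")

print(is_valid_utf8(raw))      # the original ISO-8859-1 bytes
print(is_valid_utf8(encoded))  # the converted fragment
```

the same is_valid_utf8 check can be run over the bytes of each generated file to see which ones are not actually UTF-8.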