RE: [greenstone-devel] Import fails with Russian metadata

From: Emanuel Dejanu / Simple Words
Date: Thu, 22 Apr 2004 10:18:20 +0300
Subject: RE: [greenstone-devel] Import fails with Russian metadata
In-Reply-To: (4086F757-8000405-cs-waikato-ac-nz)

You can use iconv to convert from KOI8-R or Windows-1251 to UTF-8.

On Windows, add the iconv lib directory to your PATH.

Then run the command:

C:\Libs\iconv\util\iconv -f KOI8-R -t UTF-8 metadata.xml > metadata_utf8.xml

-- or --

C:\Libs\iconv\util\iconv -f WINDOWS-1251 -t UTF-8 metadata.xml > metadata_utf8.xml
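
If iconv is not at hand, the same re-encoding can be sketched in Python. This is only an illustration, not part of Greenstone: the function name convert_to_utf8 is made up here, and the codec names "koi8_r" and "cp1251" are Python's names for KOI8-R and Windows-1251. It assumes you already know which of the two encodings the file uses.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding):
    """Re-encode the file at src from source_encoding and write it to dst as UTF-8.

    Hypothetical helper for illustration; raises UnicodeDecodeError if the
    input bytes are not valid in source_encoding.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Example calls (file names taken from the commands above):
#   convert_to_utf8("metadata.xml", "metadata_utf8.xml", "koi8_r")
#   convert_to_utf8("metadata.xml", "metadata_utf8.xml", "cp1251")
```

If metadata.xml starts with an XML declaration that names the old encoding, that declaration would probably need to be changed to UTF-8 as well.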


Binaries for Windows are available here (Unix builds are easy to find):
ftp://ftp.zlatkovic.com/pub/libxml/

If you cannot get it, send me an e-mail and I will send it to you.
But I think that you are on a Unix-compatible system.


Best regards,

Emanuel Dejanu

-----Original Message-----
From: greenstone-devel-bounces@list.scms.waikato.ac.nz
[mailto:greenstone-devel-bounces@list.scms.waikato.ac.nz] On Behalf Of John
R. McPherson
Sent: Thursday, April 22, 2004 1:36 AM
To: Doug Carter
Cc: Greenstone Mailing List
Subject: Re: [greenstone-devel] Import fails with Russian metadata

Doug Carter wrote:
> Hi all,
>
> I've got a problem importing a collection that has Russian characters
> in the metadata.xml file. I thought that there was support for foreign
> character sets, so I don't know how to go about fixing this.
>
> When I import, the RecPlug dies with a parse error:
>
> Uncaught exception from user code:
> RecPlug: ERROR /usr/local/gsdl/collect/progdev/import/metadata.xml is not a well formed metadata.xml file (
> not well-formed (invalid token) at line 4486, column 44, byte 190213 at /usr/local/gsdl-build/perllib/cpan/XML/Parser.pm line 187
> )
>
> The character it doesn't like is the *second* Russian character in
> the metadata field.
>
> Any ideas?

Hi,
I think that the metadata.xml files must be encoded in Unicode UTF-8.
You can have Russian (or anything) as long as it is UTF-8, and not in a
Cyrillic encoding (e.g. KOI8-R or Windows-1251).

John McPherson
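
A quick way to check whether a file is already valid UTF-8, and where the first offending byte sits, is sketched below in Python. This is an assumption-laden illustration rather than anything Greenstone provides: the function name first_invalid_utf8_offset is made up, and the offset it reports is where Python's decoder gives up, which need not match the byte number in the XML parser's message.

```python
def first_invalid_utf8_offset(data):
    """Return the byte offset of the first invalid UTF-8 sequence in data.

    Hypothetical helper for illustration; returns None when the whole
    byte string decodes cleanly as UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Example use on the file from the report:
#   with open("metadata.xml", "rb") as f:
#       print(first_invalid_utf8_offset(f.read()))
```

A result of None means the file decodes as UTF-8; any other value points at the first byte sequence that would need re-encoding.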