Re: [greenstone-users] Problem Unicode

From Michael Dewsnip
Date Mon, 29 Nov 2004 10:10:02 +1300
Subject Re: [greenstone-users] Problem Unicode
In-Reply-To (41A6FAEC-1040203-reltech-org)
Hi Tim,

You should check that the non-ASCII characters in the metadata.xml file
are encoded as UTF-8 correctly. If you use something fairly primitive
like "less" to view the file you should see at least two bytes for
non-ASCII characters like "ö". If there is only one byte then it is
likely to be encoded as ISO 8859-1, which won't work.
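
The check above can also be scripted. What follows is only a rough sketch
in Python, not anything that ships with Greenstone: the function names
classify_bytes and latin1_to_utf8 are made up for illustration, and the
conversion assumes the file really is ISO 8859-1 when it fails to decode
as UTF-8.

```python
def classify_bytes(data: bytes) -> str:
    """Classify raw file bytes as 'ascii', 'utf-8' or 'not-utf-8'."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte such as 0xF6 (ISO 8859-1 "ö") is not valid UTF-8.
        return "not-utf-8"
    # Valid UTF-8; check whether any byte is outside the ASCII range.
    if all(b < 0x80 for b in data):
        return "ascii"
    return "utf-8"


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes ASSUMED to be ISO 8859-1 as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# "ö" is two bytes in UTF-8 but a single byte in ISO 8859-1.
utf8_name = "W. Crönert".encode("utf-8")
latin1_name = "W. Crönert".encode("iso-8859-1")

print(classify_bytes(utf8_name))
print(classify_bytes(latin1_name))
print(classify_bytes(latin1_to_utf8(latin1_name)))
```

To try it on a real file, read it in binary mode, for example
open("metadata.xml", "rb").read(), and pass the result to
classify_bytes. A "not-utf-8" result matches the one-byte case
described above.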

All the best,

Michael

Tim Finney wrote:

> I would like to use names that include diacritics in the metadata.xml
> files for a collection.
>
> Here is an example:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <DirectoryMetadata>
> <FileSet>
> <FileName>.*.html</FileName>
> <Description>
> <Metadata name="EdID">P.Herc.208 col. 12b</Metadata>
> <Metadata name="EdTitle">In Platonis Lysin</Metadata>
> <Metadata name="EdCreator">W. Crönert</Metadata>
> </Description>
> </FileSet>
> </DirectoryMetadata>
>
> When I build the collection, any HTML file(s) associated with metadata
> files that include characters like ö (LATIN SMALL LETTER O WITH
> DIAERESIS) fail to appear.
>
> This happens with Fedora Core 1 + Greenstone 2.50 and Fedora Core 2 +
> Greenstone 2.51.
>
> Any ideas what might be wrong?
>
> Best
>
> Tim Finney