RE: [greenstone-devel] Metadata.xml not well formed

From Diego Spano
Date Wed, 6 Aug 2003 16:52:52 -0300
Subject RE: [greenstone-devel] Metadata.xml not well formed
In-Reply-To (20030805215247-GE6174-wesson-cs-waikato-ac-nz)
Helio, as John said, you don't need to write HTML &#...; codes. Download
jEdit (www.jedit.org) and use it to write metadata.xml files. You can
configure this editor to save files in UTF-8, so you don't need to worry
about special characters: simply write the text in natural language and
jEdit saves it as UTF-8. Greenstone will have no problems with this
file.

jEdit has a "Loading/Saving" configuration option, and inside this topic
there is a "Default character encoding" setting; select UTF-8 there.
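
To confirm that the editor really saved the file as UTF-8, a quick check
along these lines may help. It is only a sketch: the function name
check_metadata_file is made up for illustration, and it tests nothing more
than UTF-8 validity and XML well-formedness, which is not necessarily
everything Greenstone itself checks.

```python
import xml.etree.ElementTree as ET


def check_metadata_file(path):
    """Return "ok" or a short description of the first problem found.

    Hypothetical helper: it only checks that the bytes decode as UTF-8
    and that the document is well-formed XML.
    """
    with open(path, "rb") as handle:
        raw = handle.read()

    # Step 1: the raw bytes must be valid UTF-8.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as error:
        return f"not valid UTF-8: {error}"

    # Step 2: the document must parse as XML.
    try:
        ET.fromstring(raw)
    except ET.ParseError as error:
        return f"not well-formed XML: {error}"

    return "ok"
```

Calling check_metadata_file("metadata.xml") on a file saved in a Latin
encoding with accented characters would typically report the UTF-8
problem first.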

Hope you understand what I'm saying.

Lic. Diego Spano
Archivo Digital
Secretaria de Derechos Humanos
djspano@jus.gov.ar

-----Original Message-----
From: greenstone-devel-bounces@list.scms.waikato.ac.nz
[mailto:greenstone-devel-bounces@list.scms.waikato.ac.nz] On behalf of
John R. McPherson
Sent: Tuesday, August 05, 2003 06:53 p.m.
To: Bethany Letalien
CC: 'greenstone-devel@list.scms.waikato.ac.nz'; Helio Kuramoto
Subject: Re: [greenstone-devel] Metadata.xml not well formed


On Tue, Aug 05, 2003 at 05:00:31PM -0500, Bethany Letalien wrote:
> Helio, you can't use accents in the metadata files like that. Try removing
> them. Make it run, then do it again with the accents in place, but use
> UNICODE (http://www.unicode.org/). You're set to UTF-8, which is right.
> You need to start the statements with &#x and then end them with ; if
> you're using the sets at http://www.unicode.org/charts/PDF/U0080.pdf (also
> look at http://www.unicode.org/charts/PDF/U0000.pdf for special character
> it won't process if typed normally). For example, informacao becomes
>
> informa&#x00E7;&#x00E3;o
>
> I just got through converting Portuguese accents to UNICODE in a
> metadata.xml file, so feel free to e-mail me off list as well in either
> language. Also, I'd be curious to hear about your work at IBICT....

Hi,
you don't have to use the HTML &#...; codes. You can use the accented
characters, as long as they are encoded in UTF-8 and not in a Latin
(ISO-8859 or Windows codepage 1252) character set.
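
If a file already contains &#...; codes, something like the following
could turn them back into plain accented characters. This is a rough
sketch, not anything Greenstone provides: the name decode_numeric_refs is
invented here, and references to XML-special characters (&, <, >, quotes)
are deliberately left alone so the file stays well formed.

```python
import re

# Matches hexadecimal (&#x00E7;) and decimal (&#231;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped inside XML text and attributes.
_XML_SPECIAL = set("&<>\"'")


def decode_numeric_refs(text):
    """Replace numeric character references with the characters themselves."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        char = chr(code_point)
        # Keep the reference as-is when decoding it would break the XML.
        return match.group(0) if char in _XML_SPECIAL else char

    return _NUMERIC_REF.sub(replace, text)
```

The result still has to be written out as UTF-8 for the accented
characters to be read correctly.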

I'm not sure how to do this with Windows text editors - on Linux
you can use the "iconv" program to convert files from one encoding to
another.
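
Where iconv is not available (on Windows, for instance), a few lines of
Python can do the same kind of conversion. This is a minimal sketch: the
function name convert_to_utf8 is made up, and the default source encoding
of cp1252 is an assumption that has to match how the file was really
saved.

```python
def convert_to_utf8(source_path, target_path, source_encoding="cp1252"):
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption about the input file; use
    "iso-8859-1" or another codec name if that is what the editor wrote.
    """
    # newline="" keeps the original line endings untouched.
    with open(source_path, "r", encoding=source_encoding, newline="") as source:
        text = source.read()

    with open(target_path, "w", encoding="utf-8", newline="") as target:
        target.write(text)
```

For example, convert_to_utf8("metadata-latin.xml", "metadata.xml") would
write a UTF-8 copy (both file names are placeholders). If the XML
declaration in the file names the old encoding, it needs to be updated by
hand, since this sketch does not touch the file contents.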

John McPherson
