RE: [greenstone-devel] A new HTML processing plugin

From Emanuel Dejanu
Date Fri, 27 May 2005 08:51:54 +0300
Subject RE: [greenstone-devel] A new HTML processing plugin
In-Reply-To (4296AB3B-9010801-cs-waikato-ac-nz)
Hi Michael,

There is no problem with putting the new HTML plug-in code in HTMLPlug.
It was simpler for me this way because we have changed the Greenstone Perl
code a lot, and it is very hard for me to keep it in sync with Greenstone
CVS. It was also easier for me to send the code to you without sending all
the changes that we have made in HTMLPlug.

If you want, please include it as an option in HTMLPlug; I do
not see any technical problem.

Some changes that we have in our code, and that I think are worth the GSDL
team's attention, are:
- Multilanguage metadata (metadata.xml)

Example:

<Metadata name="Title">Default title</Metadata>
<Metadata name="Title" lang="fr">This is a french title</Metadata>
<Metadata name="Title" lang="es">This is a spanish title</Metadata>
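
As a rough illustration of how such entries could be consumed (this is not the Greenstone Perl code), here is a minimal Python sketch. It assumes that an entry without a lang attribute acts as the default when the requested language is missing; that fallback rule, the wrapping Description element, and the function name pick_metadata are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# The fragment from the message, wrapped in a <Description> element
# (assumed here) so that it parses as a single XML document.
SAMPLE = """<Description>
<Metadata name="Title">Default title</Metadata>
<Metadata name="Title" lang="fr">This is a french title</Metadata>
<Metadata name="Title" lang="es">This is a spanish title</Metadata>
</Description>"""


def pick_metadata(xml_text, name, lang=None):
    """Return the value of `name` for `lang`.

    Falls back to the first entry that has no lang attribute (assumed
    to be the default), or None if there is no such entry.
    """
    root = ET.fromstring(xml_text)
    default = None
    for elem in root.iter("Metadata"):
        if elem.get("name") != name:
            continue
        elem_lang = elem.get("lang")
        if elem_lang is None:
            if default is None:
                default = elem.text
        elif elem_lang == lang:
            return elem.text
    return default
```

For example, pick_metadata(SAMPLE, "Title", "fr") selects the French entry, while an unlisted language such as "de" falls back to the default title.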

- Section metadata (metadata.xml)

<Description><!-- This is top section metadata -->
</Description>
<Sections>
<Section secid="1.1">
<Metadata name="AssocLang">en:s2933e;fr:s2956f.13.4</Metadata>
<Metadata name="AuthorIdx" mode="accumulate">7</Metadata>
</Section>
</Sections>

Please note that this changes the metadata parameter, which now has a
different structure.
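
To make the per-section layout concrete, here is a minimal Python sketch (again, not the Greenstone Perl code) that reads the Sections block into a dictionary keyed by secid. It assumes that mode="accumulate" appends a value to any earlier values of the same name, and that an entry without that mode replaces them; the Wrapper root element and the function name read_section_metadata are hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment from the message inside a hypothetical <Wrapper> root,
# added only so that it parses as a single XML document.
SAMPLE = """<Wrapper>
<Description><!-- This is top section metadata -->
</Description>
<Sections>
<Section secid="1.1">
<Metadata name="AssocLang">en:s2933e;fr:s2956f.13.4</Metadata>
<Metadata name="AuthorIdx" mode="accumulate">7</Metadata>
</Section>
</Sections>
</Wrapper>"""


def read_section_metadata(xml_text):
    """Map each section id to {metadata name: [values]}."""
    root = ET.fromstring(xml_text)
    result = {}
    for section in root.iter("Section"):
        values = result.setdefault(section.get("secid"), {})
        for elem in section.findall("Metadata"):
            name = elem.get("name")
            text = elem.text or ""
            if elem.get("mode") == "accumulate":
                # Assumed semantics: keep earlier values and add this one.
                values.setdefault(name, []).append(text)
            else:
                # Assumed semantics: replace any earlier values.
                values[name] = [text]
    return result
```

Keeping a list of values per name is one simple way to represent accumulated metadata; a single-valued entry is then just a list of length one.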

Best regards,

Emanuel Dejanu

-----Original Message-----
From: Michael Dewsnip [mailto:mdewsnip@cs.waikato.ac.nz]
Sent: Friday, May 27, 2005 8:08 AM
To: Emanuel Dejanu
Cc: greenstone-devel@list.scms.waikato.ac.nz
Subject: Re: [greenstone-devel] A new HTML processing plugin

Hi Emanuel,

Thanks very much for your e-mail, and for the new plugin. The ability to
split HTML documents into sections based on the heading tags is obviously
very useful, and something that we've been trying to find the time to do for
ages!

We'd definitely like to include this in Greenstone. However I did wonder
whether this would be better as an option to the existing HTMLPlug, rather
than a new plugin (even if it does inherit from HTMLPlug). Did you
investigate this and find that the new code didn't fit well in the existing
plugin?

Thanks again for your contribution!

All the best,

Michael

Emanuel Dejanu wrote:

>Hi,
>
>Maybe this will also be of help to you.
>This new plugin splits an HTML document into sections using the heading
>tags (h1, h2, ..., hN).
>
>You can also specify additional metadata using a comment inside the
>heading tag:
>
><!--gsdl-metadata
><Metadata name="Example">This is the metadata value for metadata
>&quot;Example&quot;</Metadata>
>-->
>
>Full example:
>
><h2>1. Infections<!--gsdl-metadata
><Metadata name="AssocCtry">ken:h4329e.16;tza:h4336e.4.17</Metadata>
>-->
></h2>
>
>
>The metadata "Title" will have the value "1. Infections" for this section.
>
>
>Best regards,
>
>Emanuel Dejanu
>
>
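
The heading-based splitting described in the quoted message can be sketched roughly as follows. This is a minimal Python illustration, not the plugin's Perl code: it starts a new section at each h1 to h6 tag, uses the heading text as the section title, and reads Metadata elements from a gsdl-metadata comment inside the heading. The class name HeadingSplitter, the function split_sections, and the choice to ignore text before the first heading are assumptions of this sketch.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

HEADING = re.compile(r"^h[1-6]$")
MARKER = "gsdl-metadata"


class HeadingSplitter(HTMLParser):
    """Collect one section per heading: level, title, metadata, body text."""

    def __init__(self):
        super().__init__()
        self.sections = []
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if HEADING.match(tag):
            self.sections.append(
                {"level": int(tag[1]), "title": "", "metadata": {}, "text": ""}
            )
            self.in_heading = True

    def handle_endtag(self, tag):
        if HEADING.match(tag) and self.in_heading:
            self.in_heading = False
            self.sections[-1]["title"] = self.sections[-1]["title"].strip()

    def handle_data(self, data):
        if not self.sections:
            return  # text before the first heading is ignored in this sketch
        key = "title" if self.in_heading else "text"
        self.sections[-1][key] += data

    def handle_comment(self, data):
        body = data.strip()
        if self.in_heading and body.startswith(MARKER):
            # Wrap the comment body so that it parses as one XML document.
            wrapped = "<m>" + body[len(MARKER):] + "</m>"
            for elem in ET.fromstring(wrapped).iter("Metadata"):
                self.sections[-1]["metadata"][elem.get("name")] = elem.text


def split_sections(html_text):
    """Return the list of sections found in html_text."""
    parser = HeadingSplitter()
    parser.feed(html_text)
    parser.close()
    return parser.sections
```

Fed the full example from the quoted message, the sketch yields one level-2 section titled "1. Infections" whose metadata contains the AssocCtry value. Nesting of sections by heading level is left out to keep the example short.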