Re: [greenstone-users] Fields pdf and metadata

From Pier.Luigi.Rossi@bondy.ird.fr
Date Mon, 11 Oct 2004 10:32:00 +0200
Subject Re: [greenstone-users] Fields pdf and metadata
In-Reply-To (1097187785-23774-33-camel-puriri-cs-waikato-ac-nz)
Hi John

I have installed the 3 files.
I now find one new metadata field: "NumPages".
I still can't find the author or date metadata.
I am sending you the PDF and XML files in a separate mail.

Regards

P.L. Rossi
IRD

At 11:23 08/10/2004 +1300, John R. McPherson wrote:
>On Thu, 2004-10-07 at 01:02, Pier.Luigi.Rossi@bondy.ird.fr wrote:
> > Hi,
> > I would like to know if it is possible to extract pdf fields (like author,
> > subject, keywords)
> > as metadata in Greenstone and index them, like the title field.
>
>Hi,
>we use a 3rd-party program, called "pdftohtml", to extract text from
>.PDF files. Unfortunately it doesn't seem to extract all the document
>metadata, but it does extract title, date, and author.
>
>We have an updated PDF plugin that will now use the date and author
>metadata from the file, which you can get below. It won't get the
>keywords or subject metadata though. Note that the author metadata will
>be renamed to "Creator", and date will be renamed to "Date" metadata
>inside greenstone.
>
>Save this file into the <greenstone dir>\perllib\plugins directory:
>http://www.greenstone.org/tmp/PDFPlug.pm
>
>This PDF plugin also needs updated versions of the following 2 files:
>http://www.greenstone.org/tmp/HTMLPlug.pm
>(saved to perllib\plugins)
>and
>http://www.greenstone.org/tmp/unicode.pm
>(saved to perllib).
>
>John McPherson
>
>
>_______________________________________________
>greenstone-users mailing list
>greenstone-users@list.scms.waikato.ac.nz
>https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
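
The install steps quoted above (two files into the plugins directory, one into perllib) could be sketched roughly as follows. This is only an illustration: the function names and the greenstone_dir parameter are hypothetical, the layout assumes that the plugins directory sits inside perllib, and the temporary URLs from the message may no longer resolve.

```python
from pathlib import Path

# URLs as given in the quoted message; the sub-directories are an
# assumption (plugins is taken to live inside perllib).
FILES = {
    "http://www.greenstone.org/tmp/PDFPlug.pm": ("perllib", "plugins"),
    "http://www.greenstone.org/tmp/HTMLPlug.pm": ("perllib", "plugins"),
    "http://www.greenstone.org/tmp/unicode.pm": ("perllib",),
}


def plan_install(greenstone_dir):
    """Return {url: destination Path} for the three updated files."""
    root = Path(greenstone_dir)  # hypothetical: your Greenstone install dir
    plan = {}
    for url, subdirs in FILES.items():
        filename = url.rsplit("/", 1)[-1]  # e.g. "PDFPlug.pm"
        plan[url] = root.joinpath(*subdirs, filename)
    return plan


def install(greenstone_dir):
    """Download each file to its planned destination (needs network access)."""
    from urllib.request import urlretrieve

    for url, dest in plan_install(greenstone_dir).items():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(dest))  # overwrites any existing copy
```

Calling plan_install() first shows where each file would go without touching the disk, which is a cheap way to confirm that all three files end up in the intended directories before rebuilding the collection.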