Re: [greenstone-users] Fields pdf and metadata

Date: Mon, 11 Oct 2004 10:32:00 +0200
Subject: Re: [greenstone-users] Fields pdf and metadata
In-Reply-To: (1097187785-23774-33-camel-puriri-cs-waikato-ac-nz)
Hi John

I have installed the 3 files.
I now see a new metadata field: "NumPages".
I still can't find the author or date metadata.
I am sending you the PDF and XML files in a separate mail.


P.L. Rossi

At 11:23 08/10/2004 +1300, John R. McPherson wrote:
>On Thu, 2004-10-07 at 01:02, wrote:
> > Hi,
> > I would like to know if it is possible to extract pdf fields (like author,
> > subject, keywords)
> > as metadata in greenstone and index them ... like the title field.
>We use a third-party program, called "pdftohtml", to extract text from
>.PDF files. Unfortunately it doesn't seem to extract all the document
>metadata, but it does extract title, date, and author.
>We have an updated PDF plugin that will now use the date and author
>metadata from the file, which you can get below. It won't get the
>keywords or subject metadata though. Note that the author metadata will
>be renamed to "Creator", and date will be renamed to "Date" metadata
>inside greenstone.
>Save this file into the <greenstone dir>/perllib/plugins directory:
>This PDF plugin also needs updated versions of the following 2 files:
>(saved to perllib/plugins)
>(saved to perllib).
>John McPherson
>greenstone-users mailing list
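
For reference, the renaming described in the quoted message (author becomes "Creator", date becomes "Date", keywords and subject are not picked up) can be sketched as below. This is only an illustrative Python sketch, not the plugin's actual Perl code. It assumes the HTML produced from the PDF carries the values in <meta name="..." content="..."> tags. The names MetaCollector and extract_metadata and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser

# Mapping taken from the quoted message: author -> Creator, date -> Date.
# Keywords and subject are deliberately absent, since they are not extracted.
RENAME = {"author": "Creator", "date": "Date"}


class MetaCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name and content:
            self.found[name] = content


def extract_metadata(html_text):
    """Return only the fields listed in RENAME, under their new names."""
    collector = MetaCollector()
    collector.feed(html_text)
    return {RENAME[k]: v for k, v in collector.found.items() if k in RENAME}


# Hypothetical converter output, for illustration only.
sample = """<html><head><title>Report</title>
<meta name="author" content="A. Author">
<meta name="date" content="2004-10-08">
<meta name="keywords" content="pdf, metadata">
</head><body></body></html>"""

print(extract_metadata(sample))
```

If a PDF has no author or date set in its document properties, a mapping like this has nothing to rename, which would be one possible reason for seeing only "NumPages".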