Re: [greenstone-users] PDF plugin and number of pages

From Eduardo Trápani
Date Thu, 23 Sep 2004 12:18:13 -0200
Subject Re: [greenstone-users] PDF plugin and number of pages
In-Reply-To (41520647-5FFF4231-cs-waikato-ac-nz)
Hi Michael,

> In terms of the "number of pages" metadata, luckily this isn't too difficult to add. The pdftohtml program that Greenstone uses creates anchor tags in the HTML output (<a name=1>, <a name=2> etc.) at the start of each page. It is fairly simple to look for these tags and count them to
> determine the number of pages. I've added this code near the end of gsdl/perllib/plugins/PDFPlug.pm:
[...]

It works! Thanks.

> If you want to add the file size metadata yourself you'll need to determine the size of the original file, then call add_utf8_metadata as I've done above.

$filestat = stat($filename);
$doc_obj->add_metadata($doc_obj->get_top_section(), "FileSize", $filestat[7]);

I added the code above in sub read, BasPlug.pm, right after:

$doc_obj->add_utf8_metadata($doc_obj->get_top_section(), "Plugin", "$self->{'plugin_type'}");

But it doesn't work. What am I missing? I thought of BasPlug because I would like to have the FileSize element in all files, not just PDFs.
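
One possible cause, offered as a sketch rather than a confirmed diagnosis: in Perl, stat returns a list, so assigning it to the scalar $filestat only stores a true/false success flag, while $filestat[7] reads element 7 of a separate (and empty) array @filestat. The standalone example below uses a temporary file as a hypothetical stand-in for the document being imported and shows three ways to read the size in bytes. The variable names are illustrative, and whether $filename holds the full path inside sub read is an assumption that would need checking.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the file being imported: a 5-byte temp file.
my ($fh, $filename) = tempfile(UNLINK => 1);
print $fh "hello";
close $fh;

# List context: stat returns a 13-element list; element 7 is the size in bytes.
my @filestat = stat($filename);
my $size = $filestat[7];

# Equivalent shortcuts: a list slice, and the -s file test operator.
my $size_slice = (stat($filename))[7];
my $size_op    = -s $filename;

print "$size $size_slice $size_op\n";

# Inside the plugin, the value could then be passed on with the call
# already quoted in this message (not run here, since it needs Greenstone):
# $doc_obj->add_utf8_metadata($doc_obj->get_top_section(), "FileSize", $size);
```

With "use strict" enabled, the original two lines would fail to compile because @filestat is never declared, which is one way to catch this kind of scalar/array mix-up early.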

> Hope this helps,

It certainly did help.

I'm beginning to like the structure of the program. Too bad I'm not fluent in Perl, but then again there isn't a lot to change; it just works.

Eduardo.