Date: Thu, 23 Sep 2004 11:09:59 +1200
Subject: Re: [greenstone-users] PDF plugin and number of pages
Yes, these are two useful bits of metadata that PDFPlug should be extracting automatically. In fact, we decided recently that all plugins should extract file size metadata, so hopefully this will make it into the next release.
As for the "number of pages" metadata, luckily this isn't too difficult to add. The pdftohtml program that Greenstone uses creates anchor tags in the HTML output (<a name=1>, <a name=2> etc.) at the start of each page. It is fairly simple to look for these tags and count them to get the number of pages.
# Add NumPages metadata (we have "<a name=1>" etc for each page)
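A minimal, standalone sketch of the counting step is below. It assumes the anchors look like <a name=1> or <a name="1">, with numeric names only. The helper name count_pdf_pages and the sample HTML are made up for illustration. The commented-out lines at the end show how the result might be attached inside a plugin, assuming $textref refers to the converted HTML and $doc_obj is the document object passed to the plugin's process routine.

```perl
use strict;
use warnings;

# Count the page anchors (<a name=1>, <a name="2">, ...) in pdftohtml output.
# Each distinct numeric anchor name is counted once; other anchors are ignored.
sub count_pdf_pages {
    my ($html) = @_;
    my %seen;
    while ($html =~ /<a\s+name="?(\d+)"?\s*>/gi) {
        $seen{$1} = 1;
    }
    return scalar(keys %seen);
}

# Hypothetical sample input, standing in for the converted PDF text.
my $sample = '<a name=1></a>Page one<br>'
           . '<a name=2></a>Page two<br>'
           . '<A NAME="3"></A>Page three';

my $num_pages = count_pdf_pages($sample);
print "NumPages: $num_pages\n";   # prints "NumPages: 3"

# Inside a plugin this might become (assumed context, not run here):
#   my $num_pages = count_pdf_pages($$textref);
#   $doc_obj->add_utf8_metadata($doc_obj->get_top_section(),
#                               "NumPages", $num_pages);
```

Counting distinct names rather than raw matches means a repeated anchor would not inflate the total.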
If you want to add the file size metadata yourself you'll need to determine the size of the original file, then call add_utf8_metadata as I've done above.
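For the file size, a rough sketch using Perl's built-in stat is below. The helper name file_size_in_bytes, the temporary file, and the "FileSize" metadata name in the final comment are assumptions for illustration. In a plugin you would pass the full path of the original PDF instead.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the size of a file in bytes, or undef if the file cannot be stat'ed.
sub file_size_in_bytes {
    my ($filename) = @_;
    my @st = stat($filename) or return;
    return $st[7];    # element 7 of stat() is the size in bytes
}

# Create a throwaway file of known length to stand in for the original PDF.
my ($fh, $tmpname) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "x" x 1234;
close $fh;

my $size = file_size_in_bytes($tmpname);
print "FileSize: $size\n";   # prints "FileSize: 1234"

# Inside a plugin this might become (assumed context, not run here):
#   $doc_obj->add_utf8_metadata($doc_obj->get_top_section(),
#                               "FileSize", $size);
```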
Hope this helps,
Eduardo Trápani wrote: