Re: [greenstone-users] Questions about building collections

From Michael Dewsnip
Date Tue, 14 Feb 2006 12:15:39 +1300
Subject Re: [greenstone-users] Questions about building collections
In-Reply-To (s3f09d8c-090-state-tn-us)
Hi Kelly,

> I am coming up with an error that says Error:PDF version 1.5 - - xpdf
> supports version 1.4 what do I need to do to fix this.

Unfortunately the PDF processing package included with Greenstone
(pdftohtml) hasn't been updated for some time, and doesn't support PDF
versions newer than 1.4. You could try opening the PDFs in Acrobat and
saving them as an older version (I've never tried this, so I'm not sure
if it's even possible).
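
If you have a lot of PDFs, it may help to find out up front which ones
are affected. The following is only a rough sketch, not part of
Greenstone: it assumes the version xpdf complains about is the one in
the "%PDF-x.y" header at the start of each file, and the function names
and the folder argument are made up for illustration. Note that a PDF
can also declare a newer version inside its document catalog, which
this check does not look at.

```python
import re
import sys
from pathlib import Path

# A PDF file begins with a header such as "%PDF-1.5".
HEADER_RE = re.compile(rb"%PDF-(\d+)\.(\d+)")


def pdf_version(data):
    """Return (major, minor) from the PDF header bytes, or None if absent."""
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def newer_than(version, limit=(1, 4)):
    """True if a header version was found and it is above the limit."""
    return version is not None and version > limit


def find_newer_pdfs(folder, limit=(1, 4)):
    """List (path, version) for *.pdf files under folder above the limit."""
    results = []
    for path in sorted(Path(folder).rglob("*.pdf")):
        with open(path, "rb") as handle:
            version = pdf_version(handle.read(1024))
        if newer_than(version, limit):
            results.append((path, version))
    return results


if __name__ == "__main__":
    # Hypothetical usage: python check_pdfs.py path/to/import/folder
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, version in find_newer_pdfs(folder):
        print(f"{path}: PDF {version[0]}.{version[1]}")
```

Any files it lists are the ones worth re-saving as an older version or
handling with one of the workarounds below.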

If that doesn't work, things start to get a bit messy. If you don't care
that no text is extracted you can process the PDFs using UnknownPlug, or
use "pagedimg_jpg" for PDFPlug's -convert_to option to get an image for
each page of the document. Otherwise, you'll probably have to save the
PDFs as HTML from within Acrobat and then associate the PDFs with the
HTML files (we can give you more details about this if you get to it).

> Also I am getting an error that says : AZList: HASH then a number
> metadata has become empty. What do I need to do to fix these errors.

This is just a warning, and means that the specified document (the HASH
identifier in the message) has no value for the metadata the classifier
is built on. For example, you might be building a classifier on titles,
but one document has no title metadata.