Re: [greenstone-users] Collage; Pdf files

From Michael Dewsnip
Date Fri, 16 Dec 2005 16:24:33 +1300
Subject Re: [greenstone-users] Collage; Pdf files
In-Reply-To (OF45080492-0DBF4F68-ONCC2570CE-00783B46-CC2570D6-000184B3-niro-dk)
Hi Svetlana,

>1 - PDF files. The collection search would return the pdf files
>represented as text & pictures extracted separately (one icon), and as a
>full pdf document (second icon). The first type comes up OK. The second
>(full pdf file) comes up with the error message: "The file is damaged and
>could not be repaired". (The actual file is fine, and not damaged).
>Other pdf files start download through the web browser and end up just as
>a blank page and message "Done" in the bottom of the browser window. What
>could be wrong?
When you click on the PDF icon to access the PDF file you are accessing
this file directly (without Greenstone's intervention), so either your
browser is handling these badly (can you view PDFs on other sites OK?)
or the PDFs in the built collection have been corrupted somehow. This is
unlikely, but you should look in the "index/assoc" directory of the
collection and check the doc.pdf files in the HASH directories (by
opening them) to make sure they are undamaged.
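
If there are too many HASH directories to open every doc.pdf by hand, a small script can do a first pass. The following is only a rough sketch in Python, not part of Greenstone: the function names are made up for illustration, and it only checks that each file is non-empty, starts with a "%PDF-" header and has an "%%EOF" marker near the end. A file that passes can still be damaged in other ways, so open anything suspicious in a PDF viewer as well.

```python
from pathlib import Path


def check_pdf(path):
    """Return a list of problems found in one PDF file (empty list if none).

    This is a shallow sanity check, not a full PDF validation.
    """
    data = Path(path).read_bytes()
    if not data:
        return ["file is empty"]
    problems = []
    # A PDF normally begins with a "%PDF-" header; allow some leading junk.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header")
    # A complete PDF normally ends with an "%%EOF" marker; a truncated
    # copy usually lacks it.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker")
    return problems


def check_assoc_dir(assoc_dir):
    """Map every doc.pdf found under assoc_dir to its list of problems."""
    results = {}
    for pdf in sorted(Path(assoc_dir).rglob("doc.pdf")):
        results[str(pdf)] = check_pdf(pdf)
    return results
```

Calling check_assoc_dir() with the path to your collection's "index/assoc" directory returns a dictionary in which any doc.pdf with a non-empty list deserves a closer look.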

>2 - Is there a way of showing PDF files in the search results as a
>thumbnail (similar to image files?)
This will require a bit of manual work on your part, or some changes to
PDFPlug (which will require some Perl programming knowledge).

You can manually open the PDFs and save the first page as a JPEG image,
then scale it to the correct size. You'll need to name this image the
same as the PDF (eg. if the pdf is "document1.pdf" then the image must
be "document1.jpg"), and put it in the import directory along with the
PDF. Then re-import and re-build your collection.
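
Before re-importing, it may help to check that every PDF actually has its matching image. Here is a minimal sketch in Python (again not part of Greenstone; the function name is hypothetical) that lists the PDFs in an import directory with no same-named .jpg beside them. It assumes lowercase ".pdf" and ".jpg" extensions, so adjust it if your files are named differently.

```python
from pathlib import Path


def pdfs_missing_cover(import_dir):
    """Return the PDFs under import_dir that have no same-named .jpg next to them."""
    missing = []
    for pdf in sorted(Path(import_dir).rglob("*.pdf")):
        # "document1.pdf" needs "document1.jpg" in the same directory.
        if not pdf.with_suffix(".jpg").exists():
            missing.append(pdf)
    return missing
```

Run it on your collection's import directory; an empty list means every PDF has an image with the expected name.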

Lastly, you'll need to change the formatting to display this image
instead of the PDF icon. In the "Format Features" part of GLI's Design
pane, edit the VList format statement and replace the "[srcicon]" bit
with "{If}{[hascover],<img src='[DocImage]'>,[srcicon]}".

>3 - Collage. I did not find much correspondence in the Greenstone-users
>conference. Where can I read more documentation on setting to work right?
>(At the moment it is really slow; I cannot work it out, how it actually
>classifies the documents - it gives some cascading structure which does
>not seem logical...)
The Collage applet seems to be in a bad state at the moment. I'm
planning to do some work on it next week, and I'll let you know when a
new version is ready.