What a great application for Greenstone!
I am pretty sure that ImagePlug does not extract text embedded in TIFF files.
You will have to download and install ImageMagick to try it out.
OCR software often has an option to make PDF files - so you could use
PDFPlug if that is the case.
I have successfully done this - you may need to use the '-complex' flag
and install Ghostscript.
BTW, you could also use EMAILPlug to make all your incoming correspondence
searchable - maybe with WordPlug for attachments.
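
As a rough sketch of the suggestions above, the plugin list in the
collection's etc/collect.cfg might contain lines like the ones below. This
assumes a Greenstone 2 collection using the standard plugin names; the
'-complex' option is the PDFPlug flag mentioned above and may need
Ghostscript installed. Any other plugin lines already in the file would
stay as they are.

```
# etc/collect.cfg (fragment, illustrative only; assumes a Greenstone 2 collection)
plugin EMAILPlug          # incoming correspondence
plugin PDFPlug -complex   # PDFs produced by the OCR software
plugin WordPlug           # Word attachments
```

The collection needs to be rebuilt after changing the plugin list for the
new settings to take effect.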
Stephen De Gabrielle
Tim Finney <email@example.com>
Sent by: firstname.lastname@example.org
21/01/2005 01:09 PM
Subject: [greenstone-devel] TIFF plugin?
At work we have a scanner that creates TIFF files of documents and
somehow puts an OCRed text inside the TIFF file. It would be super cool
to have a plugin that allows files of this sort to be imported.
It would be ideal to be able to build a collection of such documents
indexed according to the OCRed texts but with a link to the
corresponding image. This would make it easy to make electronic
(There must be a lot of places that want to scan all of their archives
and incoming correspondence and then build a Greenstone collection to be
able to search them.)