[greenstone-users] PDFs - OCR metadata - Adobe Acrobat Professional

From Diego Nicolás Casar González
Date Wed May 21 05:05:12 2008
Subject [greenstone-users] PDFs - OCR metadata - Adobe Acrobat Professional
Dear Greenstone list,

For the past few months I've been trying several OCR programs (mostly open
source or Linux oriented), as I need Greenstone to be able to search the
text of image PDFs.

Recently I realized that Adobe Acrobat Professional has an interesting
implementation of OCR: it merges the OCR data into the PDF itself, allowing
the user to search the document (with most PDF readers, e.g. Acrobat Reader,
KPDF, etc.) and highlighting the results. AFAIK, that functionality was
available only for text PDFs.
Greenstone extracts that metadata just fine, so users can first search
inside the collection, then download the PDF and search inside the image
PDF with the reader.

Is Adobe Acrobat Professional the only software that can merge the OCR
data with the PDF, or does anyone know of any other software that does this?

Thanks in advance,
Diego Casar

Diego Nicolás Casar González
Tel: (+54) 011 5252.0810
Movil: 15 4186.1334
Peña 2056 : Piso 7 B
Capital Federal : Argentina