The PDF "image and text" format rocks, and ScanSoft OmniPage does a great job on it too. I think Acrobat Pro is fine and can do batch work as well; OmniPage Pro is a bit better IMO.
> Date: Tue, 20 May 2008 14:05:03 -0300
> From: email@example.com
> To: firstname.lastname@example.org
> Subject: [greenstone-users] PDFs - OCR metadata - Adobe Acrobat Professional
> Dear Greenstone list,
> I've been trying several OCR programs (mostly Open Source or Linux
> oriented) over the past few months, as I need Greenstone to be able to
> search text in image PDFs.
> Recently I realized that Adobe Acrobat Professional has an interesting
> implementation of OCR, in that it merges the OCR data into the
> PDF itself, allowing the user to search (with most PDF readers,
> e.g. Acrobat Reader, KPDF, etc.) and highlighting the results. AFAIK, that
> functionality was available only for text PDFs.
> Greenstone extracts that metadata just fine, so that the users can search
> first inside the collection, download the PDF and then search inside the
> image PDF with the reader.
> Is Adobe Acrobat Professional the only software that can merge the OCR
> data with the PDF? Does anyone know of any other software?
> Thanks in advance,
> Diego Casar
> Diego Nicolás Casar González
> Tel: (+54) 011 5252.0810
> Movil: 15 4186.1334
> Peña 2056 : Piso 7 B
> Capital Federal : Argentina
> greenstone-users mailing list