Rich, it all depends on the kinds of documents you have. If you have
newspapers, invoices, profiles, etc., it is better to scan them as
black-and-white TIFF at 200 or 300 dpi with G3 or G4 compression (the
same compression schemes used in fax transmission). This configuration
gives good results when you OCR the images, and it is the best way to
get very high quality images with very small files. If you save
black-and-white images as JPEG you will get bigger files: JPEG works
better with color images.
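
To put rough numbers on the file-size point, here is a minimal sketch that computes the uncompressed size of one scanned page at different bit depths. The US Letter page size and the function name raw_scan_bytes are assumptions made for illustration. It does not model G4 or JPEG compression, only how much raw data each format starts from.

```python
import math

def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page (hypothetical helper)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded to a whole number of bytes.
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px

# Assumed page size: US Letter, 8.5 x 11 inches.
for dpi in (200, 300):
    bw = raw_scan_bytes(8.5, 11, dpi, 1)      # black and white, 1 bit per pixel
    gray = raw_scan_bytes(8.5, 11, dpi, 8)    # grayscale, 8 bits per pixel
    color = raw_scan_bytes(8.5, 11, dpi, 24)  # RGB color, 24 bits per pixel
    print(f"{dpi} dpi: b/w {bw:,} B, gray {gray:,} B, color {color:,} B")
```

Before any compression, a 1-bit page holds about one eighth of the data of an 8-bit grayscale page, and G3/G4 then compress that bilevel data further.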
[mailto:email@example.com] On behalf of
Sent: Monday, February 14, 2005 07:22 p.m.
Subject: RE: [greenstone-users] Basic questions about Greenstone
Is there a particular advantage to making the page
images TIFFs as opposed to JPEGs (which are
handled by all browsers on all platforms)?
>Greenstone will handle this number of files without problems. We work
>with a lot of images, all in TIFF format (G4 compression). We OCR all
>pages (without human re-keying) and import them into Greenstone using
>PageImgPlug. You don't need to convert to GIF or PNG format. Each user
>only needs to install a TIFF plugin so that the browser can show the
>images. Try www.alternatiff.com to get a free plugin. Don't worry
>about the volume of images. What really matters is the volume of text.
>You can have 1 TB of images but perhaps only a few GB of text. What
>Greenstone will index are the text files, and I think that the limit
>is many GB.
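
To check how the image volume compares with the text volume in a collection, a minimal sketch like the following totals file sizes by extension. The function name bytes_by_extension and the assumption that the OCR text sits next to the page images as .txt files are illustrative. It does not measure any actual Greenstone limit.

```python
import os
from collections import defaultdict

def bytes_by_extension(root):
    """Total size in bytes of the files under root, keyed by lowercase extension."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            totals[ext] += os.path.getsize(os.path.join(dirpath, name))
    return dict(totals)
```

Calling bytes_by_extension on the import directory and comparing the ".txt" total with the ".tif" total shows how much of the collection is text to be indexed.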