RE: [greenstone-users] Tool for writing item files - PagedImagecollection - pagedImage plugin -

From: Diego Spano
Date: Thu, 4 Jan 2007 13:58:02 -0300
Subject: RE: [greenstone-users] Tool for writing item files - PagedImagecollection - pagedImage plugin -
In-Reply-To: (BAY120-F214F5A94229CF276B54C34C9B80-phx-gbl)
Anson, you are right: PDF is a very good solution for small documents.
But what about the file size of a PDF containing 500 images? Suppose you
publish your collection on the Internet: how long will you spend downloading
the file when you want to view the document? The web server must send you the
entire PDF file. If you use Pagedimgfile, the web server only sends you the
pages you request, one by one...

-----Original Message-----
On behalf of Anson
Sent: Thursday, 4 January 2007 12:42 p.m.
Subject: RE: [greenstone-users] Tool for writing item files -
PagedImagecollection - pagedImage plugin -

Hi Tomas,
if you have access to OCR software, you can save files as PDF "Image and
Text". This solution works great: it even keeps the location of the text
underneath the OCR'd image as a layer, allowing you to "cut and paste" on
the image document and get the text for your notes... great if you're
starting out fresh.

If you've already spent hours on OCR you should see what Don Diego can do
for you :) ap


From: "Diego Spano"
To: 'Tomáš Fiala'
Subject: RE: [greenstone-users] Tool for writing item files - Paged
Imagecollection - pagedImage plugin -
Date: Thu, 4 Jan 2007 10:29:10 -0300

Hi Tomas, I developed a little program that creates .item files with the
metadata you want to assign. With this program we have processed thousands of
images. How does the program work? Suppose you have this folder structure:

\import\doc 1\image1.tif
\import\doc 1\image1.txt
\import\doc 1\image2.tif
\import\doc 1\image2.txt
\import\doc 1\image3.tif
\import\doc 1\image3.txt
\import\doc 2\image1.tif
\import\doc 2\image1.txt
\import\doc 2\image2.tif
\import\doc 2\image2.txt
\import\doc 3\image1.tif
\import\doc 3\image1.txt
\import\doc 3\image2.tif
\import\doc 3\image2.txt
\import\doc 3\image3.tif
\import\doc 3\image3.txt

You have folders, and inside them you have TIFF and txt files. Each image and
its text file share the same name, and it does not matter how many files each
folder contains. In the example I named the files imagex.txt and imagex.tif,
but you can use any names you want; you only have to take into account that
the image and text files MUST have the same name.

The program will create a .item file inside each of the doc 1, doc 2 and
doc 3 folders. It is an .exe file, so you have to run it on Windows. If your
scenario is like mine, then I will send you the executable file. Let me know!!!


Diego Spano
Archivo Digital
Secretaria de DD. HH.
Ministerio de Justicia y DD. HH.
Tel.: 5167-6550

-----Original Message-----
On behalf of Tomáš
Sent: Thursday, 4 January 2007 07:32 a.m.
Subject: [greenstone-users] Tool for writing item files - Paged Image
collection - pagedImage plugin -


I am creating a paged image collection. I have a book with 500 pages
(images + OCR) and it is very difficult to write the .item files manually.

Please, does anyone know a tool which generates the .item files
automatically?

What are the other ways of putting OCR and image files together?

Many thanks for your help!


Tomas Fiala

