[greenstone-users] RE: [greenstone-devel] TIFF plugin?

From Diego Spano
Date Mon, 31 Jan 2005 11:21:36 -0300
Subject [greenstone-users] RE: [greenstone-devel] TIFF plugin?
In-Reply-To (Pine-SGI-4-10-10501221006570-933715-100000-alexia-lis-uiuc-edu)
Karen, Tim:

If you have TIFF images and text from OCR, you need to use PagedImgPlug.
It works well, and we use it a lot: almost 500,000 scanned pages are
managed in our Greenstone collections.


It is simple: put the TIFF files and text files in a folder, and inside
it create one or more .item files, e.g. doc.item.
Each .item file contains metadata for the document and a list of the
image and text files that make up the document.
PagedImgPlug then processes these .item files, linking the images into a
single document.
There are options to the plugin for creating thumbnails or small
preview-size images from the main page images.

There are some brief instructions in the header of the plugin file.

Currently PagedImgPlug assumes that the main pages are images - if you
have text files instead you may need to modify the plugin. But it would
be a good place to start.

Note: PagedImgPlug is available in the Greenstone 2.50 release.

The format of the .item file is: first all metadata fields, and then one
line for each image with this format:
Page_number:image_file_name:text_file_name:

<Title>Document 1
<Subject>Test
1:F6N869510.TIF:F6N869510.txt:
2:F6N869511.TIF:F6N869511.txt:
3:F6N869512.TIF:F6N869512.txt:
4:F6N869513.TIF:F6N869513.txt:
5:F6N869514.TIF:F6N869514.txt:
6:F6N869515.TIF:F6N869515.txt:
7:F6N869516.TIF:F6N869516.txt:
8:F6N869517.TIF:F6N869517.txt:
9:F6N869518.TIF:F6N869518.txt:
10:F6N869519.TIF:F6N869519.txt:
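
With many scanned pages, writing the .item file by hand gets tedious. The
following is a minimal Python sketch that generates it, assuming each
image has an OCR text file with the same base name and that sorting the
file names gives the page order. The function name build_item_text and
its parameters are my own, not part of Greenstone.

```python
from pathlib import Path


def build_item_text(folder, title, subject, image_ext=".TIF", text_ext=".txt"):
    """Return the contents of an .item file for the page images in `folder`.

    Assumption: every image has an OCR text file with the same base name,
    and sorting the file names yields the correct page order.
    """
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir() if p.suffix.lower() == image_ext.lower()
    )

    # Metadata fields come first, one per line.
    lines = [f"<Title>{title}", f"<Subject>{subject}"]

    # Then one line per page: Page_number:image_file_name:text_file_name:
    for page_number, image in enumerate(images, start=1):
        text_file = image.with_suffix(text_ext)
        if not text_file.exists():
            raise ValueError(f"no OCR text file found for {image.name}")
        lines.append(f"{page_number}:{image.name}:{text_file.name}:")

    return "\n".join(lines) + "\n"
```

To use it, write the returned string to a file such as doc.item in the
same folder as the images. The sketch stops with an error when a text
file is missing, because the format above lists both files for every
page.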


Since you are working with TIFF files, you need a viewer that lets the
browser display the images. Take a look at www.alternatiff.com (a TIFF
image viewer for Windows web browsers). Then, in collect.cfg, add this
format statement to view the images:

format DocumentText
"<center><b>[parent:Title]</b></center><br><br><br><table border=0
align=center WIDTH=750><tr><td align=center><embed width=550 height=950
src=_httpcollection_/index/assoc/[parent:assocfilepath]/[Image]
type=image/tiff toolbar=top></td></tr></table>
[Text]"
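
To make the src attribute easier to read: Greenstone fills in the
placeholders itself when it serves a page. The Python sketch below only
illustrates the resulting tag, assuming _httpcollection_ expands to the
collection's base URL. The function name embed_tag and the example values
are hypothetical.

```python
def embed_tag(collection_url, assoc_path, image_name, width=550, height=950):
    """Build the <embed> tag that the format statement yields for one page.

    collection_url stands in for _httpcollection_, assoc_path for
    [parent:assocfilepath] and image_name for [Image].
    """
    src = f"{collection_url}/index/assoc/{assoc_path}/{image_name}"
    return (
        f"<embed width={width} height={height} src={src} "
        f"type=image/tiff toolbar=top>"
    )


# Hypothetical values, for illustration only.
print(embed_tag("/gsdl/collect/archive", "HASH0123.dir", "F6N869510.TIF"))
```

If the viewer shows nothing, comparing the src the browser actually
received against this pattern is a quick way to spot a wrong path.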

And that's all!

Hope this helps.

Diego Spano
Archivo Digital
Secretaria de DD. HH.
Ministerio de Justicia y DD. HH.
Tel: 4382-6404
djspano@jus.gov.ar


-----Original Message-----
From: greenstone-devel-bounces@list.scms.waikato.ac.nz
[mailto:greenstone-devel-bounces@list.scms.waikato.ac.nz] On behalf of
Karen E. Medina
Sent: Saturday, 22 January 2005 01:08 p.m.
To: Tim Finney
CC: greenstone-devel@list.scms.waikato.ac.nz
Subject: Re: [greenstone-devel] TIFF plugin?


excellent. this is exactly what I'll be doing with a project coming up.

-karen medina

On Fri, 21 Jan 2005, Tim Finney wrote:

> Dear All
>
> At work we have a scanner that creates TIFF files of documents and
> somehow puts an OCRed text inside the TIFF file. It would be super
> cool to have a plugin that allows files of this sort to be imported.
>
> It would be ideal to be able to build a collection of such documents
> indexed according to the OCRed texts but with a link to the
> corresponding image. This would make it easy to make electronic
> archives.
>
> (There must be a lot of places that want to scan all of their
> archives and incoming correspondence and then build a Greenstone
> collection to be able to search them.)
>
> Best
>
> Tim Finney
>
>
> _______________________________________________
> greenstone-devel mailing list
> greenstone-devel@list.scms.waikato.ac.nz
> https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-devel
>

-Karen Medina
--------------------------->>>>>***<<<<<---------------------------
"I often store draft copies of my documents in trifles for security
purposes" -stolen from a webpage

------------------>>>***<<<------------------

