Re: [greenstone-users] Greenstone on Debian and DJVU

From Federica Zanardini
Date Mon, 05 Jul 2004 19:15:03 +0200
Subject Re: [greenstone-users] Greenstone on Debian and DJVU
In-Reply-To (402BEB34-393C5476-cs-waikato-ac-nz)
Dear all,
I'm sorry to bring up an old question again.

I am trying to build a digital library containing collections of images in DjVu format (without text), and I am following Michael's advice:

"If you just want to build an image collection from your DjVU books, then the
problem is a bit different. You'll need some software [...] to write each
of the pages of the book as images. Then you can use the new PagedImgPlug to
build these images into a collection with Greenstone."

but even when I load PagedImgPlug, the images are not recognized by Greenstone.
For example, these are the warnings I get:
"WARNING: No plugin could recognise 0000-0020/(19010908)36/0000-0020(19010908)036_008.djvu
WARNING: No plugin could recognise 0000-0020/(19010908)36/directory.djvu"
and the links to the images are broken.

If I put the images in an external directory, I get a message like this in the web interface:

 "External Link
The link you have selected is external to any of your currently selected collections. If you still wish to view this link and your browser has access to the Web, you can go forward to this page; otherwise use your browsers "back" button to return to the previous document."

How can I load the DjVu images into Greenstone or, alternatively, eliminate the "External Link" message?

(I am not interested in indexing the text, only in displaying the images.)

Thanks in advance,

Federica

At 23.08 12/02/2004, you wrote:
Hi Federica,

> > Another question: is it possible to use the DjVu format for a Greenstone
> > image collection (I have not found it in the handbook)?
>
> We do not have a greenstone plugin for handling DjVU that I am aware of,
> although I think someone else has raised this in the last few weeks and
> may have done some work on it.
> Someone else might be able to answer this question with more information.

A couple of months ago we looked into DjVU for someone who wanted to use
Greenstone as part of a Bookmobile project. They were interested in making the
books searchable with Greenstone. We found that DjVU books have a "hidden text
layer" that contains the text of the book, as well as the images for the pages
of the book. This hidden text layer is the key for making the books searchable
with Greenstone - you would use this for the full-text search, and display the
original DjVU files to the user (they would need the DjVU viewer installed). The
Lizardtech software includes the ability to export the hidden text layer of
DjVU documents as an XML file (we don't have the software so we couldn't try
this). You would need to do this for each of your DjVU documents, then write a
Greenstone plugin to convert these to Greenstone's Archive format. I don't think
this would be a big job, if you know Perl. We offered to help do this when we
looked into it previously but the person never got back to us about it.
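
As a rough illustration of the first half of that job (getting the text of each page out of the exported XML), here is a minimal Python sketch. It assumes the export follows the DjVuXML layout written by DjVuLibre's djvutoxml, with one OBJECT element per page and LINE and WORD elements under HIDDENTEXT; the Lizardtech export may be laid out differently. The function name page_texts is made up for this example, and the conversion to Greenstone's Archive format is not shown.

```python
import xml.etree.ElementTree as ET


def page_texts(xml_string):
    """Return one string per page, built from the hidden text layer.

    Assumption: each page is an OBJECT element and the words sit in
    LINE/WORD elements somewhere below it (DjVuXML-style layout).
    """
    root = ET.fromstring(xml_string)
    pages = []
    for obj in root.iter("OBJECT"):
        lines = []
        for line in obj.iter("LINE"):
            # Collect the non-empty words of this line in document order.
            words = [w.text.strip() for w in line.iter("WORD")
                     if w.text and w.text.strip()]
            if words:
                lines.append(" ".join(words))
        # A page without a text layer yields an empty string.
        pages.append("\n".join(lines))
    return pages
```

Each returned string could then be written out as the text of one page for indexing.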

If you just want to build an image collection from your DjVU books, then the
problem is a bit different. You'll need some software (presumably the Lizardtech
software does this, and maybe the open source DjVuLibre program) to write each
of the pages of the book as images. Then you can use the new PagedImgPlug to
build these images into a collection with Greenstone.
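
For the page-to-image step, a minimal Python sketch along these lines might do, assuming DjVuLibre's command-line tools ddjvu and djvused are installed and on the PATH. The function names and the output file naming scheme are invented for this example. TIFF is only one of the formats ddjvu can write, and the images may still need converting to a web-friendly format before building the collection.

```python
import subprocess
from pathlib import Path


def count_pages(djvu_path):
    """Ask DjVuLibre's djvused for the number of pages in the document."""
    result = subprocess.run(
        ["djvused", "-e", "n", str(djvu_path)],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


def build_ddjvu_commands(djvu_path, page_count, out_dir, fmt="tiff"):
    """Return one ddjvu command (argv list) per page, without running it."""
    djvu_path = Path(djvu_path)
    out_dir = Path(out_dir)
    ext = {"tiff": "tif", "pnm": "pnm"}[fmt]
    commands = []
    for page in range(1, page_count + 1):
        # Hypothetical naming scheme: <book>_<3-digit page number>.<ext>
        target = out_dir / f"{djvu_path.stem}_{page:03d}.{ext}"
        commands.append(
            ["ddjvu", f"-format={fmt}", f"-page={page}",
             str(djvu_path), str(target)]
        )
    return commands


def export_pages(djvu_path, out_dir):
    """Write every page of the DjVu file as a separate image in out_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    page_count = count_pages(djvu_path)
    for cmd in build_ddjvu_commands(djvu_path, page_count, out_dir):
        subprocess.run(cmd, check=True)
```

Calling export_pages("book.djvu", "pages") would then leave one image per page in the pages directory, ready to be imported as ordinary image files.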

Hope this helps,

Michael

--
Federica Zanardini

Divisione Coordinamento Biblioteche
Universita' degli Studi di Milano
Via G.Colombo,46 - 20133 Milano
Italy

Phone:+39-2-503-15218
Fax:+39-2-503-15278
mailto:Federica.zanardini@unimi.it
--------------------------------------------------------------