Re: [greenstone-users] Greenstone on Debian and DJVU

From Michael Dewsnip
Date Fri, 13 Feb 2004 10:08:04 +1300
Subject Re: [greenstone-users] Greenstone on Debian and DJVU
In-Reply-To (20040212190108-GA3094-wesson-cs-waikato-ac-nz)

Hi Federica,

> > An other question: it is possible to use the format DjVU for a Greenstone
> > images-collection (on the handbook I have not found it)?
> We do not have a greenstone plugin for handling DjVU that I am aware of,
> although I think someone else has raised this in the last few weeks and
> may have done some work on it.
> Someone else might be able to answer this question with more information.

A couple of months ago we looked into DjVu for someone who wanted to use
Greenstone as part of a Bookmobile project. They were interested in making the
books searchable with Greenstone. We found that DjVu books have a "hidden text
layer" that contains the text of the book, in addition to the images of its
pages. This hidden text layer is the key to making the books searchable with
Greenstone: you would index it for the full-text search, and display the
original DjVu files to the user (who would need a DjVu viewer installed). The
LizardTech software includes the ability to export the hidden text layer of
DjVu documents as an XML file (we don't have the software, so we couldn't try
this). You would need to do this for each of your DjVu documents, then write a
Greenstone plugin to convert the XML files to Greenstone's Archive format. I
don't think this would be a big job if you know Perl. We offered to help with
this when we looked into it previously, but the person never got back to us
about it.
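
To give a rough idea of the "read the exported XML" step, here is a small
Python sketch (a real Greenstone plugin would be written in Perl). We have not
seen an actual LizardTech export, so the layout below is an assumption: it
supposes one OBJECT element per page, with the hidden text stored as WORD
elements inside LINE elements. The sample data and the function name
page_texts are made up for illustration, and a real export may need
adjusting for different element names or a DOCTYPE header.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of an exported hidden text layer.
# ASSUMPTION: one OBJECT per page, words in WORD elements inside LINE elements.
SAMPLE_XML = """<DjVuXML>
  <BODY>
    <OBJECT data="book.djvu">
      <HIDDENTEXT>
        <LINE><WORD coords="10,20,50,5">Hello</WORD> <WORD coords="60,20,110,5">world</WORD></LINE>
        <LINE><WORD coords="10,40,70,25">second</WORD> <WORD coords="80,40,120,25">line</WORD></LINE>
      </HIDDENTEXT>
    </OBJECT>
    <OBJECT data="book.djvu">
      <HIDDENTEXT>
        <LINE><WORD coords="10,20,50,5">Page</WORD> <WORD coords="60,20,90,5">two</WORD></LINE>
      </HIDDENTEXT>
    </OBJECT>
  </BODY>
</DjVuXML>"""


def page_texts(xml_string):
    """Return one plain-text string per page (per OBJECT element)."""
    root = ET.fromstring(xml_string)
    pages = []
    for obj in root.iter("OBJECT"):
        lines = []
        for line in obj.iter("LINE"):
            # Join the words of each line with single spaces, skipping empty words.
            words = [w.text.strip() for w in line.iter("WORD")
                     if w.text and w.text.strip()]
            if words:
                lines.append(" ".join(words))
        pages.append("\n".join(lines))
    return pages


if __name__ == "__main__":
    for number, text in enumerate(page_texts(SAMPLE_XML), start=1):
        print(f"--- page {number} ---")
        print(text)
```

The per-page text is what the plugin would then store as the searchable
content of each document (or section) in the archives.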

If you just want to build an image collection from your DjVu books, then the
problem is a bit different. You'll need some software to write each page of
the book out as an image (presumably the LizardTech software does this, and
maybe the open source DjVuLibre tools as well). Then you can use the new
PagedImgPlug to build these images into a collection with Greenstone.
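
As a sketch of the page-export step, assuming the DjVuLibre command-line
tools are installed: its ddjvu program can render a single page to an image
file. The Python helper below only builds and runs those commands; the
function names, the output file naming scheme and the choice of TIFF are our
own assumptions, and we have not tested this against real books.

```python
import subprocess
from pathlib import Path


def page_export_command(djvu_path, page, out_dir):
    """Build the ddjvu command that renders one page of a DjVu file to TIFF."""
    # ASSUMPTION: output files are named <book>_<4-digit page>.tif (our choice).
    out_file = Path(out_dir) / f"{Path(djvu_path).stem}_{page:04d}.tif"
    return ["ddjvu", "-format=tiff", f"-page={page}",
            str(djvu_path), str(out_file)]


def export_pages(djvu_path, page_count, out_dir):
    """Render pages 1..page_count of a DjVu file into out_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for page in range(1, page_count + 1):
        # check=True raises CalledProcessError if ddjvu reports a failure.
        subprocess.run(page_export_command(djvu_path, page, out_dir),
                       check=True)


if __name__ == "__main__":
    # Only print the commands here; call export_pages() to actually run them.
    for page in range(1, 4):
        print(" ".join(page_export_command("book.djvu", page, "pages")))
```

The resulting images can then go into the collection's import directory for
PagedImgPlug to pick up.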

Hope this helps,