Re: HTMLPlug,page turning,page images, and alternate versions of the document.

From Tod Olson
Date Fri, 29 Nov 2002 09:20:35 -0600
Subject Re: HTMLPlug,page turning,page images, and alternate versions of the document.
In-Reply-To (OF3B0E8F05-22373183-ON69256C80-0007B76D-69256C80-0007B77F-ntu-edu-au)
>>>>> "S" == Stephen DeGabrielle <Stephen.DeGabrielle@ntu.edu.au> writes:

S> I am trying to make a collection of digitised books (scanned and
S> ocr'd to create text).

S> I am hoping to use the HTMLPlug to do this.

S> My first problem is that I want my documents to be a sequence of
S> pages, but my documents always seem to be hierarchical in structure.
S> From reading the manual I believe that hierarchical documents cause
S> Greenstone to display a TOC while non-hierarchical documents cause
S> the 'Next/Previous' arrows to be available (see format
S> DocumentContents true/false, page 43 of the developer's manual).

DocumentContents seems to turn document navigation on or off
altogether. It also seems that if the Title metadata of every section
is a page number, you get the prev/next arrows rather than a TOC
display. You might try that out on a small sample collection.

As an aside, I see your document structure is basically:

<Section>...</Section>
<Section>...</Section>
<Section>...</Section>

I had some problems loading documents that way. If that turns out to
be a problem for you, you might try:

<Section>
<Section>...</Section>
<Section>...</Section>
<Section>...</Section>
</Section>
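
A minimal sketch of that wrapping step, in case it helps to script it
over many source files. It assumes the tags appear literally as
<Section> and </Section> in the text, as in the example above; the
function names are made up for illustration and are not part of
Greenstone.

```python
import re

# Matches opening and closing Section tags. The tag name comes from the
# example above; everything else in this sketch is an assumption.
SECTION_TAG = re.compile(r"<(/?)Section\b[^>]*>")


def count_top_level_sections(text):
    """Count Section elements that are not nested inside another Section."""
    depth = 0
    top_level = 0
    for match in SECTION_TAG.finditer(text):
        if match.group(1):  # closing tag
            depth -= 1
        else:  # opening tag
            if depth == 0:
                top_level += 1
            depth += 1
    return top_level


def wrap_in_outer_section(text):
    """Wrap the text in one outer Section if it has several top-level ones."""
    if count_top_level_sections(text) <= 1:
        return text  # already a single tree (or no sections at all)
    return "<Section>\n" + text.strip("\n") + "\n</Section>\n"


if __name__ == "__main__":
    flat = "<Section>page 1</Section>\n<Section>page 2</Section>\n"
    print(wrap_in_outer_section(flat))
```

Running the wrapper a second time leaves the text unchanged, so it
should be safe to apply to files that are already nested.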

S> Secondly, I am trying to get them to refer to 'alternate versions';
S> - the image (jpg) of the page
S> - a pdf of the whole document

S> I believe I have done this properly by including these filenames in
S> my page metadata and in the metadata for each page. But weirdly it
S> only works for one document, and then only for the pdf (and only
S> on the first page).
S> --
S> format DocumentText '<p><a href="/gsdl/collect/gsarch/index/assoc//[PageImage]">View page
S> image</a><br><a href="/gsdl/collect/gsarch/index/assoc//[AlternateVersion]">PDF Version </a>
S> <p>[Text]'
S> # where [PageImage] is some metadata element set as described above.

Look in your collection's archive directory. Check the archive
documents that are created, make sure that all of the sections are
present as you expect; if not, you may need to experiment with your
source documents.
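
A rough sketch of how that check could be scripted. It assumes the
archive document marks sections with <Section> tags and stores metadata
as <Metadata name="...">value</Metadata> elements; your archive files
may use a different syntax, in which case the pattern needs adjusting.
The function names are hypothetical, and the metadata names PageImage
and AlternateVersion are taken from your format statement.

```python
import re
import sys

# Assumed syntax: <Section> ... </Section> and <Metadata name="X">...</Metadata>.
TOKEN = re.compile(
    r"<(?P<close>/?)Section\b[^>]*>"
    r'|<Metadata\s+name="(?P<name>[^"]+)"[^>]*>'
)


def metadata_by_section(text):
    """Return one set of metadata names per Section, in document order."""
    sections = []  # metadata names seen directly inside each Section
    stack = []     # indexes into `sections` for the currently open Sections
    for match in TOKEN.finditer(text):
        if match.group("name") is not None:  # a Metadata element
            if stack:
                sections[stack[-1]].add(match.group("name"))
        elif match.group("close"):  # a closing Section tag
            if stack:
                stack.pop()
        else:  # an opening Section tag
            sections.append(set())
            stack.append(len(sections) - 1)
    return sections


def sections_missing(text, required):
    """List the 1-based numbers of Sections lacking any required name."""
    return [
        number
        for number, names in enumerate(metadata_by_section(text), start=1)
        if not set(required) <= names
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of one archive document on the command line.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        archive_text = handle.read()
    print("sections found:", len(metadata_by_section(archive_text)))
    print("sections missing metadata:",
          sections_missing(archive_text, ["PageImage", "AlternateVersion"]))
```

An outer wrapper section will normally show up in the "missing" list,
since the page metadata sits on the inner sections.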

I will need to deal with a similar issue. For my stuff, the user
should be able to click on "JPEG version" or "DjVu version" and
continue to get those images until they actively select the other
version. I'm thinking of two schemes: (1) add an argument in main.cfg
and use that as an image format selector, or (2) use two section
hierarchies and roll my own navigation. Either will involve a
substantial amount of custom macros.

S> PS I can't get 'format DocumentArrowsBottom true' to make arrows
S> show up either.

This seems to have an effect only when DocumentContents displays arrows.

S> PPS I can't use PDFPlug as the documents will be too big, and my
S> clients will have slow old computers and low-speed connections.
S> Supplying the link to the pdf is a desirable but optional extra.

In the archive document, there might be some SourceDocument metadata
that you could use to create that link.

Tod A. Olson <tod@uchicago.edu>
Programmer / Analyst
The University of Chicago Library

"How do you know I'm mad?" said Alice.
"If you weren't mad, you wouldn't have come here," said the Cat.