[greenstone-users] Subject: PagedImagePlugin: mapping a scanned image file with two text files using XML based ITEM file

From Yohannes Mulugeta
Date Wed Dec 8 02:36:47 2010
Subject [greenstone-users] Subject: PagedImagePlugin: mapping a scanned image file with two text files using XML based ITEM file
In-Reply-To (4CFC1039-6050705-cs-waikato-ac-nz)
Dear Katherine,

Following your suggestion, I have implemented the 3rd option: two item files
(amharic.ITEM and tigrigna.ITEM) that share a single set of image files. I
have tried to modify the 'advanced scanned image' example as follows.

1) These are the collection-specific macros to display the image (in full
size and preview) and the translated texts in Amharic and Tigrigna:
_textfullsize_ {FULLSIZE}
_textpreview_ {PREVIEW}
_textfulltextAmh_ {TEXT-Amharic}
_textfulltextTmh_ {TEXT-Tigrigna}

_textviewfullsize_ {View the fullsize image of this page}
_textviewpreview_ {View the preview image of this page}
_textviewfulltextAmh_ {View the text of this page in Amharic}
_textviewfulltextTig_ {View the text of this page in Tigrigna}

_viewfullsize_ {<div class="button"><span class="button"><a
_viewpreview_ {<div class="button"><span class="button"><a
_viewtextAmh_ {<div class="button"><span class="button"><a
_viewtextTig_ {<div class="button"><span class="button"><a
href="/gsdlmod?e=d-00000-00---off-0gsarch--00-0----0-10-0---0---0direct-10---4-----dfr--0-1l--11-en-50---20-about-Xiao+Hu--00-0-1-00-0--4----0-0-11-10-0utfZz-8-00&a=d&d=[TigrignaTranslation] t"

2) In the Format Features, I used the following for the DocumentHeading:

{If}{[TigrignaTranslation] ne '1',_document:viewtextAmh_}
{If}{[AmharicTranslation] ne '1',_document:viewtextTig_}

3) In the DocumentText of the Format feature:
{If}{about eq 'fullsize',[srcicon],
{If}{about eq 'preview',[screenicon],
{If}{[TigrignaTranslation] ne '1', [AmharicTranslation],
{If}{[AmharicTranslation] ne '1', [TigrignaTranslation],[screenicon]}}}}

I have already set the OIDtype option to 'assigned' and the OIDmetadata
option to dc.identifier in the PagedImagePlugin.
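
A minimal sketch of how the two item files described above might be generated, for readers who want to reproduce the setup. It is an illustration only, not the actual files from this collection. The element and attribute names (PagedDocument, Metadata, Page, pagenum, imgfile, txtfile) are assumed to match the XML item format used in the advanced scanned image example; the title, the .jpg extension, the relative paths, and the helper build_item are hypothetical. The identifiers 'doc1-amh' and 'doc1-tig' are the example values from Katherine's suggestion.

```python
import xml.etree.ElementTree as ET


def build_item(identifier, link_name, link_target, text_dir, pages, title):
    """Return one XML item document as a string.

    identifier  -- value for dc.Identifier (used as the OID with
                   -OIDtype assigned -OIDmetadata dc.Identifier)
    link_name   -- metadata name that points at the other language version
    link_target -- dc.Identifier of the other language version
    """
    # Assumed root element of an XML item file.
    root = ET.Element("PagedDocument")
    for name, value in (
        ("Title", title),
        ("dc.Identifier", identifier),
        (link_name, link_target),
    ):
        meta = ET.SubElement(root, "Metadata", name=name)
        meta.text = value
    # Both item files list the same images; only the text folder differs.
    for num, stem in enumerate(pages, start=1):
        ET.SubElement(
            root,
            "Page",
            pagenum=str(num),
            imgfile=f"images/{stem}.jpg",      # hypothetical extension
            txtfile=f"{text_dir}/{stem}.txt",  # hypothetical relative path
        )
    return ET.tostring(root, encoding="unicode")


pages = ["Page1", "Page2"]
amharic = build_item("doc1-amh", "TigrignaTranslation", "doc1-tig",
                     "text/Amharic", pages, "Manuscript 1")
tigrigna = build_item("doc1-tig", "AmharicTranslation", "doc1-amh",
                      "text/Tigrigna", pages, "Manuscript 1")
print(amharic)
print(tigrigna)
```

In files built this way, [TigrignaTranslation] and [AmharicTranslation] hold only the identifier of the other version (for example 'doc1-tig'), which is the value the d= argument of the link in Katherine's format statement expects.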

But I have run into the following problems:
1) [AmharicTranslation] and [TigrignaTranslation] do not point to the content
of the translated text files; they point to nothing. How can I display
the content of the translated texts?
2) The title is repeated because of the two item files. How can I avoid title
repetition?

Any help please?
Yohannes Mulugeta
Addis Ababa University

On Sun, Dec 5, 2010 at 2:20 PM, Katherine Don <kjdon@cs.waikato.ac.nz> wrote:

> Hi
> Currently there is no way to include 2 text files per image. I have added
> this as a ticket on our bug/request tracking system:
> http://trac.greenstone.org/ticket/722
> In the meantime, here are a few options.
> 1. If you know Perl, you can modify the plugin to process two text files per
> page.
> 2. Merge the two translations into a single text file.
> 3. Create 2 item files per manuscript. They will contain the same image
> series, but one will be for Tigrigna texts and the other for Amharic texts.
> If you add a unique identifier metadata, then you can manually link between
> the two versions.
> e.g. in the Tigrigna item file, set metadata 'dc.Identifier' to be
> 'doc1-tig', and 'AmharicTranslation' metadata to be 'doc1-amh'
> and then in the amharic version, metadata 'dc.Identifier' to be
> 'doc1-amh', and 'TigrignaTranslation' metadata to be 'doc1-tig'
> When you build the collection, you need to set the options -OIDtype
> assigned -OIDmetadata dc.Identifier to PagedImagePlugin.
> Then in a format statement if you want to link to the other version, you
> can do something like
> {If}{[AmharicTranslation],<a
> href="/gsdlmod?e=d-00000-00---off-0gsarch--00-0----0-10-0---0---0direct-10---4-----dfr--0-1l--11-en-50---20-about-Xiao+Hu--00-0-1-00-0--4----0-0-11-10-0utfZz-8-00&a=d&d=[AmharicTranslation]">Amharic version</a>}
> {If}{[TigrignaTranslation],<a
> href="/gsdlmod?e=d-00000-00---off-0gsarch--00-0----0-10-0---0---0direct-10---4-----dfr--0-1l--11-en-50---20-about-Xiao+Hu--00-0-1-00-0--4----0-0-11-10-0utfZz-8-00&a=d&d=[TigrignaTranslation]">Tigrigna version</a>}
> I hope this helps
> Regards,
> Katherine
> Yohannes Mulugeta wrote:
> Hi,
> I'm building a digital library collection of scanned pages of ancient
> Ethiopian parchment manuscripts. The manuscripts are in an ancient Ethiopian
> language called Geez and are being translated into two local languages,
> Tigrigna and Amharic. I'm building the digital library as a PagedImage
> collection with the translated texts, so that one can see the scanned pages
> with the corresponding translated texts and can search the texts. I
> am using the PagedImagePlugin to process the files. I have the images
> in one folder, the translated texts in another, and the ITEM file in XML, as
> follows:
> C:\images\Page1
> C:\images\Page2
> ...
> C:\text\Amharic\Page1.txt
> C:\text\Amharic\Page2.txt
> ...
> C:\text\Tigrigna\Page1.txt
> C:\text\Tigrigna\Page2.txt
> ...
> C:\itemfile.ITEM
> I want to map an image (a scanned page), say Page1, to the two text files
> of the translated page, say text\Tigrigna\Page1.txt and text\Amharic\Page1.txt,
> so that the user can see the scanned page and its content in two languages.
> So my question is: how can I map the image file to two text files in the
> XML-based ITEM file? By the way, I'm following the example available online
> (Advanced scanned image collection). Any help please?
> Thanks,
> Yohannes Mulugeta
> Computer & Information Retrieval Center
> Addis Ababa University Library System
> P.O.Box 1176
> Addis Ababa, Ethiopia
> cell phone +251-(0)911 395 572
> e-mail: yohannesmulu@gmai.com
> yohannes@lib.aau.edu.et
> ------------------------------
> _______________________________________________
> greenstone-users mailing list
> greenstone-users@list.scms.waikato.ac.nz
> https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
