Re: [greenstone-users] PDF full text file and additional HTML abstract(metadata) file

From: xiao
Date: Fri, 1 Jun 2007 21:48:29 +1200
Subject: Re: [greenstone-users] PDF full text file and additional HTML abstract(metadata) file
In-Reply-To: (004a01c7a346$0d9204e0$4732a8c0-BIB0107)

On 5/31/07, Marcin Malik wrote:


I'm preparing a Greenstone database of PhD theses from our university. The full text of each thesis is in PDF format, and I use UnknownPlug because I don't want to convert the files to HTML. Apart from the full-text PDF file, I would also like to display an additional HTML file containing just the bibliographic data, abstract and author's keywords, all taken from metadata. I can create that file, and it is shown beside the full-text PDF file. The only problem is that the HTML file that displays the abstract and keywords from metadata also contains the text "This document has no text", which I don't want to see and don't know how to hide. I know this text shows up because I don't use the PDF plugin, but I can't hide it because the command {If}{[Text] ne 'This document has no text. ',[Text]} in the DocumentText format statement doesn't work at all.


And this is an example of one of those HTML (metadata) files:



This document has no text.

TITLE: Funkcje osoki aloesowatej (Stratiotes aloides L.) w ekosystemie starorzecza Bugu : rozprawa doktorska / Małgorzata Strzałek ; Akademia Podlaska. Wydział Rolniczy. - 2006. - Promotor: dr hab. Lech Kufel

KEYWORDS: Bug (rzeka); ekosystem



Best Regards


Marcin Malik, University of Podlasie, Poland


Hello Marcin,

The statement {If}{[Text] ne 'This document has no text. ',[Text]} doesn't work because the content of [Text] is not just the words inside the quotes. If you view the source of the web page that displays 'This document has no text. ', you'll see that the sentence is wrapped in tags: <Doc> ... </Doc>. You built your collection using the mgpp indexer (right?). The mgpp indexer builds its indexes from <Doc> blocks, while mg does not, so the statement in question works with mg but not with mgpp. One way around the problem is to make the condition test a real piece of metadata instead. In your situation, since you used the unknown plugin (UnknownPlug) to process the PDF files, the following can be used:
{If}{[Plugin] ne 'UnknownPlug',[Text]}
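As a rough sketch of where this statement goes, assuming a Greenstone 2 style collection whose format statements live in the collection's collect.cfg file (the file and the exact surrounding layout are assumptions on my part, not something confirmed in this thread), the DocumentText line might look like this:

```
# collect.cfg (assumed location of the collection's format statements)
# Show [Text] only for documents that were NOT processed by UnknownPlug,
# so the "This document has no text." placeholder is never printed for the PDFs.
format DocumentText "{If}{[Plugin] ne 'UnknownPlug',[Text]}"
```

Whatever markup you already use to print the bibliographic data, abstract and keywords from metadata would stay inside the same quoted string, before or after the {If} clause.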


Greenstone Digital Library
New Zealand