Hello Xiao,
Thank you for your help. Using the format statement you sent solved the
problem, and everything works fine. And yes, we are using the mgpp indexer.
Once again, thank you very much.
Best regards,
Marcin Malik, University of Podlasie Main Library, Poland
----- Original Message -----
Sent: Friday, June 01, 2007 11:48 AM
Subject: Re: [greenstone-users] PDF full text file and additional HTML abstract(metadata) file
On 5/31/07, Marcin Malik <maliczek1@tlen.pl> wrote:
I'm preparing a Greenstone database of PhD theses from our university. The
full text of each thesis is in PDF format, and I use UnknownPlug because I
don't want to convert the files to HTML. Apart from the full-text PDF file, I
would also like to display an additional HTML file containing just the
bibliographic data, abstract, and author's keywords, all taken from metadata.
I can create that file, and it is shown beside the full-text PDF file. The
only problem is that the HTML file which displays the abstract and keywords
from metadata also contains the text "This document has no text.", which I
don't want to see and don't know how to hide. I know this text shows up
because I don't use the PDF plugin, but I can't hide it because the command
{If}{[Text] ne 'This document has no text. ',[Text]} in the DocumentText
format statement doesn't work at all.
And this is an example of those HTML (metadata) files:
------------------------------------------------------------------------------------------------------------
This document has no text.
TITLE: Funkcje osoki aloesowatej (Stratiotes aloides L.) w ekosystemie starorzecza Bugu : rozprawa doktorska / Małgorzata Strzałek ; Akademia Podlaska. Wydział Rolniczy. - 2006. - Promotor: dr hab. Lech Kufel
KEYWORDS: Bug (rzeka); ekosystem
Best Regards
Marcin Malik, University of Podlasie, Poland
_______________________________________________
greenstone-users mailing list
greenstone-users@list.scms.waikato.ac.nz
https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
Hello Marcin,
The reason the statement {If}{[Text] ne 'This document has no text. ',[Text]}
doesn't work is that the content of [Text] contains more than the words
inside the quotes. If you view the source of the web page that displays 'This
document has no text. ', you'll see tags around the sentence:
<Doc> ... </Doc>. You built your collection using the mgpp indexer
(right?). The mgpp indexer builds indexes based on <Doc> blocks, while mg
does not. The statement in question works with mg, but not with mgpp. One way
to get around the problem is to condition the statement on a real piece of
metadata. In your situation, since you used the unknown plugin to process
the PDF files, the following can be used:
{If}{[Plugin] ne 'UnknownPlug',[Text]}
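As a rough sketch of where this condition might sit, assuming the collection's collect.cfg has a DocumentText format statement with a title heading above the text (the <h3>[Title]</h3> markup is an assumption for illustration, not taken from the collection in question):

```
format DocumentText "<h3>[Title]</h3>\n\n{If}{[Plugin] ne 'UnknownPlug',[Text]}"
```

With a statement along these lines, documents imported by UnknownPlug would show only the heading and any metadata added to the statement, while documents from other plugins would still show their [Text].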
Best,
Xiao
-- Greenstone Digital Library, New Zealand