Re: Greenstone

From Gordon Paynter
Date Thu, 30 Jan 2003 13:51:47 +1300 (NZDT)
Subject Re: Greenstone
CCed to the Greenstone mailing list.

"Dollberg, Donald D." <ddd1@cdc.gov> wrote:

> [...] I want to create a collection of
> journal articles written in my group and use Greenstone simply to create
> a bibliography that is linked to the actual article in PDF format. I
> have the bibliography in bibtex format and the articles will be scanned
> and converted to PDF. I am not interested in indexing the actual
> article just in the display of the article.

Hi Donald,

What you're describing is feasible, and can be accomplished by creating a
collection based on your BibTex file and hosting the PDF files separately.
You will need to:

1. Include some metadata in the BibTex file that can be used to construct
the URL for a specific PDF document. The easiest way is if you have a URL
field in your BibTex, but there are other ways (for example, if your PDF
files were hosted at http://example.org/pdf/[docnumber].pdf, then you
would need to store [docnumber] metadata in each BibTex record).

2. Provide your collection with specialised format strings that cause the
PDF file to be displayed instead of the BibTex record data.
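
To make step 1 concrete, here is a rough Python sketch of how a PDF URL
could be derived from a BibTex record. It is not Greenstone code and only
illustrates the idea. The "docnumber" field name, the sample record, and
the helper functions are all made up for this example, and the base URL
is just the example.org pattern from above.

```python
import re

# Hypothetical URL pattern, taken from the example.org illustration above.
PDF_URL_TEMPLATE = "http://example.org/pdf/{docnumber}.pdf"


def extract_field(record, name):
    """Return the value of a `name = {value}` or `name = "value"` field.

    Deliberately simple: it does not handle nested braces. Returns None
    when the field is absent.
    """
    pattern = re.compile(
        r"\b" + re.escape(name) + r'\s*=\s*(?:\{([^{}]*)\}|"([^"]*)")',
        re.IGNORECASE,
    )
    match = pattern.search(record)
    if match is None:
        return None
    value = match.group(1) if match.group(1) is not None else match.group(2)
    return value.strip()


def pdf_url(record):
    """Prefer an explicit url field; otherwise build one from docnumber."""
    url = extract_field(record, "url")
    if url:
        return url
    docnumber = extract_field(record, "docnumber")
    if docnumber:
        return PDF_URL_TEMPLATE.format(docnumber=docnumber)
    return None


# Made-up record, for illustration only.
record = """@article{example2003,
  author = {A. Author},
  title = {An Example Article},
  docnumber = {0042},
}"""

print(pdf_url(record))  # prints http://example.org/pdf/0042.pdf
```

Inside Greenstone itself, the same substitution would be done by the
format strings from step 2 rather than by a separate script.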


> I assume that keywords
> entered into the bibtex file can be read by the bibtex plugin to create
> a search index. I have looked at some of the config files so I have a
> starting point but the overall setup is still not clear to me.

The BibTex plugin code is in $GSDLHOME/perllib/plugins/BibTexPlug.pm.
On about line 90 there's a list of all the fields that BibTexPlug reads.
It starts "my %field =" and contains a set of BibTex fields and the
Greenstone fields they map to. The "keywords" BibTex field is included as
Keywords metadata. (Note: I don't remember doing this when I wrote the
original plugin; Stef or John may have added it since?)

From glancing at the code, it seems that only these fields are added as
metadata, plus the whole record is added as the document's Text. If you
have specialised fields which are not on this list, you should be able to
simply extend the list in the same format.
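
Conceptually, the field table works like the Python sketch below. This
is not the plugin's actual Perl code, just an illustration of "only
mapped fields become metadata" and of extending the table. Only the
keywords mapping is taken from the description above; the "docnumber"
entry and its "DocNumber" metadata name are hypothetical.

```python
# Illustrative subset only; the real table lives in BibTexPlug.pm (Perl).
FIELD_MAP = {
    "keywords": "Keywords",  # mapping described above
}


def extend_field_map(field_map, extra):
    """Return a copy of the table with additional field mappings."""
    merged = dict(field_map)
    merged.update(extra)
    return merged


def record_to_metadata(fields, field_map):
    """Keep only fields present in the table, renamed to metadata names."""
    return {
        field_map[name.lower()]: value
        for name, value in fields.items()
        if name.lower() in field_map
    }


# Made-up record fields, for illustration only.
fields = {"keywords": "example keyword", "docnumber": "0042"}

# With the original table, the unmapped docnumber field is dropped.
before = record_to_metadata(fields, FIELD_MAP)

# Hypothetical extension: map docnumber to a DocNumber metadata name.
extended = extend_field_map(FIELD_MAP, {"docnumber": "DocNumber"})
after = record_to_metadata(fields, extended)

print(before)  # {'Keywords': 'example keyword'}
print(after)   # {'Keywords': 'example keyword', 'DocNumber': '0042'}
```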


> I assume that the bibtex file would go into the import directory to be
> read by import.pl. Where would the pdf's go for the display part and
> how are the two linked?

You're correct that the bibtex file would go in the import directory. You
will need to put the BibTex plugin in your collect.cfg file. As mentioned
above, the PDFs are not directly linked into the collection; they are
hosted separately and reached through the URL built from the metadata.


There are likely other ways to do this, but they'll probably involve
writing plugin code.

Gordon