Hi Jason and Robert,
You are both trying to do similar things, so I'll answer you together.
I am not sure if anyone has done something like this before - judging from
the lack of response to Robert's email, I guess not.
Greenstone is not good at associating documents together, so you won't be
able to do what you want without a bit of extra work. There are several
solutions, depending on what you want to achieve.
1. Treat the bibtex entry, the pdf file and the annotation all as separate
documents. They will all be searchable, and you may end up with, e.g., a
bibtex entry and its pdf file both in the results list. If you add metadata to
each item (by editing the bibtex file and writing a metadata.xml file for the
pdf) you can then link from one to the other. E.g. add a bibtexlink as
metadata to the pdf, so when you display the pdf you can have a link to the
bibtex entry, with href='[bibtexlink]'. You will need to create this metadata
yourself, and write your own format statements to use it.
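As a rough illustration of option 1, here is a minimal Python sketch that
generates a metadata.xml giving each pdf a bibtexlink value. It assumes the
usual DirectoryMetadata / FileSet / FileName / Description / Metadata layout
of metadata.xml; the file name and URL are made-up placeholders, and the
helper name build_metadata_xml is my own. A real file would normally also
start with an XML declaration and DOCTYPE line, which the sketch leaves out.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(links):
    """Build metadata.xml text from a {pdf file name: bibtex link} mapping."""
    root = ET.Element("DirectoryMetadata")
    for pdf_name, bibtex_url in links.items():
        fileset = ET.SubElement(root, "FileSet")
        # FileName is treated as a regular expression, so escape the dot.
        ET.SubElement(fileset, "FileName").text = pdf_name.replace(".", r"\.")
        description = ET.SubElement(fileset, "Description")
        # "bibtexlink" is the metadata name used in href='[bibtexlink]'.
        meta = ET.SubElement(description, "Metadata", {"name": "bibtexlink"})
        meta.text = bibtex_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical file name and link target, for illustration only.
xml_text = build_metadata_xml(
    {"smith2003.pdf": "http://example.org/bib/smith2003.html"}
)
print(xml_text)
```

The output would be saved as metadata.xml in the same directory as the pdf.
What the link should actually point at depends on how the bibtex entries are
served in your collection, so treat the URL as something to fill in.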
2. Combine all the information about one file into a single Greenstone
document. This way there is only one item in a search list or browse list per
(pdf/bibtex/annotation) combination. What you need to end up with after
importing is a Greenstone archive document with the text of the pdf as the
content, and the other bits as metadata: all the bibtex fields, the link to
the original pdf if you want that available, etc. What metadata you will need
depends on what you want to be able to search on and display. If you only
want to display the entire bibtex record, and do no searching on the fields,
you could add the record as a single metadata element. If you add each field
as a separate metadata element, then you can search and browse by any of the
fields.
There are two ways I can think of to achieve this.
A. Convert the bibtex records (and annotations) into metadata.xml files,
relating the appropriate data to each pdf document. Then use only the pdfs and
the metadata files in the collection.
B. Write a plugin that somehow joins all the bits together. E.g. you could
modify the bibtex plugin to look for the pdf document, do the conversion to
html and add that as content. Or modify the pdf plugin to look for a bibtex
entry and add that as metadata.
Somewhere there will need to be something that matches each pdf file to the
appropriate record in the bibtex file.
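For approach A, a minimal Python sketch of the conversion might look like the
following. It is only an illustration under some loud assumptions: each pdf
is named after its bibtex citation key (smith2003.pdf for @article{smith2003,
...}), every field sits on one line, and each entry's closing brace is on its
own line. The regex parser is deliberately naive and is not a full bibtex
parser; the function names and sample record are made up.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Naive patterns: one field per line, entry closed by "}" on its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s*=\s*[{\"](.*?)[}\"]\s*,?\s*$", re.MULTILINE)


def parse_bibtex(text):
    """Return {citation key: {field name: value}} for simple bibtex text."""
    entries = {}
    for _entry_type, key, body in ENTRY_RE.findall(text):
        entries[key] = {
            name.lower(): value for name, value in FIELD_RE.findall(body)
        }
    return entries


def build_metadata_xml(pdf_names, entries):
    """Attach each record's fields to the pdf whose name matches its key."""
    root = ET.Element("DirectoryMetadata")
    for pdf_name in pdf_names:
        fields = entries.get(Path(pdf_name).stem)
        if fields is None:
            continue  # no bibtex record shares this pdf's name
        fileset = ET.SubElement(root, "FileSet")
        # FileName is treated as a regular expression, so escape the dot.
        ET.SubElement(fileset, "FileName").text = pdf_name.replace(".", r"\.")
        description = ET.SubElement(fileset, "Description")
        for name, value in fields.items():
            # One metadata element per field, so each can be searched on.
            ET.SubElement(description, "Metadata", {"name": name}).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample record, for illustration only.
SAMPLE = """
@article{smith2003,
  title = {An Example Paper},
  author = {A. Smith},
  year = {2003}
}
"""

entries = parse_bibtex(SAMPLE)
xml_text = build_metadata_xml(["smith2003.pdf", "orphan.pdf"], entries)
print(xml_text)
```

Matching on the citation key is just one possible convention; a field in the
bibtex record that names the pdf would work equally well. Pdfs with no
matching record (orphan.pdf here) are simply skipped.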
So there you go. Lots of ideas - I hope you can come up with something that
suits your needs.
Regards,
Katherine Don
"Yao, Jixian (Research)" wrote:
Hi,
I am trying to build my lab's digital library with mainly pdf files.
Most pdf files don't have the fields that we want to search, i.e. Abstracts,
Notes, etc. We also have bibtex info for these pdf files. The bibtex file
has a lot more information about the pdf file. I'd like to build a
collection that combines both pdf and biblio info (the pdf alone is
searchable), so that when a user search, say, a title, it'll display the
biblio record, and also has a link pointing to the pdf file.
Greenstone seems to treat .bib and .pdf separately unrelated, and even I add
an URL in bibtex file, it would not display as a link but a plain text (not
clickable).
Any suggestions would be appreciated. (I am new to Greenstone)
Thank,
Jason
rfergu@music.mcgill.ca wrote:
Dear List,
I have several pdfs of papers, their bibtex entries, and some annotations I wrote. I would like to
keep these all as a greenstone database (so I can search the files and view them as html). I have
seen people periodically post similar interest on this list.
I would prefer to not reinvent the wheel. Has anyone done this, and mind showing me your
example?
Kind Regards,
Robert Ferguson