Re: [greenstone-users] More information about [RelatedDocuments] Formatstring item

From Michael Dewsnip
Date Fri, 03 Mar 2006 17:08:16 +1300
Subject Re: [greenstone-users] More information about [RelatedDocuments] Formatstring item
In-Reply-To (44037915-8090207-dlconsulting-co-nz)
Hi,

I've managed to find out a bit more about the "RelatedDocuments"
functionality. It was done at the start of 2001 by a student working for
the NZDL project during the summer. The "related documents" were
determined by generating keyphrases (using Kea) for each document and
seeing how close the keyphrases were. Two documents with very similar
keyphrases are marked as "related".
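
A minimal sketch of that idea, for illustration only. It is not the original code: the similarity measure (Jaccard overlap of keyphrase sets), the 0.5 threshold, and the function names are all assumptions, and the keyphrases are taken as already extracted rather than produced by calling Kea.

```python
def jaccard(first, second):
    """Overlap of two keyphrase collections: |A & B| / |A | B| (assumed measure)."""
    a, b = set(first), set(second)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_documents(keyphrases, threshold=0.5):
    """Map each document ID to the IDs whose keyphrases are similar enough.

    `keyphrases` maps a document ID to its list of keyphrases.
    `threshold` is a hypothetical cut-off, not a value from Greenstone.
    """
    ids = sorted(keyphrases)
    related = {doc_id: [] for doc_id in ids}
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if jaccard(keyphrases[first], keyphrases[second]) >= threshold:
                related[first].append(second)
                related[second].append(first)
    return related


if __name__ == "__main__":
    sample = {
        "doc1": ["digital library", "metadata", "indexing"],
        "doc2": ["digital library", "metadata", "search"],
        "doc3": ["sheep farming"],
    }
    print(related_documents(sample))
```

Comparing every pair of documents is quadratic in the collection size, which is fine for a sketch but would need rethinking for a large collection.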

This code still exists in Greenstone, but it was written for a very
specific situation and was never generalised or tidied up (it won't run
on Windows, for example). It also included some funky code for adding
links into PDF files which is unlikely to work now. In short, it's a
long way from being usable, and it's been suggested that we just scrap
it. Given that you're the first person to ask about it in 5 years, it
doesn't seem unreasonable! Even if we do decide to fix it up, it's
unlikely to happen for a while.

The other option is that we forget about the code for automatically
determining the related documents, but let people specify these manually
by adding "dc.Relation" metadata. This would be pretty tedious, and you
would definitely want to assign your own document IDs to avoid entering
Greenstone's long HASH IDs, but it would be usable.
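
If you maintain the relations by hand, a small helper can at least keep the metadata values consistent. The sketch below assumes the value format described in the FAQ text quoted further down ("a space separated list of collection,OID pairs") and assumes each pair is written literally as collection,OID with a comma; neither that detail nor the function names come from the Greenstone source, and the document IDs shown are hypothetical self-assigned ones.

```python
def format_relation(pairs):
    """Join (collection, oid) pairs into one space-separated metadata value."""
    parts = []
    for collection, oid in pairs:
        for field in (collection, oid):
            # Spaces and commas are the assumed separators, so reject them in fields.
            if not field or "," in field or any(ch.isspace() for ch in field):
                raise ValueError(f"invalid collection or OID: {field!r}")
        parts.append(f"{collection},{oid}")
    return " ".join(parts)


def parse_relation(value):
    """Split a metadata value back into (collection, oid) pairs."""
    pairs = []
    for item in value.split():
        collection, sep, oid = item.partition(",")
        if not sep or not collection or not oid:
            raise ValueError(f"expected collection,OID but got {item!r}")
        pairs.append((collection, oid))
    return pairs


if __name__ == "__main__":
    value = format_relation([("demo", "report01"), ("demo", "report02")])
    print(value)
    print(parse_relation(value))
```

Short self-assigned IDs such as report01 keep these values readable, which is the point of avoiding the long HASH IDs.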

Regards,

Michael

Richard Managh wrote:

> Hi Ruben,
>
> The runtime code is looking for "relation" metadata, not "dc.relation"
> metadata. As far as I know, the GLI doesn't support adding a metadata
> item called "relation" to documents. I've talked with the developers
> and they have said that they will add the ability for you to use
> dc.relation metadata in the next released version of Greenstone.
>
> One way you could tackle this problem would be to explode your
> CDS/ISIS database file using the GLI, if it is an explodable database.
> You can test this by right-clicking on the database file in the gather
> pane and checking whether a pop-up menu item "explode" is available. If it is, you
> can "explode" the database into a directory that will be named the
> same as the database. There will be something like a sequence of
> 000.nul - XXX.nul files in that directory along with a metadata.xml
> file. You could try editing this file and adding the relationships to
> a "relation" metadata item (that you create) to the relevant records
> listed in the metadata.xml file.
>
> By the way, your examples of contents of the relation metadata seem
> fine. To answer your last question, Greenstone creates relations on
> documents with "relation" metadata. I can tell this because if you
> have the source code, in the src/recpt directory there is a function
> "get_related_docs" in formattools.cpp that deals with "relation" metadata.
>
>
> Richard.
> -
> DL Consulting
> Greenstone Digital Library and Digitisation Specialists
> contact@dlconsulting.co.nz
> www.dlconsulting.co.nz
>
>
> ruben pandolfi wrote:
>
>> Thank You Richard,
>>
>> I have seen the faq, but I can not get the relation to show up in the
>> output and I am not sure what to put into relation metadata.
>>
>> This is what I have done:
>>
>> 1 - inserted [RelatedDocuments] variable in "DocumentText" feature
>>
>> 2 - inserted data into the dc.relation field (I'm using dublin core
>> metadata at the moment, but very likely I will add a new metadata set)
>>
>> No matter what I put into dc.relation, I always get:
>>
>> .. no related documents ..
>>
>> I have tried with OID and collection name like this :
>>
>> collectionname HASH011179dd1f0c321aec510320 HASH01f51d867137ebcef699710c
>>
>> or
>>
>> HASH011179dd1f0c321aec510320 HASH01f51d867137ebcef699710c
>>
>> or
>>
>> HASH011179dd1f0c321aec510320
>>
>>
>> Can you give me an example of how to make it work?
>>
>>
>> Also, how can gsdl know which metadata to look for in order to build
>> relations?
>>
>> Thank you and kind regards,
>>
>>
>>
>>
>>
>> Richard Managh wrote:
>>
>>> Hi Ruben,
>>>
>>> From the faq page:
>>>
>>> http://www.greenstone.org/cgi-bin/library?a=p&p=faqcustomize
>>>
>>> [RelatedDocuments] Related Documents info (if available). This is a
>>> vertical list of Titles (or Subjects if Titles aren't available)
>>> that link to the related documents. It is based on "relation"
>>> metadata, which is a space separated list of collection,OID pairs.
>>>
>>> Currently in the supplied Greenstone plugins there is no way of
>>> adding the "relation" metadata item to documents when they are
>>> imported. One way of doing this in your case would be to modify the
>>> ISIS plugin so that it does this, but that would require expertise in
>>> Perl programming. There are runtime functions to handle this
>>> "relation" metadata.
>>>
>>>
>>> Regards,
>>>
>>>
>>> Richard
>>> -
>>> DL Consulting
>>> Greenstone Digital Library and Digitisation Specialists
>>> contact@dlconsulting.co.nz
>>> www.dlconsulting.co.nz
>>>
>>>
>>> ruben pandolfi wrote:
>>>
>>>> Hi,
>>>>
>>>> on the page:
>>>>
>>>> I have found that a [RelatedDocuments] Formatstring item is available.
>>>>
>>>> I could not find any more info about it in the ML archives. Could you
>>>> point me to some more docs?
>>>>
>>>> Would it be possible to import relations from a cds/isis file using
>>>> ISISplug?
>>>>
>>>> Thanks!
>>>>
>>>> ruben
>>>>
>>>>
>>>>
>>>>
>>>
>>