[greenstone-users] Re: Fwd: Tests with Luigi

From Greenstone Team
Date Thu Aug 11 21:58:35 2011
Subject [greenstone-users] Re: Fwd: Tests with Luigi
In-Reply-To (4E3FA94F-5090301-free-fr)
Hi John

> I guess I could try installing on a second computer at home, but
> waiting for your reply.
I tried it here with the Windows Vista machine set up as the Greenstone
OAI server and the linux machine trying to download stuff over OAI from
that server.
I have not tried it on the same machine.

> http://localhost:2028/greenstone/cgi-bin/oaiserver.cgi
Could that be because that port number is in the restricted port range?
Could you try again with another port number? You can set your server to
another port via the File > Settings menu of the small Greenstone
Server Interface dialog that appears when you run GLI or the
gs2-server script.
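
To check whether the OAI server answers at all on a given port, before trying
a harvest from GLI, something like the following Python sketch may help. It
sends the standard OAI-PMH "Identify" request to a base URL. The function
names are made up for illustration, and the second URL (port 8080) is only a
hypothetical example of "another port", not a recommended value.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def identify_url(base_url: str) -> str:
    """Build the OAI-PMH Identify request URL for an OAI base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def server_answers(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if base_url replies to Identify with an OAI-PMH response."""
    try:
        with urlopen(identify_url(base_url), timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers refused connections, timeouts and HTTP errors
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False
    # A successful Identify response contains an <Identify> element.
    return "<Identify" in body


if __name__ == "__main__":
    candidates = [
        # URL from this thread
        "http://localhost:2028/greenstone/cgi-bin/oaiserver.cgi",
        # hypothetical alternative port, for illustration only
        "http://localhost:8080/greenstone/cgi-bin/oaiserver.cgi",
    ]
    for url in candidates:
        print(url, "->", "answers" if server_answers(url) else "no answer")
```

If the server answers here but GLI still reports no information, the problem
is more likely in the harvest settings than in the port.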

> I also see no gs.OAI Resource URL metadata in the Enrich panel after
> I build a collection. If it is not shown to the collection builder, then
> hard to see why the metadata name is in the panel.

This is user specified information: you will need to fill it in with a
URL. Greenstone will use that URL to override the default URL to the
document that it would have generated. Try typing Google's full address
in it for instance, and then building the collection.

> Also when I add the following line to Vlist (between the last td
> brackets):
> <br>OAI url: [gs.OAIResourceURL]<br>dc.Identifier: [dc.Identifier]
> I get the text ("OAI url:" and "dc.Identifier") displayed but no url
> metadata.

Try entering some values for gs.OAIResourceURL and/or dc.Resource
Identifier in the Enrich pane and rebuilding the collection
first, so that the newly assigned metadata gets processed. Then check once
more to see if the metadata displays as specified in your format
statement. If it still doesn't appear, please write back.

> The above notwithstanding, I would like you to confirm my present
> understanding of gs.OAI Resource URL:
> * This is an internal mechanism to enable GLI to keep track of user
> preferences for OAI document location presentation.
> * It has no impact on the end-user's vision of the document through OAI
> harvest (i.e. whether the url comes from an actual dl.Identifier or a
> gs.OAIResourceURL or both, it will appear to the user as
> oai_dc.Identifier. Right?

If you're writing a manual and want my confirmation, I will have to show
Dr Bainbridge your statements to get them approved. However, if these
are not official statements, just descriptions you want to check against
my own understanding to see whether you have grasped
gs.OAIResourceURL, then I think you have.
For the first point, I would additionally say that it is a field for
overriding the default Greenstone generated URL to a source document. It
does not override any URLs you may type in dc.Resource Identifier, and
the OAI spec does not seem to specify any ordering or any way to work
out ordering.
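
The behaviour described in this thread can be summarised in a small Python
sketch. This is only a model of what was observed when viewing OAI records,
not Greenstone's actual code: the function and parameter names are invented
for illustration, and the output order (dc.Resource Identifier values first)
is an assumption, since no ordering is guaranteed.

```python
from typing import List, Optional


def oai_identifiers(
    dc_identifiers: List[str],
    oai_resource_url: Optional[str],
    generated_url: str,
) -> List[str]:
    """Model which identifier values show up in a document's OAI record.

    dc_identifiers   -- values typed into dc.Resource Identifier (may be empty)
    oai_resource_url -- value typed into gs.OAI Resource URL, or None if unset
    generated_url    -- the URL Greenstone generates by default
    """
    # gs.OAI Resource URL replaces only the automatically generated URL.
    document_url = oai_resource_url if oai_resource_url else generated_url
    # dc.Resource Identifier values are always kept alongside it.
    # (Ordering here is an assumption made for this sketch.)
    return list(dc_identifiers) + [document_url]


if __name__ == "__main__":
    generated = "http://localhost/greenstone/generated-link"  # placeholder value
    # Two dc values, no override: all three identifiers appear.
    print(oai_identifiers(["http://www.google.com", "some-isbn"], None, generated))
    # Override set: the generated link is dropped, the dc value is kept.
    print(oai_identifiers(["some-isbn"], "http://www.google.com", generated))
```

In other words, the generated URL is the only value that ever gets dropped,
and only when gs.OAI Resource URL has been filled in.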


Take care,
Anupama


John Rose wrote:
> Thanks for this Anupama.
>
> I am beginning to understand a bit better. I have downloaded caveat
> emptor 2.85 (Linux version, still under Ubuntu 10.04), and tried to
> use GLI to harvest OAI data from one just-built collection into
> another new one on the same computer. This does not seem to work, says
> something like no information on the server when I specify
> http://localhost:2028/greenstone/cgi-bin/oaiserver.cgi , could you
> advise?
>
> So I am as of yet unable to fully test this version since there are no
> operational OAI 2.85 servers on the net - I guess I could try
> installing on a second computer at home, but waiting for your reply.
>
> I also see no gs.OAI Resource URL metadata in the Enrich panel after I
> build a collection. If it is not shown to the collection builder, then
> hard to see why the metadata name is in the panel.
>
> Also when I add the following line to Vlist (between the last td
> brackets):
> <br>OAI url: [gs.OAIResourceURL]<br>dc.Identifier: [dc.Identifier]
> I get the text ("OAI url:" and "dc.Identifier") displayed but no url
> metadata.
>
> Is the above normal?
>
> The above notwithstanding, I would like you to confirm my present
> understanding of gs.OAI Resource URL:
> * This is an internal mechanism to enable GLI to keep track of user
> preferences for OAI document location presentation.
> * It has no impact on the end-user's vision of the document through
> OAI harvest (i.e. whether the url comes from an actual dl.Identifier
> or a gs.OAIResourceURL or both, it will appear to the user as
> oai_dc.Identifier. Right?
>
> Thanks and best regards, John
>
> On 05/08/2011 07:47, Greenstone Team wrote:
>> Hi John,
>>
>> This message is in response to your OAI questions where pure Dublin Core
>> and CDS/ISIS collections are concerned.
>> The outcomes of both are dependent on the same things.
>>>
>>>>
>>>>
>>>> 2. The other part of your question that still remains is whether any
>>>> multiple metadata for dc.Resource Identifier should all be retained.
>>>>
>>>> Just now, when trying out a few things, we found that the current
>>>> behaviour is as follows:
>>>> - if you had a manually assigned value in gs.OAI Resource URL, then any
>>>> automatically generated one is ignored
>>>> - if you have a manually assigned value in gs.OAI Resource URL AND you
>>>> have one or more manually assigned values in dc.Resource Identifier,
>>>> then ALL of them are visible.
>>>> - if you have one or more manually assigned values in the "dc.Resource
>>>> Identifier" field, but none in "gs.OAI Resource URL", then the
>>>> automatically generated Greenstone metadata is not discarded, since this
>>>> is a URL to the original document (or it is the URL to the
>>>> Greenstone-generated HTML version of the document, if there was no
>>>> original document to link to).
>>> I am worried about the users of CDS/ISIS compatible software who use
>>> their bibliographic databases to create Greenstone collections. In the
>>> CDS/ISIS to Greenstone guidelines
>>> (http://wiki.greenstone.org/wiki/gsdoc/others/CDS-ISIS_to_DL.doc), we
>>> explain in chapter 2 how to explode a CDS/ISIS database and
>>> automatically insert the fixed url from a CDS/ISIS field into a field
>>> called exp.xxx [where xxx is the name of the CDS/ISIS field with the
>>> document url]. I understand that if they begin to serve under OAI,
>>> they should in the oai.cfg file map this field (if it exists) into
>>> gs.OAI Resource URL (and not into dc.Resource Identifier). Could you
>>> verify that this mapping works?
>>>
>>> On the other hand, what about users who set up pure Dublin Core
>>> digital libraries with a fixed url in dc.Resource Identifier. I guess
>>> that they would then have to map dc.Resource Identifier into gs.OAI
>>> Resource URL in the oai.cfg file? (Could you verify that this works,
>>> i.e. that you would not get duplicate Resource Identifiers through
>>> OAI, even if the dc.Resource Identifier field has more than one
>>> occurrence, e.g. a fixed url and an ISBN?)
>>
>> The confusion may be due to my having overstressed the application of
>> the "gs.OAI Resource URL" field in my previous message. To clarify:
>> setting this field only overrides the URL automatically generated by
>> Greenstone (either to the original doc, if any, or else to the GS
>> generated html for it). However, gs.OAI Resource URL does not override
>> any dc.Resource Identifier values specified, but becomes a "Resource
>> Identifier" field shown by the OAI server *in addition to* assigned
>> dc.Resource Identifier values.
>>
>> Refer to the 3rd point in my last email on this, repeated below:
>> - if you have one or more manually assigned values in the "dc.Resource
>> Identifier" field, but none in "gs.OAI Resource URL", then the
>> automatically generated Greenstone metadata is not discarded, since this
>> is a URL to the original document (or it is the URL to the
>> Greenstone-generated HTML version of the document, if there was no
>> original document to link to).
>>
>> I just tried out a "pure DC collection" with a single PDF document for
>> which I assigned all dc.* fields, and 2 for dc.Resource Identifier: a
>> URL to google and a random ISBN number. Upon building this collection
>> and viewing its OAI record, the Resource Identifier fields shown for the
>> PDF document were the google URL and ISBN number that I assigned in the
>> 2 dc.Resource Identifier fields and a third one: the Greenstone
>> generated link to the original PDF document.
>>
>> Therefore, the CDS/ISIS documentation should also work as before,
>> requiring the usual mapping from the exploded metadata field
>> (exp.xxx) containing the document URL to dc.Identifier (for dc.Resource
>> Identifier) and not any mapping to gs.OAIResourceURL (gs.OAI Resource
>> URL).
>>
>> Mapping dc.Resource Identifier to gs.OAI Resource URL will not work
>> anyway, as oaimappings are defined to map from any metadata set defined
>> in Greenstone (such as dls.*, gs.*, exp.*) to one of a specific set of
>> metadata-sets (oai_dc, gsdl_qdc, rfc1807), as explained in the comments
>> in oai.cfg. So it doesn't allow oaimappings from dc.* to gs.*.
>>
>> This is also not necessary in the cases you brought up for
>> consideration, since dc.Resource Identifiers are preserved by Greenstone
>> and not overridden by gs.OAI Resource URL (it coexists beside any
>> assigned dc.Resource Identifiers, overriding only Greenstone's
>> automatically generated URL to the document).
>>
>> I hope the above is not too confusing and answers your questions.
>> Regards,
>> Anupama