[greenstone-users] Re: Fwd: Tests with Luigi

From Greenstone Team
Date Mon Aug 15 18:29:55 2011
Subject [greenstone-users] Re: Fwd: Tests with Luigi
In-Reply-To (4E44E92F-4030902-free-fr)
Hi,

John Rose wrote:
> Hi Anupama,
>
> I don't remember or know how 2828 was originally assigned. Maybe I
> confused with 8282 which seems to have been assigned by default for
> the Greenstone server. Anyway, all (Greenstone server, preference in
> GLI, OAI server) at 8282, and the OAI server works fine from the
> browser (which it didn't with 2828). So let's just forget 2828.
>
> But the url
> http://localhost:8282/greenstone/cgi-bin/oaiserver.cgi?verb=ListSets
> works in the browser even if in OAI.cfg we have the setting:
> baseServerURL "http://localhost:2828"
> This is confusing - maybe the OAI.cfg setting is useless?
(I'll assume you mean your baseServerURL property refers to 8282 and
that the 2828 above is a typo.)
I have not set this field in my oai.cfg file, so it remains commented
out. The note above the baseServerURL line in oai.cfg says that
baseServerURL will be generated automatically if it is not specified,
which is what happened in my case.

>>
>>
>> > Since I did not have anything in gs.OAIResourceURL, a value should
>> have been assigned, no?
>>
>> I may be misunderstanding what you are saying, but it seems to me like
>> you have nothing for the gs.OAI Resource URL metadata field in Enrich
>> Pane? This value is not automatically assigned by Greenstone, the user
>> should type something into the Enrich panel for this field.
>
> I guess that there has been a misunderstanding. I thought that if the
> gs.OAI Resource URL metadata field is blank, then Greenstone would put
> the internal url in this field. What you are saying is that Greenstone
> then puts the internal url somewhere which is not accessible to the
> user, but comes out as oai_dc.Identifier in the OAI record. Could you
> please confirm this? [Logically I would think it would be better for
> GLI to put the internal url into the blank gs.OAI Resource URL after
> the collection is built - but not essential as long as all is clear.
> What you are saying is that Greenstone then puts the internal url
> somewhere which is not accessible to the user, but comes out as
> oai_dc.Identifier in the OAI record.

Yes. I don't know that Greenstone "puts" the internally generated URL
anywhere, but it always automatically generates a URL to the original
source document. The OAIServer part of the Greenstone code then uses
this URL to generate the value for the dc.Identifier field that you see
in the OAI record (unless the user prefers another URL, which they can
provide in the gs.OAI Resource URL metadata field).
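
To illustrate that fallback, here is a minimal Python sketch. It is not
Greenstone's actual code: the function name choose_oai_identifier, the
dictionary layout and the key spelling "gs.OAIResourceURL" are
assumptions made purely for illustration.

```python
def choose_oai_identifier(metadata, generated_url):
    """Pick the value to expose as dc.Identifier in the OAI record.

    metadata      -- dict of user-assigned metadata (hypothetical layout)
    generated_url -- the URL to the source document that Greenstone
                     generates automatically
    """
    # Treat a missing or whitespace-only field as "not assigned".
    user_url = (metadata.get("gs.OAIResourceURL") or "").strip()
    # A user-supplied URL takes precedence; otherwise fall back to the
    # automatically generated one.
    return user_url if user_url else generated_url
```

With an empty gs.OAI Resource URL field the generated URL is returned
unchanged, which matches the "do whatever Greenstone does by default"
behaviour described here.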

> Logically I would think it would be better for GLI to put the
> internal url into the blank gs.OAI Resource URL after the collection is
> built - but not essential as long as all is clear.

As far as I know, "gs.*" metadata is Greenstone-specific metadata that
is assigned by the user (as opposed to, for instance, ex.* metadata,
which is extracted by Greenstone and is not editable). Users are not
required to assign values to any gs.* metadata field; if they leave one
blank, it is assumed they want Greenstone to do whatever it does by
default. GLI does not show defaults for other gs.* metadata either (if
there are any; gs.* is often optional additional metadata).

> This fails, but normal since the validation site surely cannot check a
> local host without a fixed IP number.

> Here's what I have written down as the steps I followed to get my OAI
> Server up
> and running (stop the GS server first):
>
> a) To get GS2 OAIServer to work, edit etc/oai.cfg to provide values for
> the following properties, with sample values filled in below:
> - repositoryName "Greenstone"
> - repositoryId "greenstone"
>
> Also add in any collections to be served over OAI. Add such collections
> by name to the "oaicollection" property. For example, if collection
> "oaipdf" should be served over OAI, then append its name to the property
> oaicollection as follows:
>
> oaicollection demo documented-examples/oai-e oaipdf
>
> b) For each collection meant to be served over OAI, edit the
> collection's etc/collect.cfg by filling in necessary data:
>
> - creator <type email>
> - maintainer <type email>
>
> c) In your Greenstone Server dialog (the little white one with the Enter
> Library button), go to File > Settings and turn on the Allow External
> Connections option so that the openarchives' validation site can connect
> to it. Don't forget to restart the server before trying to get it
> validated.
>
> Still fails saying "Server at base URL
> 'http://localhost:8282/greenstone/cgi-bin/oaiserver.cgi' failed to
> respond to Identify." But repeating that it works fine with all the
> verbs working locally with the browser. I believe that this validation
> cannot be done without a real IP number?

That's right, your hostname cannot be "localhost" if the site needs to
be accessed from the outside. There's one more step for you to carry out
before attempting to get your OAI Server validated at the openarchives site:
d) In the Greenstone Server dialog's File > Settings dialog, choose
either "Get local IP and resolve to a name" or "Get local IP". Press OK
and restart the server to apply the changes. Then enter the new URL to
oaiserver.cgi (which no longer uses "localhost") in the openarchives
validator.
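
Before submitting a URL to the validator, a quick check like the
following Python sketch can catch a "localhost" base URL. The helper
names and the 192.168.1.20 address are made up for illustration; only
the oaiserver.cgi path and the verbs come from this thread.

```python
from urllib.parse import urlencode, urlsplit

# Host names that can only ever refer to the machine itself.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def oai_request_url(base_url, verb):
    """Build an OAI request URL such as .../oaiserver.cgi?verb=Identify."""
    return base_url + "?" + urlencode({"verb": verb})


def uses_loopback_host(base_url):
    """Return True if the URL's host is a loopback name or address."""
    host = urlsplit(base_url).hostname
    return host is None or host in LOOPBACK_HOSTS


if __name__ == "__main__":
    base = "http://localhost:8282/greenstone/cgi-bin/oaiserver.cgi"
    print(oai_request_url(base, "Identify"))
    if uses_loopback_host(base):
        print("This URL will not be reachable by an external validator.")
```

This only inspects the host name in the URL. It does not test whether
the server actually answers from outside, which also depends on the
Allow External Connections setting from step (c).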

>
>
> P.S. Can you tell me why the default Apache port setting in Greenstone
> sometimes is 8282 and sometimes is 8283? This is inconvenient since if
> it changes to 8283 the OAI server fixed at 8282 in oai.cfg no longer
> works.
>
When restarting the server, Greenstone checks whether the port is
available before assigning it for use again. If it is unavailable,
Greenstone increments the port number and tries the new port.
This can happen when, during a restart, the old port is still active for
a brief time after the server stop signal is sent, so that Greenstone
resorts to finding another free port to restart the server on.
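
The general "try the port, else increment" technique can be sketched in
Python as follows. This is only an illustration of the idea, not the
Greenstone Server code; the function name find_free_port and the
max_tries limit are assumptions.

```python
import socket


def find_free_port(preferred, max_tries=10, host="127.0.0.1"):
    """Return the first port at or above `preferred` that can be bound."""
    last = min(preferred + max_tries, 65536)  # stay within valid ports
    for port in range(preferred, last):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # something is already using it, try the next
            return port
    raise RuntimeError("no free port found from %d" % preferred)


if __name__ == "__main__":
    # If 8282 is still held by the old server, the next free port is used.
    print(find_free_port(8282))
```

Note that the probe socket is closed again before the port is returned,
so another process could still grab the port in between. That is the
same kind of timing dependence described in this message.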

The recently updated Greenstone Server Interface code has been adjusted
to work around this problem better, by providing a "do not modify port"
option. But it still cannot guarantee that the server will run on the
requested port unless nothing is listening there at the very moment
Greenstone tries for that port. In the case of a conflict over the
chosen port, I think it will fail to find a working port and you will
have to restart the server again. I did not encounter that problem when
testing the code changes, but since a lot of the server (and GLI)
interaction is time-dependent, in that it checks after some time whether
the server has started running or has stopped, this is a conceivable
result.


> P.P.S. I really do not understand the rationale for using gs.OAI
> Resource URL instead of dl.Resource Identifier to decide whether to
> generate an internal url. Let's say that someone has their real urls
> in dl.Resource Identifier (normal Dublin Core collection) and
> Greenstone generates a second one because gs.OAI Resource URL is
> blank, very inconvenient. I guess that the user would have to do a
> mapping in oai.cfg from dl.Resource Identifier to gs.OAI Resource URL
> to stop generating the spurious extra url? It would seem much more
> sensible to me to simply put in a parameter in oai.cfg specifying the
> collections for which internal urls should be generated (whether or
> not there is something in dl.Resource Identifier)? Anyway, no reason
> to change now, that all is in order for the next release.

My understanding is that Greenstone was *always* providing a default
(internally-generated) URL to the source document, even before gs.OAI
Resource URL was invented. If so, it is not because the user left the
gs.OAI Resource URL field blank that Greenstone generates a URL in
addition to anything the user specified in dc.Resource Identifier
(remember that dc.Resource Identifier need not contain any URLs at all,
but can contain other identifiers). I will have to confirm this
supposition with Dr Bainbridge, but did not get the opportunity today.

gs.OAI Resource URL was devised to solve a particular problem, possibly
upon user request. It does not interfere with general Greenstone
functioning. Until you assign something to it, Greenstone will work as
it always did.

I will get back to you on the Linux to Linux OAI server test that you
requested.

Regards,
Anupama