[greenstone-users] Re: One more thing - Re: [greenstone-devel] Problem with GLI, plugin and non-ascii metadata

From Anupama of Greenstone Team
Date Mon Mar 22 20:35:37 2010
Subject [greenstone-users] Re: One more thing - Re: [greenstone-devel] Problem with GLI, plugin and non-ascii metadata
In-Reply-To (63CB844B15B0D142BF448D6925EFE4AB08EE9A15-mermoz-st-etienne-archi-fr)
Hi,

I talked to my supervisor today about the problem, telling him how the
OGV file arrived unmodified on the server side when the GLI-client
transferred it there (because I could build a new collection on the
server side using the transferred OGV file, and its metadata looked good).

After looking at the situation, he has come to suspect that the issue
may lie with the settings for the Apache web server. Apache runs the
gliserver.pl script, which is the remote Greenstone server script that
interfaces with the client-GLI application. And it's the gliserver.pl
script that launches the various Perl scripts (such as import.pl and
buildcol.pl) used in the remote case. It runs them in a way very
similar to how we would launch the scripts from the command line:
perl -S <scriptname> <arguments>

The present suspicion is that if Apache's environment does not specify
the right encoding for running Perl, then the Perl output would not
be correct either. This could explain the disparity between the output
when the scripts are run locally in a terminal where we control the
environment settings (such as running scripts manually or through GLI)
and when scripts are run remotely through GLI-client sending requests to
the gliserver.pl on the Greenstone server end running the Apache web server.

If this is indeed the case, the question is how to find out what
encoding Apache's environment is set to at the moment, and then, how to
control such a setting for our Perl scripts (or better: how gliserver.pl
can control the encoding used by the Perl scripts that it executes).
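
One way to approach both questions could be sketched in Perl as below.
This is only an illustration under assumptions, not code taken from
gliserver.pl: LANG, LC_ALL and LC_CTYPE are the standard POSIX locale
variables, while the sub names describe_locale_env and run_with_locale
and the locale value en_US.UTF-8 are made up for the example (the locale
would have to be installed on the server).

```perl
use strict;
use warnings;

$| = 1;    # flush our output before any child process writes its own

# Standard POSIX locale variables that influence how programs treat text.
my @locale_vars = qw(LANG LC_ALL LC_CTYPE);

# Report what the current process inherited. Run as a CGI script under
# Apache, this shows the web server's settings rather than a terminal's.
sub describe_locale_env {
    my @lines;
    for my $name (@locale_vars) {
        my $value = defined $ENV{$name} ? $ENV{$name} : '(unset)';
        push @lines, "$name=$value";
    }
    return join "\n", @lines;
}

# Run a command with an explicit locale in its environment.
# 'local' restores the previous %ENV entries when the sub returns.
sub run_with_locale {
    my ($locale, @command) = @_;
    local $ENV{LC_ALL} = $locale;
    local $ENV{LANG}   = $locale;
    return system(@command);    # list form: no shell involved
}

print describe_locale_env(), "\n";

# Hypothetical usage: the child perl prints the LC_ALL value it was given.
# 'en_US.UTF-8' is an assumed value, not a known Greenstone setting.
my $status = run_with_locale('en_US.UTF-8', $^X, '-e',
    'print "child LC_ALL=$ENV{LC_ALL}\n"');
print "child exit status: ", $status >> 8, "\n";
```

Comparing the first part of the output when the script is run from a
terminal and when it is run through Apache would show whether the two
environments really differ.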


> As for my offer to inspect the GLI code, I think I can do that.

Thanks for your offer to help, Arnaud. However, considering that the
problem seems to already occur on the server side before anything comes
back to GLI, perhaps there may not be any need to inspect the GLI code
after all.


> I've just tested your modified version and it works perfectly!
> As you noticed I'm not (yet?) a Perl expert. So I'm glad to learn and it's
> much better this way because it can work on other systems too (i.e. Windows).
> Thank you.

You're very welcome.

Regards,
Anupama.


arnaud yvan wrote:
> Hi Anupama,
>
> I've just tested your modified version and it works perfectly!
> As you noticed I'm not (yet?) a Perl expert. So I'm glad to learn and it's
> much better this way because it can work on other systems too (i.e. Windows).
> Thank you.
>
> As for my offer to inspect the GLI code, I think I can do that. I'm also
> interested in a way to link the remote GLI and an LDAP directory for
> authentication, but don't expect too much very soon as I don't know too much
> about Java programming!
>
> Thanks again and have a very good weekend at the other end of the world.
>
> Best regards,
> Yvan
>
> -----Original Message-----
> From: Anupama of Greenstone Team
> [mailto:greenstone_team@cs.waikato.ac.nz]
> Sent: Friday 19 March 2010 11:06
> To: arnaud yvan
> Subject: One more thing - Re: [greenstone-devel] Problem with GLI, plugin
> and non-ascii metadata
>
>
> Hi,
>
> I just sent another e-mail. Please read that too.
>
> In this message, I'm attaching an updated version of your plugin once
> again. This time:
>
> 1. it uses the Perl split function to split the results of executing
> the mediainfo command on the '|' field separator. (When I tried the same
> yesterday I forgot to escape the | so it didn't work then, and I had to
> resort to an ugly regular expression instead.) The split operation
> returns an array of all the tokenized strings. The code then stores the
> string elements of the returned array in a list of individually named
> and ordered variables like artist, duration. These are once again the
> variables you already invented.
>
> 2. I've also made the code use Perl to do the substitutions for artist and
> title metadata, instead of launching sed.
>
> The above two changes are only to show you how Perl can be used to
> produce the same results as awk (point 1 above) and sed (point 2). This
> is because Perl is a scripting language too: so it works a lot like bash
> scripting.
>
> Could you try this version of your MediainfoOGVPlugin on your data set,
> and tell me if it fails at any point? I don't want any minor changes I
> made to ruin the functioning of your plugin which you had already tested.
>
> Thanks again for your excellent efforts at writing a plugin. And thanks
> for bringing this remote GLI issue to our notice.
>
> Kind regards,
> Anu
>
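
To illustrate the two changes described in the quoted message above,
here is a minimal Perl sketch. It is not the code of the attached
plugin: the sample mediainfo output, the field order, and the
underscore-to-space substitution are assumptions invented for the
example.

```perl
use strict;
use warnings;

# Hypothetical mediainfo output with fields joined by '|'. The field
# order and the values are invented for illustration.
my $output = "Some_Artist|A_Title|00:03:25";

# split takes a regular expression, in which an unescaped | means
# alternation. Escaping it (\|) makes it match a literal pipe character.
my ($artist, $title, $duration) = split /\|/, $output;

# Substitution done in Perl instead of piping the value through sed.
# Replacing underscores with spaces is an assumed example of a cleanup.
$artist =~ s/_/ /g;
$title  =~ s/_/ /g;

print "artist=$artist\ntitle=$title\nduration=$duration\n";
```

Without the backslash, the pattern /|/ matches the empty string, so
split breaks the input into single characters instead of fields, which
would be consistent with the unescaped attempt not working.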