Re: [greenstone-users] Client/Server setup, AZList and Child nodes, Configuring AZList to ignore whole words.

From Katherine Don
Date Mon, 21 Mar 2005 09:53:08 +1200
Subject Re: [greenstone-users] Client/Server setup, AZList and Child nodes, Configuring AZList to ignore whole words.
In-Reply-To (s235ac58-046-westyorksfire-gov-uk)
Hi Jonathan

> 1. Is it possible to use Greenstone in a Client/Server set up? ie have
> the GLI running over a network and importing the files onto a different
> machine (the server). I might have read something to this effect for the
> Linux version but is it possible on Windows? The summary of the
> improvements with Version 2.53 refer to a 'more bandwidth friendly' GLI,
> is this referring to running Greenstone over a network?
You can run the GLI as an applet - this enables you to build collections
on a remote server machine. Instructions for setting up the applet can
be found at

> 2. My child nodes are not listing alphabetically even though I have
> configured it to do this in the GLI. Has anyone else experienced this
> problem? It appears in the collect file as listed below:
> classifyAZCompactList -metadata dc.Subject -minnesting 1
> -mincompact 1 -sort dc.Title -mingroup 1 -maxcompact 15
With this classifier, the classifier nodes should be sorted by
dc.Subject, and the child document nodes should be sorted alphabetically
by dc.Title. Have you changed the format statement at all? ie, you are
displaying dc.Title for the node?
By default, if the language of the documents is English, some
preformatting of the metadata is done before using it for sorting, eg
convert to lowercase, remove 'a', 'an', 'the' from the start, remove any
characters not a-z0-9. (see gsdl/perllib/,
format_metadata_for_sorting() )
If the language is not English, then nothing is done to the metadata.
You can check which language Greenstone thinks your documents are in by
looking at the extracted metadata after building - ex.Language. If you
want to override this, have a look at the plugin options
-extract_language, -default_language, -input_encoding
Note that the metadata is converted to lower case only if the language
is English. Otherwise case is kept, and capital letters sort before
lower case letters.
(eg the following is in sorted order: A, The, a, the.)
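
To make the effect concrete, here is a rough Python sketch of the kind of
preformatting described above. It is not Greenstone's actual Perl code: the
function name sort_key and the exact rules are assumptions based only on the
description (lowercase, drop a leading 'a', 'an' or 'the', drop anything that
is not a-z0-9).

```python
import re

def sort_key(value: str) -> str:
    """Hypothetical sketch of the English-only preformatting (an assumption)."""
    key = value.lower()
    # Drop one leading article, but only when it is followed by whitespace,
    # so that "Theatre" keeps its "the".
    key = re.sub(r"^(a|an|the)\s+", "", key)
    # Remove every character that is not a-z or 0-9.
    return re.sub(r"[^a-z0-9]", "", key)

titles = ["The Zebra", "apple", "Theatre Royal", "An Orchard"]

# No preformatting (as for non-English documents): capitals sort first.
plain_sorted = sorted(titles)

# With preformatting (as for English documents).
key_sorted = sorted(titles, key=sort_key)

print(plain_sorted)  # ['An Orchard', 'The Zebra', 'Theatre Royal', 'apple']
print(key_sorted)    # ['apple', 'An Orchard', 'Theatre Royal', 'The Zebra']
```

If your list looks like the first ordering, it is worth checking ex.Language
for the documents concerned.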

> 3. In the AZList I want to ignore the word 'The' but unfortunately this
> includes the start of 'Theatre' which means Theatre is listed with the
> titles starting with the letter A. Is there any way to ignore a whole
> word? I have tried entering a space afterwards but it has no effect.
See above. Also try using \b, which matches a word boundary, eg
-removeprefix "(The|the)\b"
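
A small Python sketch of why the word boundary helps. Greenstone itself uses
Perl regular expressions, but \b behaves the same way in Python's re module
for this case. The helper remove_prefix is hypothetical, and it assumes the
pattern is applied only at the start of the title.

```python
import re

def remove_prefix(title: str, pattern: str) -> str:
    # Assumption: the prefix pattern is anchored at the start of the title.
    return re.sub("^(?:" + pattern + ")", "", title)

# Without \b the pattern also eats the start of "Theatre".
print(repr(remove_prefix("Theatre Royal", r"(The|the)")))    # 'atre Royal'

# With \b, "The" must end at a word boundary, so "Theatre" is untouched.
print(repr(remove_prefix("Theatre Royal", r"(The|the)\b")))  # 'Theatre Royal'

# A real leading article is still removed (the following space is left).
print(repr(remove_prefix("The Theatre", r"(The|the)\b")))    # ' Theatre'
```

This would also explain why adding a space after the word had no effect, if
the space did not survive into the stored pattern.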

> Many thanks,
> Jonathan Pattison