I'm a relatively new Greenstone user from Kenya.
In the short time I've used Greenstone (around three weeks), I
have been very impressed by the indexing and classification that
At the outset of the project I am undertaking, I had a few hiccups
with the ProCite plugin, but I solved that by exporting the ProCite
data to comma-separated values. The CSV plugin comfortably processed
the more than five thousand records and the classifiers worked as
expected.
All was well until I started dealing with hierarchical
classification, not from a hfile but from structured metadata.
The separator regular expression worked like a charm in extracting
individual values from the metadata.
However, I have tried, to no avail, to modify the classifier to
detect the hierarchical nature of the metadata itself based on the type
and sequence of separators.
Though I haven't given up, I would appreciate any comments
regarding this problem. Maybe I have overlooked something really basic.
I have used regular expressions before and got the idea of the
separator structure immediately.
Separator regular expression
Striga/Stem borers/Maize/Biology, Entomology
In short, this means that the record under discussion can fall
under Entomology or under Biology/Maize/Stem borers/Striga. The comma
should evaluate to a top-level separator, while the forward slash should
separate the hierarchy levels within each group extracted from the
comma-separated values.
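
To make the intended two-level split concrete, here is a minimal
standalone sketch in Python. It is not Greenstone classifier code and
does not use any Greenstone option; the function name split_paths and
both separator patterns are my own assumptions, chosen only to show the
comma being applied before the forward slash.

```python
import re

def split_paths(value, top_sep=r"\s*,\s*", level_sep=r"\s*/\s*"):
    """Split a metadata value into groups on the comma first, then
    split each group into hierarchy levels on the forward slash.
    (Hypothetical helper; both patterns are illustrative assumptions.)"""
    paths = []
    for group in re.split(top_sep, value.strip()):
        # Drop empty pieces left by stray or doubled separators.
        levels = [part for part in re.split(level_sep, group) if part]
        if levels:
            paths.append(levels)
    return paths

paths = split_paths("Striga/Stem borers/Maize/Biology, Entomology")
for levels in paths:
    print(" > ".join(levels))
```

Each inner list is one branch, so the comma never ends up inside a
branch and the slash never separates two branches.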
My problem is how to instruct the hierarchical classifier to treat
the comma as a higher-precedence separator than the forward slash.
In this case, Entomology and Striga should be hierarchically
superior to Stem borers.
Presently, the classifier considers the hierarchy to be structured
in order of precedence from Striga to Entomology, which is not correct.
Clicking on Entomology should return a list of articles that have
Entomology as a keyword at the top level, along with the lower levels
that fall under Entomology in other records.
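
The browsing behaviour I am after could be sketched roughly as below,
again in plain Python rather than Greenstone code. The tree layout, the
add_record helper, the record ids rec1 and rec2, and the second
metadata value are all made up for illustration.

```python
def add_record(tree, record_id, value):
    """File a record under every comma-separated branch of its metadata.
    Each node is {"records": [...], "children": {...}} (assumed layout)."""
    for group in value.split(","):
        levels = [part.strip() for part in group.split("/") if part.strip()]
        children, node = tree, None
        for name in levels:
            # Create the node on first sight, then descend into it.
            node = children.setdefault(name, {"records": [], "children": {}})
            children = node["children"]
        if node is not None:
            # The record is listed at the deepest level of this branch.
            node["records"].append(record_id)

tree = {}
add_record(tree, "rec1", "Striga/Stem borers/Maize/Biology, Entomology")
add_record(tree, "rec2", "Entomology/Stem borers")  # hypothetical second record

entomology = tree["Entomology"]
print(entomology["records"])         # records with Entomology at the top level
print(list(entomology["children"]))  # lower levels found in other records
```

In this sketch, Entomology and Striga are both top-level nodes, and
Stem borers appears beneath whichever of them precedes it in a branch.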
My suspicion is that a change in the structure of the regular
expression should do the trick, and this is the path I'm currently
pursuing. I think my next step will be to try to nest the regular
expressions.
Could you be of help? A pointer in the right direction would be