Re: [greenstone-users] Problems in building subcollections

From schild
Date Wed, 08 Jun 2005 09:53:52 +0200
Subject Re: [greenstone-users] Problems in building subcollections
In-Reply-To (42A664BF-1060004-cs-waikato-ac-nz)
Hi Michael,

Thanks for your quick reply. Sorry for not specifying right away which build mode I am running for my collection (MG, MGPP, or Lucene); I should have thought of this. I am indeed using MGPP for the building, but I am considering changing to Lucene. Anyway, I inspected the code of mgbuildproc.pm and found the passage which you changed. I applied the same change to the corresponding passages in mgppbuildproc.pm and lucenebuildproc.pm, and now Greenstone behaves as expected. Problem solved! :-)

Many thanks,

Axel Schild

Michael Dewsnip wrote:
Hi Axel,

This is a bug in Greenstone -- only the first metadata value for the
metadata element is retrieved and checked against the subcollection
expression.
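
To illustrate the difference, here is a minimal Python sketch. It is not
the actual Perl code in mgbuildproc.pm; the function names and the
in-memory document table are made up for this example, and it only
handles plain "field/regex/" filters (no flags, no negation).

```python
import re

# Hypothetical stand-in for two documents' metadata; the order of values matters.
docs = {
    "doc1": {"dc.Subject": ["cars", "combustion engine"]},
    "doc2": {"dc.Subject": ["combustion engine", "cars"]},
}

def parse_filter(spec):
    """Split a 'field/regex/' filter string into (field, compiled regex)."""
    field, pattern = spec.split("/", 1)
    pattern = pattern.rsplit("/", 1)[0]  # drop the trailing slash
    return field, re.compile(pattern)

def matches_first_value(metadata, spec):
    """Behaviour described in the bug: only the first value is checked."""
    field, regex = parse_filter(spec)
    values = metadata.get(field, [])
    return bool(values) and regex.search(values[0]) is not None

def matches_any_value(metadata, spec):
    """Intended behaviour: a document matches if any of its values matches."""
    field, regex = parse_filter(spec)
    return any(regex.search(value) for value in metadata.get(field, []))

def subcollection(docs, spec, matcher):
    """Return the sorted names of the documents selected by the filter."""
    return sorted(name for name, md in docs.items() if matcher(md, spec))

print(subcollection(docs, "dc.Subject/cars/", matches_first_value))
print(subcollection(docs, "dc.Subject/cars/", matches_any_value))
```

With the first-value check, only doc1 ends up in the "cars"
subcollection; with the any-value check, both documents do.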

I have fixed this, and you can download an updated version of
perllib/mgbuildproc.pm from
http://www.cs.waikato.ac.nz/~mdewsnip/greenstone/temp/mgbuildproc.pm.
Replace your existing file, then rebuild the collection. (This is
assuming you're using MG -- let me know if you're using MGPP.)

The "partition indexes" code is rarely used, so don't be surprised if
you find more bugs!

Thanks for reporting this,

Michael



schild wrote:

  
Hi list,

Got a question concerning the *"partition indexes"* function of
greenstone. I want to build subcollections of documents based on
specific metadata values (for dc.Subject). Now all of my documents
have more than one dc.Subject value assigned to them (that means
multiple metadata values). When I use the partition indexes function
as outlined in the Developer's Guide or the User's Guide, each
subcollection only contains documents whose first dc.Subject value
matches the filter.
To explain what I mean, here is a short example:

Let's say the collection contains 2 documents with the metadata
assigned in the following order:

    doc1: dc.Subject = cars, dc.Subject = combustion engine
    doc2: dc.Subject = combustion engine, dc.Subject = cars

If I add the following line
    subcollection filterCars     "dc.Subject/cars/"

to collect.cfg, this subcollection only contains doc1, but not
doc2. The same problem arises for the line
    subcollection filterEngine     "dc.Subject/combustion engine/"

for which the subcollection only contains doc2, but not doc1. What I
want is for both documents to appear in both subcollections! My first
thought was to change the regular expression to achieve the desired
result (which I tried, without success). Then I tried something like
    subcollection filterEngine     "sibling:dc.Subject/combustion engine/"

which did not get me any further. Can one of the developers of
Greenstone tell me whether there is a solution to my problem, and if
so, what it is?

Many thanks,

Axel Schild

-- 
----------------------------------------------

Dipl.-Ing. Axel Schild
Automatisierungstechnik und Prozessinformatik
Ruhr-Universitaet Bochum IC-3/140
D-44780 Bochum, Germany

Tel: +49 234 32 25203
Fax: +49 234 32 14101
E-mail: schild@atp.rub.de
 

------------------------------------------------------------------------

_______________________________________________
greenstone-users mailing list
greenstone-users@list.scms.waikato.ac.nz
https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
 

    

  

