[greenstone-users] Plugin order MetadataCSVPlug and Maxdocs

From Stephen De Gabrielle
Date Mon, 6 Nov 2006 11:15:38 +0930
Subject [greenstone-users] Plugin order MetadataCSVPlug and Maxdocs

I've just started using MetadataCSVPlug (unmodified) for another
collection, and I've noticed a few things that may cause others angst:

When using MetadataCSVPlug and ImagePlug together:
- I had both my metadata.csv file and my images in the same folder.
- I started with ImagePlug before MetadataCSVPlug in the plugins list,
and had to move MetadataCSVPlug up.
- Even with MetadataCSVPlug ahead of ImagePlug in the plugins list, the
image files were still processed first, because they precede
metadata.csv in the alphabetical filesystem order of that shared folder.

Maxdocs notes:
- I had maxdocs set to 10, because I wanted quick import/build cycles
while I was fine-tuning the process.
- maxdocs counts the metadata.csv file as one document, so I ended up
processing 1812 metadata records but only associating 9 image files.
(At least it was quicker than running ImageMagick on all the images, not
that I could easily find the 9 I did process.)
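
The counting described above can be sketched with a toy model, written in Python rather than Greenstone's own Perl. It assumes maxdocs simply stops the import after N files and that files are visited in sorted name order. The function name `simulate_import`, the image names, and the one-image-per-record count are all made up for illustration, and this is not how Greenstone's import code is actually written.

```python
def simulate_import(filenames, maxdocs):
    """Return the files a maxdocs-limited import would process.

    Assumption: files are visited in sorted name order and every file,
    including the metadata file, counts as one document toward maxdocs.
    """
    return sorted(filenames)[:maxdocs]


# Hypothetical collection: one metadata file plus 1812 images.
files = ["0000000metadata.csv"] + [f"{n:04d}.jpg" for n in range(1, 1813)]

processed = simulate_import(files, maxdocs=10)
images = [name for name in processed if name.endswith(".jpg")]

# The metadata file uses up one of the ten slots, leaving nine for images.
print(len(processed), len(images))
```

Under those assumptions the metadata file takes one of the ten slots, which matches the 9 images seen in practice.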

My thoughts are these:
- Make sure the metadata file is read before the images it describes. Either
--- have the filenames in your metadata.csv file refer to a child folder,
e.g. images/344.jpg,
- or
--- rename metadata.csv to 0000000metadata.csv so it is done first. (I did
this as it seemed simpler/safer than modifying my metadata.)
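
The ordering problem and the rename workaround can be illustrated with a small Python sketch. It assumes the import visits files in plain alphabetical (code-point) order, as observed above. The helper `processing_order` and the image names are hypothetical, and the real order may depend on the platform and filesystem.

```python
def processing_order(filenames):
    """Assumed import order: plain alphabetical (code-point) sort."""
    return sorted(filenames)


images = ["344.jpg", "345.jpg", "346.jpg"]  # hypothetical image names

# "m" sorts after the digits, so metadata.csv is read after the images.
before = processing_order(images + ["metadata.csv"])

# A run of leading zeros puts the metadata file ahead of these images.
after = processing_order(images + ["0000000metadata.csv"])

print(before)  # ['344.jpg', '345.jpg', '346.jpg', 'metadata.csv']
print(after)   # ['0000000metadata.csv', '344.jpg', '345.jpg', '346.jpg']
```

The rename only helps as long as no image name sorts ahead of the zero-prefixed file name, so it is worth checking against your own file names.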

If I want to build a subset, only import a subset - don't rely on
maxdocs with plugins that read many metadata records from one file. (I
bet this applies to ReferPlug and others.)
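
One way to prepare such a subset is sketched below in Python. It copies the first few metadata rows and the files they name into a separate folder that can then be imported on its own. Everything here is an assumption for illustration: the helper `make_subset`, the column name `Filename`, the presence of a header row, and file paths given relative to the folder that holds metadata.csv. Adjust these to your own file.

```python
import csv
import shutil
from itertools import islice
from pathlib import Path


def make_subset(src_dir, dest_dir, limit, filename_column="Filename"):
    """Copy the first `limit` metadata rows and their files to dest_dir.

    Assumptions: metadata.csv sits in src_dir, has a header row, and the
    column named by `filename_column` holds paths relative to src_dir.
    Returns the number of records written to the subset.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Read the header and only the first `limit` records.
    with open(src / "metadata.csv", newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(islice(reader, limit))

    kept = []
    for row in rows:
        source_file = src / row[filename_column]
        if not source_file.is_file():
            continue  # skip records whose file is missing
        target = dest / row[filename_column]
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)
        kept.append(row)

    # Write a metadata.csv that contains only the copied records.
    with open(dest / "metadata.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return len(kept)
```

For example, `make_subset("import", "import_subset", 10)` would leave ten records and their images in a hypothetical `import_subset` folder, so the metadata and the images stay in step without relying on maxdocs.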

- Possible TODO: modify MetadataCSVPlug.pm to respect the -maxdocs
flag (but how will it know which images?).

Those are my notes - please let me know what you think - I will put
this on the wiki in some form so others are not caught out like me
(Building 2000 files - time for a coffee)



Stephen De Gabrielle