Re: [greenstone-users] Importing metadata

From Don Gourley
Date Mon, 29 Sep 2003 09:27:34 -0400 (EDT)
Subject Re: [greenstone-users] Importing metadata
In-Reply-To (00cd01c38687$eb870ff0$2aac10ac-nemo)
Héctor Aracena M. said:
> Is it possible to import 7000 metadata records into a single collection or
> database in Greenstone?

Yes, and we are doing something similar. We have almost 20,000
metadata records in 25 collections, with over 10,000 in a single
large collection. That collection takes a long time to build,
but I don't think that is only a function of its size; it is also
related to the classifiers that represent the various relationships
between different kinds of records. Also, our metadata records are in
Dublin Core (encoded in HTML per RFC2731) so we had to write our
own plugin to import the records. Depending on how your records
are encoded you may be able to use an existing plugin; this will
work best if each record is in a separate document file.
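
For illustration only, here is a minimal Python sketch of pulling
Dublin Core elements out of HTML <meta> tags in the style RFC 2731
describes (name="DC.Element", content="value"). It is not the plugin
mentioned above (Greenstone plugins are Perl modules); the class name,
function name and sample record are invented for this example, and
qualifiers, schemes and languages are ignored.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> pairs from one HTML record."""

    def __init__(self):
        super().__init__()
        # Maps element name (e.g. "Title") to a list of values,
        # because Dublin Core elements are repeatable.
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Keep only names carrying the "DC." prefix (case-insensitive).
        if name.upper().startswith("DC.") and content is not None:
            element = name[3:]
            self.metadata.setdefault(element, []).append(content)


def extract_dublin_core(html_text):
    """Return a dict of Dublin Core elements found in html_text."""
    parser = DublinCoreParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Invented sample record, one record per document file.
SAMPLE_RECORD = """<html><head>
<meta name="DC.Title" content="An example record">
<meta name="DC.Creator" content="Doe, Jane">
<meta name="DC.Creator" content="Roe, Richard">
<meta name="keywords" content="not Dublin Core, so skipped">
</head><body></body></html>"""

if __name__ == "__main__":
    for element, values in extract_dublin_core(SAMPLE_RECORD).items():
        print(element, values)
```

A real import plugin would then hand each element/value pair to the
collection-building process as document metadata; that part depends on
Greenstone internals and is not shown here.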

> What I'm looking for is to have a repository of metadata that can work as an
> independent database

The result will not be, of course, as generally powerful as a
relational database with regard to querying. But if you know
in advance what kinds of searches you want then you should be
able to build appropriate indexes for information retrieval.
Real databases also have the advantage of being easier to update,
whereas with Greenstone you must rebuild the collection any time
you add, delete or change a record.

> and, if requested, able to link with the records referred
> in 856 fields.

I'm not sure what you mean here, but if you want to put links to
records in 856 fields you can certainly do that based on the
HASH identifier generated by Greenstone when a record (document)
is imported. The only problem is that if you ever change that
record, the identifier will change and the 856 link will need to
be updated. There has been some discussion on this list (or maybe
it was the developers' list?) on how one could fairly easily
modify the BasPlug Perl module to use your own never-changing IDs.
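
To make the identifier problem concrete, here is a small Python sketch
contrasting an ID derived from a record's content with an ID taken
from a field that never changes. This is not Greenstone's actual
hashing code, and it is not a BasPlug modification: the SHA-1 choice,
the function names and the "control_number" field are all assumptions
made for the illustration.

```python
import hashlib


def content_hash_id(record_text):
    """ID derived from the record's content.

    Illustrative only: any edit to the text yields a different ID,
    so links that embed the old ID stop resolving.
    """
    digest = hashlib.sha1(record_text.encode("utf-8")).hexdigest()
    return "HASH" + digest[:24]


def stable_id(record_fields, prefix="rec"):
    """ID built from a field that is assigned once and never edited.

    "control_number" is a hypothetical field name for this sketch.
    """
    return f"{prefix}-{record_fields['control_number']}"


original = {"control_number": "000123", "title": "Annual report 2002"}
edited = {"control_number": "000123", "title": "Annual Report, 2002"}

# The content-derived IDs differ once the title is corrected ...
ids_differ = content_hash_id(original["title"]) != content_hash_id(edited["title"])

# ... while the ID based on the control number is unaffected.
ids_match = stable_id(original) == stable_id(edited)

if __name__ == "__main__":
    print("content-derived IDs differ:", ids_differ)
    print("stable IDs match:", ids_match)
```

With a stable scheme like the second one, an 856 link written once
would keep pointing at the same record across rebuilds, provided the
chosen field really is never edited.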