Re: [Fwd: Re: [greenstone-users] mixing Greenstone with DSpace, citesser, berry picking, thesaurus, name authority records, collocation]

From Tod Olson
Date Mon, 06 Dec 2004 11:54:24 -0600 (CST)
Subject Re: [Fwd: Re: [greenstone-users] mixing Greenstone with DSpace, citesser, berry picking, thesaurus, name authority records, collocation]
In-Reply-To (41B37B41-4010708-cs-waikato-ac-nz)
>>>>> "CYH" == Chi-Yu Huang <chi@cs.waikato.ac.nz> writes:

CYH> Karen E. Medina wrote:

KM> Moving beyond Dublin Core -- Dublin Core is fantastic for
KM> homogenizing collections, and necessary for OAI, but for local
KM> use, Greenstone is flexible enough to create MARC-like records or
KM> even one's own metadata. All these are important. DSpace's
KM> provenance leaves a lot to be desired.
KM>
KM> Adding Annotation -- allowing users to add annotations to the
KM> documents in a Greenstone collection. This will probably be the
KM> simplest.

CYH> Greenstone uses GDBM (Gnu Database manager) to store metadata,
CYH> classifier information, and document structure information. You
CYH> can add new pieces of metadata to it. However, it would be
CYH> deleted whenever you rebuild the collections. So if you want to
CYH> define your own special annotations, the possible solutions would
CYH> be:
CYH> 1. Add annotations in the metadata.xml and put it into the
CYH> import directory.
CYH> 2. Save annotation details as a separate database file and
CYH> save that in the collection folder (say) not in the index folder.

CYH> Doing this would require either the extension of the existing
CYH> action in the GS run time code, or a new action. A new action
CYH> seems the better option at this stage.

Hi, Karen.

Thinking about how this applies to MARC-like structures, and leaving
annotations aside:

Greenstone metadata is effectively limited to field-value pairs, but
MARC-like structures are quite granular and somewhat hierarchical. If
your metadata is recorded in a MARC-like structure, you can often
flatten to field-value pairs for searching purposes with satisfactory
results. So if maintaining the MARC-like structures is a priority,
you could maintain a separate database of record for your metadata,
then automatically generate the metadata.xml file as a batch export.
Or insert the metadata into the GSAF files after the import step.
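
To make the batch export idea concrete, here is a rough Python sketch,
not a tested Greenstone tool. The record layout, the tag-to-field
mapping (FIELD_MAP), and the dc.* field names are all hypothetical, and
the element names follow the DirectoryMetadata layout of metadata.xml
as I understand it, so check them against your Greenstone version.

```python
import xml.etree.ElementTree as ET

# Hypothetical MARC-like record kept in the separate database of record:
# each field is (tag, [(subfield code, value), ...]).
record = {
    "file": "report01.pdf",
    "fields": [
        ("100", [("a", "Example, Author")]),
        ("245", [("a", "Sample title :"), ("b", "a subtitle")]),
        ("650", [("a", "Digital libraries"), ("x", "Metadata")]),
        ("650", [("a", "Cataloging")]),
    ],
}

# Assumed mapping from MARC-like tags to flat field names (illustrative only).
FIELD_MAP = {"100": "dc.Creator", "245": "dc.Title", "650": "dc.Subject"}


def flatten(rec):
    """Collapse each mapped field's subfields into one field-value pair."""
    pairs = []
    for tag, subfields in rec["fields"]:
        name = FIELD_MAP.get(tag)
        if name is None:
            continue  # unmapped tags stay only in the database of record
        pairs.append((name, " ".join(value for _, value in subfields)))
    return pairs


def build_metadata_xml(records):
    """Serialize flattened records in a metadata.xml-style layout."""
    root = ET.Element("DirectoryMetadata")
    for rec in records:
        fileset = ET.SubElement(root, "FileSet")
        # Greenstone may treat FileName as a pattern, so real file names
        # might need escaping; that is left out of this sketch.
        ET.SubElement(fileset, "FileName").text = rec["file"]
        description = ET.SubElement(fileset, "Description")
        for name, value in flatten(rec):
            # mode="accumulate" is meant to keep repeated fields
            # (here, the two subjects) rather than overwrite them.
            meta = ET.SubElement(description, "Metadata",
                                 name=name, mode="accumulate")
            meta.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_metadata_xml([record]))
```

The subfield structure is lost in the flattened output, which is the
point: the granular form stays in the database of record, and only the
search-friendly pairs go to Greenstone.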

The advantage to keeping the more granular metadata is that you can
reuse it later for other purposes. Maybe you have an OAI provider
agreement where you support DC for compliance, but certain harvesters
know to ask for your MARCXML or MODS metadata in addition.
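
On the harvester side, asking for a richer format comes down to a
different metadataPrefix on the OAI-PMH request. A small Python sketch,
assuming a hypothetical base URL and record identifier; oai_dc is the
prefix the protocol requires, while marc21 and mods are only commonly
used prefixes that a given provider may or may not offer:

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH endpoint, for illustration only.
BASE_URL = "http://example.org/oai"


def get_record_url(identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request URL for one record in one format."""
    params = {
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Same record, requested as plain DC and as the richer formats.
    for prefix in ("oai_dc", "marc21", "mods"):
        print(get_record_url("oai:example.org:item-1", prefix))
```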

So if you record the more granular stuff, you can repurpose it for use
in Greenstone, and have access to the more granular form for other
purposes. The question is whether it's worth the additional overhead.


Tod A. Olson <tod@uchicago.edu>
Sr. Programmer / Analyst
The University of Chicago Library

"How do you know I'm mad?" said Alice.
"If you weren't mad, you wouldn't have
come here," said the Cat.