Re: [greenstone-users] Exporting Greenstone metadata

From David Bainbridge
Date Thu, 26 Aug 2004 13:50:48 +1200
Subject Re: [greenstone-users] Exporting Greenstone metadata
In-Reply-To (412AC43D-5AFA4F39-cs-waikato-ac-nz)
On Tue, Aug 24, 2004 at 04:29:49PM +1200, Michael Dewsnip wrote:
> Dynal Patel wrote:
>
> > Hello
> >
> > How does Greenstone store its metadata? Is it possible to export the
> > metadata to other formats? How would one go about doing this?
> >
> > thanks
> >
> > Dynal Patel
> >

Dynal,

There are several possible answers to this, depending on what you
are trying to do. Could you please give a bit more context about
the work?

For example, in the import folder you can have metadata.xml files
that store metadata about individual documents and sets of documents.
GLI makes use of this mechanism to store the metadata assigned through
its interface.
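
As a rough illustration of reading such a file, here is a minimal
Python sketch. It assumes the usual metadata.xml layout of FileSet
elements, each holding FileName entries and a Description with
Metadata elements carrying a "name" attribute; the sample data and the
read_metadata_xml function are made up for the example, so check them
against one of your own files before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, assuming the common metadata.xml layout.
SAMPLE = """\
<DirectoryMetadata>
  <FileSet>
    <FileName>report.pdf</FileName>
    <Description>
      <Metadata name="Title">Annual Report</Metadata>
      <Metadata name="Creator">A. Author</Metadata>
      <Metadata name="Creator">B. Author</Metadata>
    </Description>
  </FileSet>
</DirectoryMetadata>
"""


def read_metadata_xml(xml_text):
    """Return {file name entry: {metadata name: [values, ...]}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for fileset in root.iter("FileSet"):
        # Collect every Metadata element in this FileSet, keeping repeats.
        values = {}
        for meta in fileset.iter("Metadata"):
            name = meta.get("name")
            values.setdefault(name, []).append((meta.text or "").strip())
        # Attach the collected values to each FileName in the FileSet.
        for filename in fileset.findall("FileName"):
            target = result.setdefault((filename.text or "").strip(), {})
            for name, vals in values.items():
                target.setdefault(name, []).extend(vals)
    return result


metadata = read_metadata_xml(SAMPLE)
for filename, fields in metadata.items():
    for name, vals in fields.items():
        print(filename, name, "; ".join(vals))
```

To run it on a real file, read the file's text and pass it to
read_metadata_xml in place of SAMPLE. From the resulting dictionary it
is a short step to CSV or whatever other format you need.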

In the archives folder, both manually assigned metadata and extracted
metadata are encoded in the Greenstone Archive files (along with any
extracted text). This might be another sensible place to access the
metadata, particularly if you want the extracted metadata too.
Recently we've added a 'saveas METS' option to import.pl, so instead
of generating our own Greenstone Archive format it will generate
METS-compliant data, which we hope will be more useful to a wider
range of applications. You would need the CVS checked-out version of
Greenstone to access this.

Finally, there is the GNU database (GDBM) used by the runtime system.
This stores all the metadata, plus other things such as the classifier
structure. For a collection called 'mycol' the database file
is mycol/index/text/mycol.ldb on Linux, Windows and other
little-endian machines, and mycol.bdb on big-endian machines
such as Solaris. After sourcing setup.bash (or running setup.bat on
Windows), running something along the lines of:

db2txt collect/mycol/index/text/mycol.ldb | less

will print the content of the database to the screen in a key: value
format. You may want to write a program that parses this simple
format to extract or convert the metadata you seek.
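
A minimal sketch of such a parser in Python follows. It rests on an
assumption about the dump layout: each record starts with its key in
square brackets, is followed by lines of the form <name>value, and is
closed by a line of dashes. Please compare that with what db2txt
actually prints for your collection and adjust accordingly. The sample
text and the parse_db2txt function are invented for the example.

```python
import re

# Hypothetical sample of db2txt output, under the layout assumed above.
SAMPLE = """\
[HASH0123]
<doctype>doc
<Title>Annual Report
<Creator>A. Author
----------------------------------------------------------------------
[CL1]
<doctype>classify
<Title>Title
<childtype>VList
----------------------------------------------------------------------
"""

# A field line looks like <name>value.
FIELD = re.compile(r"^<([^>]+)>(.*)$")


def parse_db2txt(lines):
    """Return {record key: {field name: [values, ...]}}."""
    records = {}
    key = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("-" * 10):
            # A line of dashes closes the current record.
            key = None
        elif key is None and line.startswith("[") and line.endswith("]"):
            # A bracketed line opens a new record.
            key = line[1:-1]
            records[key] = {}
        elif key is not None:
            match = FIELD.match(line)
            if match:
                records[key].setdefault(match.group(1), []).append(match.group(2))
    return records


records = parse_db2txt(SAMPLE.splitlines())
# Keep only document records, skipping classifier nodes and the like.
docs = {k: v for k, v in records.items() if v.get("doctype") == ["doc"]}
for key, fields in docs.items():
    print(key, fields.get("Title"))
```

To use it on real data, pipe the db2txt output into a script that
calls parse_db2txt(sys.stdin) instead of using SAMPLE. Filtering on
the doctype field is only one way of separating documents from the
classifier entries.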

David.


> > ---------------------------------------
> > African Digital Library Centre
> > Room 300, 3rd Floor
> > Computer Science Building,
> > University of Cape Town
> > Cape Town
> > 7700
> > RSA
> > ---------------------------------------
> > Phone: +27 21 650 2670
> > Fax: +27 21 689 9465
> > Email: dpatel@cs.uct.ac.za
> > Website: http://greenstone.cs.uct.ac.za
> > ---------------------------------------