[greenstone-users] Creating DL from existing MS Access Database - Newbie questions

From Michael Silver
Date Thu, 10 Aug 2006 12:33:51 -0600
Subject [greenstone-users] Creating DL from existing MS Access Database - Newbie questions

This is a very newbie-esque cry for help. Any pointers or explanations are
welcome and appreciated.

First, let me say that I have read most of the Witten and Bainbridge book
"How to Build a Digital Library", but my situation doesn't fit well into
any of the scenarios they present. I am mainly a networking guy who pokes
around at things until they work, so I can't say I'm an expert in any of
the technologies involved (XML, Access, Refer, etc.).

What I have is a collection of records in a relational database. The
database is natively stored in Paradox format in a fairly old and less than
wonderful application custom developed for the library. I usually produce a
catalogue by importing those records into MS Access, and then using Access
to format and print.

What I want is for each record in the database to become a unique record in
Greenstone. The metadata is all in the database, but I'm not sure what the
best way to get it into GSDL is going to be.

Currently, each item record contains a unique CatID that acts as a primary
key. The remainder of the record includes links to various authority-type
tables (series, producer, distributor) as well as information fields stored
entirely within the record (release dates, annotations, etc). The big catch
is a separate table that associates CatID with SubjectIDs (SubjectID links
to another authority-type table with the text of the subject). Each CatID
can appear in that table multiple times, once for each subject heading the
title should appear under.
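
To make the table layout concrete, here is a minimal sketch in Python of how the CatID/SubjectID link table could be folded into one record per CatID that carries a list of subject headings. The sample rows, the Title field and the function name attach_subjects are all made up for illustration; only CatID and SubjectID come from the description above.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the tables described above.
items = [
    {"CatID": 101, "Title": "Rivers of Alberta"},
    {"CatID": 102, "Title": "Prairie Weather"},
]
subjects = {1: "Geography", 2: "Science", 3: "Weather"}  # SubjectID -> text
item_subjects = [(101, 1), (101, 2), (102, 2), (102, 3)]  # (CatID, SubjectID)


def attach_subjects(items, subjects, item_subjects):
    """Return one dict per item, with all of its subject headings in a list."""
    by_cat = defaultdict(list)
    for cat_id, subject_id in item_subjects:
        by_cat[cat_id].append(subjects[subject_id])
    # Items with no rows in the link table get an empty Subjects list.
    return [dict(item, Subjects=by_cat[item["CatID"]]) for item in items]


records = attach_subjects(items, subjects, item_subjects)
for record in records:
    print(record["CatID"], record["Title"], record["Subjects"])
```

The same grouping could presumably be done with a query inside Access; the point is only that each CatID ends up as a single record with a repeatable subject field.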

From what I've read, it looks like the best approach is going to be to use
MS Access to produce a report that exports the database in another format
(XML, Refer, Dublin Core, etc.) and then use the appropriate plugin to import
the information into GSDL.
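
As a rough illustration of the export step, the sketch below writes records out as XML with one Subject element per heading, using Python's standard xml.etree.ElementTree module. The element names (Catalogue, Item, Title, Subject) and the sample record are assumptions chosen for the example; this is not the layout any particular GSDL plugin is known to expect, so the output would still need to be mapped to whatever the chosen plugin reads.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Build an XML tree with one Item per record and one Subject per heading."""
    root = ET.Element("Catalogue")  # hypothetical root element name
    for record in records:
        item = ET.SubElement(root, "Item", CatID=str(record["CatID"]))
        ET.SubElement(item, "Title").text = record["Title"]
        # A repeated element is how multiple subjects are represented here.
        for subject in record["Subjects"]:
            ET.SubElement(item, "Subject").text = subject
    return root


# Hypothetical sample record, not real catalogue data.
sample = [
    {"CatID": 101, "Title": "Rivers of Alberta", "Subjects": ["Geography", "Science"]},
]
root = records_to_xml(sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Repeating an element for each subject keeps the record count equal to the number of CatIDs, which is the one-record-per-item outcome described earlier.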

Given that I need multiple Subject entries, I'm confused as to which format
would be best, given my limited experience with transferring records in
this manner. It might be that MARC would be the best format if I knew it
well enough to handle the tags, indicators, subfields, etc., but I don't
think I'm up to trying to produce a properly formatted file. Refer looks
pretty straightforward, but it doesn't appear to take multiple subjects. And
so my confusion grows...

One of the questions I have is how strict GSDL is about following format
guidelines. Would it accept multiple instances of a subject in Refer, or
would it treat that as an unaccepted practice and produce unpredictable
results? Or, for that matter, predictable but unwanted results? :-)

The end goal is to put this up as a searchable resource on the web for our
user base, and to provide CDs with installable copies. At a bare
minimum, we want it searchable/browsable by title, series, and subject. If
it's not beyond me, I would also like full-text searching of annotations,
and advanced fielded searching including release date and grade level.

Suggestions are greatly appreciated, either on list or directly to me at

Thank you very much,

Michael Silver, Network Administrator
Parkland Regional Library
5404 56 Avenue Lacombe, AB T4L 1G1
Phone: 403.782.3850 Fax: 403.782.4650