[greenstone-users] Project help requested

From Michael Silver
Date Thu, 17 Aug 2006 17:25:41 -0600
Subject [greenstone-users] Project help requested

I emailed once before with some newbie questions, and I'm still wearing the
newbie dunce hat. I usually like exploring things on my own, even if it
takes me a little longer, because I learn the system better that way.

Unfortunately, I've got a deadline for this project that is looming, and I'm
very concerned that I'm not going to make it. I severely underestimated how
long it would take for me to do this, and am now looking at needing to
finish in 2 weeks. As in August 31. This deadline is why I'm breaking down
and posting this plea for help.

My apologies in advance to anyone offended or bothered by this post. I do
know that it is better to post specific problems with details than a cry for
help like this, but I'm at the point where I'm not sure I'm choosing the
correct path from square one in terms of data format.

I'm looking for some help. I would be happy to accept help from the group at
large, but I don't want to burden folks with the inevitable series of
questions to the list. (I have read the tutorials, and the book, but I
haven't had as much time working with the actual software.) If sending to
the group isn't a problem, I can do that, but I was thinking I might be able
to get a person that would be willing to help me with this project.

I'm also open to the idea of a consultant to help with this, depending on
the cost. If you are such a person or company, please feel free to contact
me to provide an estimate.

Below is the general challenge.

The desired end result doesn't seem that hard. I need a searchable catalog
of items, very similar to a bibliographic collection. In fact, what
convinced me that GSDL was the choice was the ease with which I was able to
build collections from MARC records. The collection has just under 5,000
records in it. Current access by users is a paper catalog with subject and
series indices pointing to an alphabetical title list with associated
information including an annotation, dates, grade level, etc. Hopefully, the
end result will allow full-text searching of annotations, keyword searches
of series and subject, possibly limiting by grade level, as well as browse
searching of series and subjects. If that's biting off more than is
possible, minimum needs are browse searching of subjects, series, and title.
Ultimately, it would be nice to be able to provide advanced options, such as
limiting searches by date of production or date of acquisition, but....

Creating that collection from a MARC file would seem to be simple.
Unfortunately, the records are in a relational database. My database skills
are limited, and I think the big problem here is getting the information out
of the existing database files in a format that can be used by Greenstone.
My attempts either result in a collection with no items, or a collection
with one file. I've tried using CSV and XML formats, but I'm obviously
missing something.

The database files are:

Catalog - contains most of the information, including a primary key of
CatID. Series, Producer, and Distributor link to lookup tables, but Series
is the most important of those.

Series, Distributor, and Producer tables each contain a list of ID and
Description pairs. The ID is the primary key, and matches the entries in the
Catalog table.

CatSub - contains a list of CatID and SubID (Subject ID) pairs. There is no
primary key - the table keys off the combination of the two. Each item can
have multiple subjects.

Subjects - contains a list of SubID and Description pairs. The SubID is the
primary key, and it links to the SubID in the CatSub table.
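
The table layout above could be flattened into one record per catalog item
before any import is attempted. What follows is only a minimal Python sketch
under stated assumptions: each Paradox table has been exported to CSV with a
header row, and every column name other than CatID, SubID, and Description
(for example, a "Series" column in Catalog holding the series ID) is a guess.
The function names are hypothetical, and the sketch does not settle which
input format Greenstone will accept.

```python
import csv
from collections import defaultdict


def load_lookup(path, id_col, desc_col="Description"):
    """Read an ID/Description lookup table (CSV with header) into a dict.

    The column names are assumptions about how the export is labeled.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return {row[id_col]: row[desc_col] for row in csv.DictReader(f)}


def flatten_catalog(catalog_rows, series, subjects, catsub_rows):
    """Return one flat dict per catalog item with lookups resolved.

    catalog_rows: list of dicts, each with "CatID" and (assumed) "Series"
    series:       dict mapping series ID -> series description
    subjects:     dict mapping SubID -> subject description
    catsub_rows:  list of dicts with "CatID" and "SubID"
    """
    # Collect every subject description attached to each catalog item.
    subjects_by_item = defaultdict(list)
    for link in catsub_rows:
        description = subjects.get(link["SubID"])
        if description is not None:
            subjects_by_item[link["CatID"]].append(description)

    flat = []
    for row in catalog_rows:
        record = dict(row)
        # Replace the series ID with its description ("" if none or unknown).
        record["Series"] = series.get(row.get("Series", ""), "")
        # An item can have several subjects, so join them into one field.
        record["Subjects"] = "; ".join(subjects_by_item.get(row["CatID"], []))
        flat.append(record)
    return flat
```

With the lookups loaded by something like load_lookup("Series.csv", "ID")
(the file name is hypothetical), the resulting list of records could then be
written out with csv.DictWriter or converted to whatever format the import
ends up needing.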

The database files are in a Paradox database. I have attempted to get DBPlug
working, but I'm obviously missing something in the database connection

In the past, the paper catalog has been produced by exporting the data from
Paradox using CSV files, and importing into MS Access, and manipulating from

That is where I am. Any assistance, pointers, estimates, etc. are
appreciated. Unless it's obviously the will of the group, I won't do another
mass posting like this. Replies can be directed to me at msilver@prl.ab.ca.

Thank you for your patience in reading this far, and for any help.


Michael Silver, Network Administrator
Parkland Regional Library
5404 56 Avenue Lacombe, AB T4L 1G1
Phone: 403.782.3850 Fax: 403.782.4650