Re: [greenstone-users] Project help requested

From Michael Dewsnip
Date Fri, 18 Aug 2006 12:39:05 +1200
Subject Re: [greenstone-users] Project help requested
In-Reply-To (021c01c6c254$74da41e0$5c016c0a-hq-prl-ab-ca)
Hi Michael,

I have to start by saying that I'm jealous of your location: I have many
fond memories of Alberta, Banff, Waterton Lakes, camping in the Rockies,
bears... :-)

Also, you don't need to apologise for sending a more general post to the
list. Although you're probably less likely to get a reply, these types of
posts certainly aren't frowned upon, and in fact they are often very
useful to other people.

I've just been working on a project that is quite similar to what you
are trying to do. The collection is based on a big survey of campus
architecture in the US, and all the information (except for the images)
is stored in a Microsoft Access database, with a dozen or so tables. The
collection in its current (not quite finished) state is available at

Briefly, the collection works like this:

- The Microsoft Access database is set up to be an ODBC source

- The Perl DBI stuff had to be installed (what OS are you using?)

- A new Greenstone plugin was written that uses Perl DBI to query the
database (perhaps DBPlug could be used, but I've never understood how to
use it!)

- Greenstone objects are created for each of these four things:
institutions, places, designers, and references

- The appropriate metadata is retrieved from the database using SQL
statements and added to these objects

- The Greenstone format statements and macrofiles were heavily
customised to display the different objects and link between the pages

- Custom JavaScript was written for the Advanced Search pages
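
To give a rough idea of the "query the database and attach metadata" steps
above, here is a small sketch. It is not the plugin itself (that is Perl,
using DBI against an ODBC source); it is written in Python with an in-memory
SQLite database standing in for the real one, purely so it can be run
anywhere. The table names follow the ones described in the quoted message
(Catalog, Series, CatSub, Subjects); the Title column, the sample rows, and
the function names are made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the real database (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Series   (ID INTEGER PRIMARY KEY, Description TEXT);
        CREATE TABLE Subjects (SubID INTEGER PRIMARY KEY, Description TEXT);
        CREATE TABLE Catalog  (CatID INTEGER PRIMARY KEY, Title TEXT, Series INTEGER);
        CREATE TABLE CatSub   (CatID INTEGER, SubID INTEGER);
        INSERT INTO Series   VALUES (1, 'Nature Series');
        INSERT INTO Subjects VALUES (10, 'Bears'), (11, 'Mountains');
        INSERT INTO Catalog  VALUES (100, 'Grizzly Country', 1),
                                    (101, 'Untitled Reel', NULL);
        INSERT INTO CatSub   VALUES (100, 10), (100, 11);
    """)
    return conn


def fetch_records(conn):
    """Return one metadata dict per catalog item, with lookups resolved."""
    records = {}

    # Single-valued metadata: resolve the Series ID through its lookup table.
    # LEFT JOIN keeps items that have no series at all.
    item_rows = conn.execute("""
        SELECT c.CatID, c.Title, s.Description
        FROM Catalog c
        LEFT JOIN Series s ON s.ID = c.Series
        ORDER BY c.CatID
    """)
    for cat_id, title, series in item_rows:
        records[cat_id] = {"Title": title, "Series": series, "Subject": []}

    # Multi-valued metadata: each item can have several subjects via CatSub.
    subject_rows = conn.execute("""
        SELECT cs.CatID, sub.Description
        FROM CatSub cs
        JOIN Subjects sub ON sub.SubID = cs.SubID
        ORDER BY cs.CatID, sub.Description
    """)
    for cat_id, subject in subject_rows:
        if cat_id in records:
            records[cat_id]["Subject"].append(subject)

    return records


if __name__ == "__main__":
    for cat_id, metadata in fetch_records(build_demo_db()).items():
        print(cat_id, metadata)
```

Each dict here plays the role of one Greenstone object's metadata; in a real
plugin the same two queries would run through DBI and the values would be
added to the document object instead of a dict.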

Your situation is a bit simpler, since you have fewer tables and fields,
and less scope for customising the way the pages look. However, I think
you're still going to have to get your hands dirty in a Greenstone plugin
(how is your knowledge of Perl?).

Before I forget, there is a new CSVPlug for Greenstone
that will process CSV files. I don't think it will let you link things
together as you need, however.
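
One possible way around the linking problem, sketched below, is to resolve
the lookups yourself before the data gets anywhere near Greenstone, so that
each row of a single CSV file already carries the series name and all of the
subject names. This Python sketch assumes each table has been exported to
CSV with a header row; the column names, the "|" separator, and the function
names are assumptions for illustration, and I have not checked how CSVPlug
treats a field holding several values, so that part would need testing.

```python
import csv
import io


def read_lookup(text, key_col, value_col):
    """Turn a two-column lookup table (CSV text) into a dict of key -> value."""
    return {row[key_col]: row[value_col] for row in csv.DictReader(io.StringIO(text))}


def flatten(catalog_csv, series_csv, subjects_csv, catsub_csv, sep="|"):
    """Return CSV text with one row per catalog item and all lookups resolved."""
    series = read_lookup(series_csv, "ID", "Description")
    subjects = read_lookup(subjects_csv, "SubID", "Description")

    # Collect the subject names for each item from the CatID/SubID pairs.
    subjects_by_item = {}
    for row in csv.DictReader(io.StringIO(catsub_csv)):
        name = subjects.get(row["SubID"])
        if name is not None:
            subjects_by_item.setdefault(row["CatID"], []).append(name)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["CatID", "Title", "Series", "Subject"])
    for row in csv.DictReader(io.StringIO(catalog_csv)):
        writer.writerow([
            row["CatID"],
            row["Title"],
            series.get(row["Series"], ""),  # empty if the item has no series
            sep.join(subjects_by_item.get(row["CatID"], [])),
        ])
    return out.getvalue()


if __name__ == "__main__":
    # Invented sample exports, just to show the shape of the result.
    print(flatten(
        "CatID,Title,Series\n100,Grizzly Country,1\n101,Untitled Reel,\n",
        "ID,Description\n1,Nature Series\n",
        "SubID,Description\n10,Bears\n11,Mountains\n",
        "CatID,SubID\n100,10\n100,11\n",
    ))
```

The trade-off is that the flattened file has to be regenerated whenever the
database changes, whereas a plugin that queries the database directly always
sees the current data.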

Regarding contracting this to someone else, you might like to get in
touch with DL Consulting Ltd., who are very experienced with Greenstone.
I work for them half-time.

You might be in for a busy fortnight :-)

All the best,


Michael Silver wrote:

>I emailed once before with some newbie questions, and I'm still wearing the
>newbie dunce hat. I usually like exploring things on my own, even if it
>takes me a little longer because I learn the system better that way.
>Unfortunately, I've got a deadline for this project that is looming, and I'm
>very concerned that I'm not going to make it. I severely underestimated how
>long it would take for me to do this, and am now looking at needing to
>finish in 2 weeks. As in August 31. This deadline is why I'm breaking down
>and posting this plea for help.
>My apologies in advance to anyone offended or bothered by this post. I do
>know that it is better to post specific problems with details than a cry for
>help like this, but I'm at the point where I'm not sure I'm choosing the
>correct path from square one in terms of data format.
>I'm looking for some help. I would be happy to accept help from the group at
>large, but I don't want to burden folks with the inevitable series of
>questions to the list. (I have read the tutorials, and the book, but I
>haven't had as much time working with the actual software.) If sending to
>the group isn't a problem, I can do that, but I was thinking I might be able
>to get a person that would be willing to help me with this project.
>I'm also open to the idea of a consultant to help with this, depending on
>the cost. If you are such a person or company, please feel free to contact
>me to provide an estimate.
>Below is the general challenge.
>The desired end result doesn't seem that hard. I need a searchable catalog
>of items, very similar to a bibliographic collection. In fact, what
>convinced me that GSDL was the choice was the ease with which I was able to
>build collections from MARC records. The collection has just under 5,000
>records in it. Current access by users is a paper catalog with subject and
>series indices pointing to an alphabetical title list with associated
>information including an annotation, dates, grade level, etc. Hopefully, the
>end result will allow full-text searching of annotations, keyword searches
>of series and subject, possibly limiting by grade level, as well as browse
>searching of series and subjects. If that's biting off more than is
>possible, minimum needs are browse searching of subjects, series, and title.
>Ultimately, it would be nice to be able to provide advanced options, such as
>limiting searches by date of production or date of acquisition, but....
>Creating that collection from a MARC file would seem to be simple.
>Unfortunately, the records are in a relational database. My database skills
>are limited, and I think the big problem here is getting the information out
>of the existing database files in a format that can be used by Greenstone.
>My attempts either result in a collection with no items, or a collection
>with one file. I've tried using CSV and XML formats, but I'm obviously
>missing something.
>The database files are:
>Catalog - contains most of the information, including a primary key of
>CatID. Series, Producer, Distributer link to lookup tables, but Series is
>the most important of those.
>Series, Distributor, and Producer tables each contain a list of ID and
>Description pairs. The ID is the primary key, and matches the entries in the
>Catalog table.
>CatSub - contains a list of CatID and SubID (Subject ID) pairs. There is no
>primary key - the table keys off the combination of the two. Each item can
>have multiple subjects.
>Subjects contains a list of SubID and Description pairs. The SubID is the
>primary key, and it links to the SubID in the CatSub table.
>The database files are in a Paradox database. I have attempted to get DBPlug
>working, but I'm obviously missing something in the database connection
>In the past, the paper catalog has been produced by exporting the data from
>Paradox using CSV files, and importing into MS Access, and manipulating from
>That is where I am at. Any assistance, pointers, estimates, etc. are
>appreciated. Unless it's obviously the will of the group, I won't do another
>mass posting like this. Replies can be directed to me at
>Thank you for your patience in reading this far, and for any help.
>Michael Silver, Network Administrator
>Parkland Regional Library
>5404 56 Avenue Lacombe, AB T4L 1G1
>Phone: 403.782.3850 Fax: 403.782.4650