Re: [greenstone-users] Length of a classifier list

From Jonathan Gorman
Date Mon, 5 Dec 2005 09:00:52 -0600 (Central Standard Time)
Subject Re: [greenstone-users] Length of a classifier list
In-Reply-To (43938FD1-3080705-dlconsulting-co-nz)
> I'm not sure if you have definitions for your glossary terms, whether you
> intend the definitions to be displayed in the classifier or for the
> classifier to link to the definitions.

In an ideal world, both. Imagine that a longer, encyclopedia-style article
is possible for the various cooking terms, like the Maillard reaction
(or even subparts like glycation). The classifier could display something
like a "short description" in the classifier list.

> What is the benefit of having a classifier for cooking terms? Would it be
> simpler to have a static page, or perhaps a number of static pages and simply
> link to them?

Well, there are two problems with this. One is that it is not very
scalable. It might be fine for a small collection, but what if our
collection grows? A quick check of "On Food and Cooking" yields an index
of around 45 pages. Granted, some of these are people and locations. But
if I decided to treat ingredients and cooking equipment in a similar
manner, I'd quickly be approaching a large HTML file to maintain. Also,
I'd need another indexer for that, or a similar hack in Greenstone.
Finally, it doesn't give me the ability to link to multiple resources in
an organized manner like the classifiers do.

The second problem is one of automation. It doesn't quite seem to make
sense that, if I don't want to maintain a controlled hierarchy, I have to
maintain a separate webpage instead. I'd prefer just spending the
effort to create a hierarchy.

> If however you do want to go this way, you might find
> parsing the gdbm database for your collection easier than parsing the
> classifier html pages by browsing the collection directly.

I obviously wasn't thinking. For some reason I was thinking that the
classifiers were somehow statically generated, but it makes far more
sense to use the gdbm files for the classifiers as well as for text
searching. I did a little searching a while ago for Perl modules
similar to the Berkeley ones, but haven't had a huge amount of luck.
I'm sure I'll find one though, now that I have more
motivation to look ;).

> Find the *.ldb
> file in your collection's index/text directory, and use the db2txt utility
> provided with greenstone to convert it to text form. You can do something
> like
> db2txt recipes.ldb > recipe_database.txt

Or I suppose I could just process the text output.
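
A minimal sketch of what processing that text output might look like, in
Python. It assumes the db2txt dump consists of records that start with a
key in square brackets (for example [CL1]), followed by <field>value lines,
and end with a line of hyphens. That layout, the "doctype" field, and the
function names here are assumptions for illustration, so check them against
an actual dump before relying on this.

```python
import re

# Assumed record layout in the db2txt dump (verify against a real dump):
#   [KEY]
#   <field>value
#   ...
#   ----------------------------------------------------------------------
SEPARATOR = re.compile(r"^-{10,}\s*$")
KEY_LINE = re.compile(r"^\[(.+)\]\s*$")
FIELD_LINE = re.compile(r"^<([^>]+)>(.*)$")


def parse_db2txt(text):
    """Parse dump text into a dict mapping record key -> dict of fields."""
    records = {}
    key = None
    fields = {}
    for line in text.splitlines():
        if SEPARATOR.match(line):
            # A separator closes the current record, if one is open.
            if key is not None:
                records[key] = fields
            key, fields = None, {}
            continue
        key_match = KEY_LINE.match(line)
        if key_match and key is None:
            key = key_match.group(1)
            continue
        field_match = FIELD_LINE.match(line)
        if field_match and key is not None:
            fields[field_match.group(1)] = field_match.group(2)
    # Keep a trailing record that has no closing separator.
    if key is not None:
        records[key] = fields
    return records


def classifier_nodes(records):
    """Return only records whose (assumed) doctype field is 'classify'."""
    return {k: v for k, v in records.items() if v.get("doctype") == "classify"}


def load_db2txt(path):
    """Read a dump file such as recipe_database.txt and parse it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return parse_db2txt(handle.read())
```

Calling load_db2txt("recipe_database.txt") and then classifier_nodes() on
the result would give the classifier entries without scraping the HTML
pages, provided the dump really follows the layout assumed above.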

It doesn't look like I'll avoid having to build the collection. Ah well,
it doesn't take too long...yet.

Thanks for the help.

Jon Gorman