[greenstone-devel] Using Greenstone for XML Documents

From Doug Black
Date Sat, 13 Sep 2003 18:30:37 -0400
Subject [greenstone-devel] Using Greenstone for XML Documents

I've been exploring Greenstone for use with collections of SGML and XML documents. One set is TEI XML and another is a proprietary SGML document type that generally follows a book structure; it could be converted to XML reasonably easily. I have two basic needs, and I am first seeking general answers as to whether Greenstone is a feasible tool.

1. First, will Greenstone handle XML documents with relative ease? I see there is an auxiliary XML plugin, but I haven't been able to understand it yet. As part of this question: can Greenstone search for data within any particular element, and format any particular element? We also need to search at the paragraph level, which generally seems possible by mapping paragraphs to GS <section>s, but do I really have to embed commented-out <section> tags throughout the document? Is there a more generic way of mapping <p> elements in the import documents to GS <section> elements?

2. Each of these collections is indexed with terms from a thesaurus, using embedded elements in the XML/SGML. Is searching with a thesaurus feasible with Greenstone?

Thanks,

Doug

Doug Black
West Rock Visions
137 Alden Avenue
New Haven, CT 06515
Voice and Fax: (203) 389-0184
doug@westrockvisions.com