Date: Tue, 16 Sep 2003 18:01:57 +1200
Subject: Re: [greenstone-devel] Using Greenstone for XML Documents
I'm afraid the answer to both of your questions is "well, not really...". Let me expand on each in turn:
1. During the import process, Greenstone converts documents into its internal format: GML (Greenstone Markup Language), an XML-based but very simple and static format. If you have installed Greenstone, you might like to look at one of the "doc.xml" files in the archives folder of the "demo" collection to see what I mean. Because the structure of these documents is so simple, it could be difficult to map your XML files into this format while keeping the information and structure you need. If your documents are highly structured, for example, it is not clear how you would be able to search for data within any particular element. You may be better off looking at a more general XML retrieval system (Lucene is one, I believe).
Our next-generation software, Greenstone 3, changes all this, being completely XML-based. In fact, I believe that a TEI demonstration collection has already been built with it. Our first Greenstone 3 release is planned for October 31st.
2. We don't have those facilities at the moment, but we actually have a Masters student currently doing a project on that exact topic! However, the project won't finish for a few months yet.
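To make the concern in point 1 a bit more concrete, here is a rough Python sketch of what gets lost when a structured document is collapsed into flat text. Everything in it is an assumption made for illustration: the sample document, the element and attribute names, and the helper functions are all hypothetical, and it does not reproduce Greenstone's actual import code or the real layout of a "doc.xml" file. It only shows why element-scoped searching needs the original structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured source document (not a real Greenstone file).
SOURCE = """
<play>
  <title>Example Play</title>
  <act n="1">
    <speech speaker="ALICE">Where is the letter?</speech>
    <speech speaker="BOB">The letter is lost.</speech>
  </act>
</play>
"""


def flatten(xml_text):
    """Collapse a document into one normalised text string (structure is discarded)."""
    root = ET.fromstring(xml_text)
    return " ".join(" ".join(root.itertext()).split())


def search_flat(flat_text, term):
    """Full-text search on the flattened form: only says whether the term occurs."""
    return term.lower() in flat_text.lower()


def search_in_element(xml_text, tag, attr, value, term):
    """Element-scoped search: only possible while the original structure is kept."""
    root = ET.fromstring(xml_text)
    return [
        el.text
        for el in root.iter(tag)
        if el.get(attr) == value and term.lower() in (el.text or "").lower()
    ]


flat = flatten(SOURCE)
print(flat)
print(search_flat(flat, "letter"))
print(search_in_element(SOURCE, "speech", "speaker", "BOB", "letter"))
```

Once the text has been flattened, a query such as "speeches by BOB that mention the letter" can no longer be answered, because the speaker information was never carried across.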
Sorry I couldn't be more helpful. I think in three months' time we will have something a lot closer to what you are after, but obviously this doesn't help you much now.
Doug Black wrote: