Re: [greenstone-users] indexers MG and MGPP

From Katherine Don
Date Fri, 04 Feb 2005 15:08:46 +1300
Subject Re: [greenstone-users] indexers MG and MGPP
In-Reply-To (Pine-SGI-4-10-10502031656120-715324-100000-alexia-lis-uiuc-edu)
Hi Karen

>
>>a *, like libr*. (Note this is not available with MG).
>
> I'm slightly confused. I thought MG did truncation too.
>
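MG supports stemming, but not word truncation. They behave slightly
differently: stemming matches words that share a stem with the query
term, while truncation matches any word that begins with the given
prefix.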
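
A toy illustration of that difference, in Python. This is only a sketch and
not how MG or MGPP are implemented: toy_stem is a made-up, deliberately crude
suffix stripper, and the vocabulary is invented for the example.

```python
def toy_stem(word):
    """Crude suffix stripper for illustration only (NOT MG's real stemmer)."""
    word = word.lower()
    for suffix, replacement in (("ies", "y"), ("s", "")):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


def stem_match(query, vocabulary):
    """Stemming: match words whose stem equals the stem of the query."""
    target = toy_stem(query)
    return [w for w in vocabulary if toy_stem(w) == target]


def truncation_match(pattern, vocabulary):
    """Truncation: 'libr*' matches every word starting with 'libr'."""
    prefix = pattern.rstrip("*").lower()
    return [w for w in vocabulary if w.lower().startswith(prefix)]


# Invented vocabulary, standing in for the terms of an index.
vocabulary = ["library", "libraries", "librarian", "librarians",
              "libretto", "liberty"]

stemmed = stem_match("libraries", vocabulary)
truncated = truncation_match("libr*", vocabulary)

print(stemmed)    # ['library', 'libraries']
print(truncated)  # ['library', 'libraries', 'librarian', 'librarians', 'libretto']
```

With this toy stemmer, a stemmed search for "libraries" finds "library" but
not "librarian" or "libretto", whereas libr* finds all of them.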

> 2nd question: Is Lucene being added so that there can be incremental adds
> to the index?


Yes, eventually. To start with, we just want to get it working; then
we'll look at making the building incremental.


> And I'm kind of curious how Lucene does this. Partly because
> I was thinking that a Greenstone collection could effect an incremental
> add if the new documents were indexed separately and then the retrieval
> system was told to look in both indexes, kind of like cross-collection
> searches, just with the second collection being the new documents.

One suggestion we make for people who have large and growing
collections is to split them into two: a main collection, which is
static, and a small collection, which new documents are added to. The
small collection must be rebuilt each time new documents are added, so
it's not really incremental, but it would be faster than rebuilding the
entire collection.
I don't think you would want to create a new index every time you added
a document - the number of indexes could get quite large.
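
A rough sketch of the two-collection idea, in Python. It uses toy in-memory
inverted indexes rather than anything from Greenstone, MG or Lucene; the names
build_index and search, and the sample documents, are all invented for the
example.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(indexes, term):
    """Look the term up in every index and merge the hits."""
    hits = set()
    for index in indexes:
        hits |= index.get(term.lower(), set())
    return sorted(hits)


# Large static collection: built once and then left alone.
main_docs = {"m1": "digital library software", "m2": "full text indexing"}
main_index = build_index(main_docs)

# Small collection that receives the new documents.
new_docs = {"n1": "incremental indexing"}
new_index = build_index(new_docs)

# Adding a document means rebuilding only the small collection.
new_docs["n2"] = "library search"
new_index = build_index(new_docs)

library_hits = search([main_index, new_index], "library")
indexing_hits = search([main_index, new_index], "indexing")

print(library_hits)   # ['m1', 'n2']
print(indexing_hits)  # ['m2', 'n1']
```

The search always consults exactly two indexes, however many documents have
been added, which is the point of keeping one small collection instead of
creating a new index per document.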

Regards,
Katherine