Re: [greenstone-users] Searching for chinese characters ("words")

From: Katherine Don
Date: Mon, 26 Jul 2004 09:43:30 +1200
Subject: Re: [greenstone-users] Searching for chinese characters ("words")
In-Reply-To: (001601c470a8$55a5ea90$7305f6da-suzhou)

Hi Wolfgang

If you are using Greenstone 2.50 or later, you can add an option to the
collection config file (gsdl/collect/<collname>/etc/collect.cfg):
add in the line

separate_cjk true

This will separate Chinese characters by spaces for indexing and
querying. Unfortunately, this option has not made it into the GLI yet,
so if you are using the GLI to build your collection, you will need to
close the collection, add the option to the config file by hand, then
reopen the collection in GLI. You won't be able to see the option in the
GLI, but it will be there.
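
If you would rather script the hand edit, here is a minimal sketch in
Python. It is not part of Greenstone: the function name
ensure_separate_cjk is made up for illustration, and it assumes the
config file is UTF-8 text with one option per line. Close the collection
in GLI before running it.

```python
from pathlib import Path


def ensure_separate_cjk(cfg_path):
    """Append 'separate_cjk true' to a collect.cfg unless the option is set.

    Hypothetical helper, not a Greenstone API. Assumes UTF-8 text with
    one option per line. Returns True if the file was changed.
    """
    path = Path(cfg_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "separate_cjk":
            return False  # option already present; leave the file untouched
    lines.append("separate_cjk true")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


# Usage (replace <collname> with your collection's directory name):
# ensure_separate_cjk("gsdl/collect/<collname>/etc/collect.cfg")
```

The sketch leaves an existing separate_cjk line alone, whatever its
value, so check the file by hand if the option was already there.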

(Note that while the option has cjk in its name, it only works for
Chinese characters at the moment.)

Katherine Don

Wolfgang Scheuing wrote:
> Dear all,
> I want to use Greenstone to search for content in my documents, which are in
> German and Chinese. I am not very familiar with Greenstone. Searching for
> German content is no problem. Searching for Chinese content is difficult,
> because in some texts there are no spaces between Chinese characters and
> Greenstone takes the whole paragraph as a "word". How can I "force"
> Greenstone to index single Chinese characters, even if there is no spacing?
> If there is some information about this topic, please give me the link so
> that I can read up on it myself.
> Thanks in advance!
> Wolfgang