Re: mgpp collection in v2.39

From Katherine Don
Date Thu, 27 Mar 2003 09:04:18 +1200
Subject Re: mgpp collection in v2.39
In-Reply-To (1e23a66180129f90b7e8e27107705095-www2-mail-post-cz)
Hi Roman

I have just built a multilingual collection using mgpp in gs2.39.
I used Japanese folktale documents that are in French, Japanese,
Chinese etc. I tried both Linux and Windows 2000, and both of these
worked fine. The advanced form search worked OK, and I could use NEAR in
the simple query box, even with Japanese characters.

You are right that Paragraph doesn't work - I guess we have never tried
to use this. You will be able to use Section level but not Paragraph.

I can't help you with the other problems - hopefully someone else will
respond to those.


Roman Chyla wrote:

> Dear list,
> I am having a lot of trouble building an mgpp collection under Windows 98.
> Please, could you read & help? Below you will find some
> description - but the most critical question for me is this:
> Does anyone have a valid MGPP collection, with non-ascii text in
> it, with proximity searching and form searching (both simple and
> advanced); and *what platform* does he/she use to build this
> collection?
> Here is a list of some troubles (be sure that I followed the "sing
> MGPP" manual to set collect.cfg):
> - setup.bat made the computer run out of memory (128MB) - it was
>   repaired (maybe wrongly; the first part, for Win95 - something
>   with the goto command)
> - I can not use levels Paragraph (if I switch this off then the
>   collection is built)
> - text searching/simple is okay - even for very national
>   characters such as ?????????
> - but from this mode I can not use the operator NEAR - I tried
>   advanced mode too (near not supported yet?)
> - form searching/simple - only searching with short vowels is
>   possible - I may search '?ena' but not '?ensk?' --> it results
>   as '0 counts for word ?ensk'
> - form searching/advanced - this is not responding at all
> - output from PDFPlug is mangled - but if I do it manually then
>   the text is extracted right (-enc utf8) - are you sure that
>   output from pdftohtml is read using utf8? - I was successful
>   only when collect.cfg contained plugin HTMLPlug
>   -input_encoding utf8 (with -input_encoding windows_1250 long
>   vowels are missed)
> Thank you for getting here. I have installation disks of Mandrake
> or I may use Windows XP; will this help me to create
> full-mgpp-searchable and windows-exportable collection quickly?
> Roman
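
On the encoding question in the quoted message: one way to see why a wrong -input_encoding setting can lose accented letters is to decode the same bytes two ways. The following is only a minimal Python sketch, not Greenstone code. The sample string is hypothetical (the accented words in the original message did not survive the archive), and cp1250 is Python's codec name for the Windows-1250 code page.

```python
# Hypothetical Czech sample text; not taken from the original message.
sample = "žluťoučký kůň"

# Bytes as a UTF-8 producer would emit them (the message reports running
# pdftohtml with "-enc utf8").
utf8_bytes = sample.encode("utf-8")

# Reading those bytes with the matching encoding restores the text.
right = utf8_bytes.decode("utf-8")

# Reading the same bytes as Windows-1250 turns every accented letter into
# two unrelated characters, so words no longer match at search time.
wrong = utf8_bytes.decode("cp1250", errors="replace")


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(right)
    print(wrong)
    print(looks_like_utf8(utf8_bytes))
    print(looks_like_utf8(sample.encode("cp1250")))
```

Running a check like looks_like_utf8 on a converted file before building may help decide which -input_encoding value to give the plugin, assuming the file is small enough to read in one go.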