[greenstone-users] "Ignore accents" - Strange behaviour

From Katherine Don
Date Mon May 12 09:53:37 2008
Subject [greenstone-users] "Ignore accents" - Strange behaviour
In-Reply-To (TRONADORFRaqbC8wSA100000024-mx-orsna-gov-ar)
Hi Diego

Can you check the collection - in the index/build.cfg file, does it have
"stemindexes 7", or some other number?
And in the idx directory, do you get files collname.ib1 through to
collname.ib7 ?
Did you have stemming, casefolding and accent folding all turned on for
the index building?
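
The two checks above can be scripted. This is only a minimal sketch in Python, not part of Greenstone: the helper names `read_stemindexes` and `missing_stem_files` are made up for illustration, and it assumes that build.cfg holds a line of the form "stemindexes N" and that the stem index files are named collname.ib1 to collname.ib7, as described above.

```python
import re
from pathlib import Path


def read_stemindexes(build_cfg):
    """Return the integer on the 'stemindexes' line of build.cfg, or None if absent."""
    text = Path(build_cfg).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        match = re.match(r"\s*stemindexes\s+(\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return None


def missing_stem_files(idx_dir, collname, highest=7):
    """List the expected collname.ib1 .. collname.ib<highest> files that are not on disk."""
    idx = Path(idx_dir)
    return [
        f"{collname}.ib{n}"
        for n in range(1, highest + 1)
        if not (idx / f"{collname}.ib{n}").is_file()
    ]
```

Call `read_stemindexes` with the path to the collection's index/build.cfg and `missing_stem_files` with the idx directory and the collection name (for example "resol"). A non-empty result from the second call means some of the stem index files were not built.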

Regards,
Katherine

Diego Spano wrote:
> Just to add more info that can help to find the problem:
>
> In error.txt file I have the following line:
>
> "Stem index for method 7 was not built, so not doing stemming"
>
> But buildcol shows no errors:
>
> Stats (Creating index text;Numero;Resumen;)
> Total bytes in collection: 9795
> Total bytes in text;Numero;Resumen;: 9268
>
> create the weights file
> .L = 10.190984
> U = 14.823774
> B = 1.001465
> .L = 10.190984
> U = 14.823774
> B = 1.001465
>
> creating 'on-disk' stemmed dictionary
> mgpp_invf_dict.exe : Max word block size = 482
> mgpp_invf_dict.exe : Max tag block size = 0
> mgpp_invf_dict.exe : Number of word blocks written = 19
> mgpp_invf_dict.exe : Number of tag blocks written = 1
>
> creating stem indexes
> mgpp_stem_idx.exe : Num word stems = 254
> mgpp_stem_idx.exe : Max stem block size = 0
> mgpp_stem_idx.exe : Number of stem blocks written = 1
> mgpp_stem_idx.exe : Num word stems = 286
> mgpp_stem_idx.exe : Max stem block size = 0
> mgpp_stem_idx.exe : Number of stem blocks written = 1
> mgpp_stem_idx.exe : Num word stems = 243
> mgpp_stem_idx.exe : Max stem block size = 0
> mgpp_stem_idx.exe : Number of stem blocks written = 1
> deleting resol.ic
> deleting resol.ict
> deleting resol.id
> deleting resol.idh
> deleting resol.ii
> deleting resol.invf.state.2644
> BuildDir: C:/Archivos de programa/Greenstone280/collect/resol/building
>
> *** creating the info database and processing associated files
> ArcPlug: processing C:\Archivos de
> programa\Greenstone280\collect\resol\archives\archives.inf
> ...
>
> TIA
>
> Diego Spano
>
> -----Original Message-----
> From: greenstone-users-bounces@list.scms.waikato.ac.nz
> [mailto:greenstone-users-bounces@list.scms.waikato.ac.nz] On behalf of Diego
> Spano
> Sent: Thursday, May 8, 2008 11:12 a.m.
> To: 'Greenstone (Users)'; 'Greenstone (Devel)'
> Subject: [greenstone-users] "Ignore accents" - Strange behaviour
>
>
>
> Hi list, I want to describe a strange behaviour I noticed in the preferences.
>
> I have a collection with an index using Subject metadata. One of the
> assigned values is "Informe", so when I run a search using Subject index I
> get the following results:
>
> If I set "ignore case differences", I can search for "informe" or "Informe"
> and I get similar results. But if I set the "ignore accents" preference too,
> then I get NO results. I have to turn "accents must match" on to get the
> documents. Why? How does "ignore accents" relate to "ignore case
> differences"?
>
> "Informe" has no accents, it should be retrieved if I set "ignore case
> differences" and "ignore accents".
>
> I'm using GS 2.80 with mgpp running on Windows XP and IIS. I am almost sure
> that this strange behaviour was not present in 2.7x versions.
>
> TIA.
>
>
> Diego Spano
>
>
>
>