Re: [greenstone-users] encoding again

From jens wille
Date Thu, 16 Mar 2006 13:56:41 +0100
Subject Re: [greenstone-users] encoding again
In-Reply-To (4418D12A-7080103-dlconsulting-co-nz)
hi richard!

Richard Managh [16.03.2006 03:44]:
> I'm not aware of any way of directly looking at what's in the
> mg(pp) indexes.
too bad :-(

> o Perhaps it's a problem matching your inputted text with that in
> the index when you submit search queries. In all cases when you
> test searching are your input characters the same encoding as
> what greenstone expects? (the "w" argument)
no, that's not the problem: first, the "w" argument is correct;
second, what i'm trying now is to replace all umlauts, so that
there are no special characters in the input at all.
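
a minimal sketch of such an umlaut replacement, in python and for
illustration only. the mapping and the name replace_umlauts are
assumptions, not greenstone code:

```python
# hypothetical helper, not part of greenstone: transliterate german
# umlauts and sharp s so that the input contains only ascii characters
UMLAUT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

# build the translation table once; str.maketrans accepts a dict whose
# keys are single characters and whose values are replacement strings
_TABLE = str.maketrans(UMLAUT_MAP)


def replace_umlauts(text: str) -> str:
    """Return text with every umlaut replaced by its ascii digraph."""
    return text.translate(_TABLE)
```

if search still fails on input cleaned this way, the query encoding
can be ruled out as the cause.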

> o Some versions of Perl sometimes get confused and double encode
> UTF-8 when the xml parser parses your archives directory during
> the build phase. If you are running perl 5.8, try 5.6.
*lol* sorry, but that's a bit of an odd suggestion, isn't it ;-)
i'd rather learn where this happens and how i can avoid it.
(btw: why, and in what respect, does mgpp behave differently from
mg here?)

but maybe this really is where the problem originates, so i will try
to dig into that (trace the relevant subroutines, print out some
variables, ... - it's just pretty time-consuming, so i wanted to ask
here first).
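
for reference while tracing, a minimal sketch of what double-encoded
utf-8 looks like at the byte level and how it could be spotted in a
dumped variable. it is python, for illustration only; the function
names are hypothetical, and it assumes the second encoding step
misreads the utf-8 bytes as latin-1, which is the usual pattern:

```python
def double_encode(text: str) -> bytes:
    """Simulate the bug: encode to utf-8, misread the bytes as
    latin-1, then encode to utf-8 a second time."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def looks_double_encoded(raw: bytes) -> bool:
    """Heuristic check: True if undoing one latin-1/utf-8 round trip
    yields different, still valid text."""
    try:
        once = raw.decode("utf-8")
        repaired = once.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # either not utf-8 at all, or the round trip cannot be undone,
        # so the data was not double encoded in this particular way
        return False
    return repaired != once


if __name__ == "__main__":
    good = "für".encode("utf-8")
    bad = double_encode("für")
    print(good, looks_double_encoded(good))
    print(bad, looks_double_encoded(bad))
```

the check is only a heuristic: text that legitimately contains a
sequence such as "Ã¼" would be reported as double encoded as well.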

thanks for your suggestions, anyway!