Date: Thu, 16 Mar 2006 15:44:58 +1300
Subject: Re: [greenstone-users] encoding again
I'm not aware of any way of directly looking at what's in the mg(pp) indexes.
Some things to try:
o Perhaps the problem lies in matching your input text against the index when you submit search queries. Whenever you test searching, are your input characters in the encoding that Greenstone expects (the "w" argument)?
o Some versions of Perl occasionally double-encode UTF-8 when the XML parser reads your archives directory during the build phase. If you are running Perl 5.8, try 5.6.
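To check whether a string really has been encoded to UTF-8 twice, something like the following rough Python sketch may help. It assumes the intermediate misreading was Latin-1 (the usual pattern, but an assumption here), and the function names are made up for illustration; none of this is part of Greenstone.

```python
def double_encode(text: str) -> bytes:
    """Simulate the bug: UTF-8 bytes misread as Latin-1, then encoded again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


def looks_double_encoded(data: bytes) -> bool:
    """Heuristic: True if undoing one Latin-1 round trip yields different, valid UTF-8."""
    try:
        text = data.decode("utf-8")
        # Fails if the text contains characters above U+00FF.
        once = text.encode("latin-1")
        # Fails if the result is not valid UTF-8 (i.e. data was encoded only once).
        repaired = once.decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    return repaired != text


def repair(data: bytes) -> bytes:
    """Undo one level of double encoding (assumes looks_double_encoded(data))."""
    return data.decode("utf-8").encode("latin-1")


if __name__ == "__main__":
    good = "Bücher".encode("utf-8")
    bad = double_encode("Bücher")
    print(good, looks_double_encoded(good))
    print(bad, looks_double_encoded(bad))
    print(repair(bad) == good)
```

Running the check on a term copied from the archives files and on the same term as returned by a search should show at which stage the extra encoding step creeps in.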
Greenstone Digital Library and Digitisation Specialists
jens wille wrote:
hi there!

it's time for me to ask for help with some encoding problems again ;-)

i'm building a collection using mgpp (v2.62, same with v2.63), the source files are in utf8 and are processed with HTMLPlug. however, i'm unable to search for terms containing umlauts :-( (the archives files are correct utf8, so what goes wrong here has to be during build phase)

a bit of examination led me to the assumption that my metadata are "decoded" (where? to what encoding? why?) and then encoded to utf8 _twice_ (!) - oddly enough, this used to work with mg (though i couldn't find any difference between mg and mgpp in this regard).

now i wanted to break my umlauts (ä => ae, ...), which i'm doing for other diacritics (□ => c, ...) all along (using the filter_text function; and which worked and still does - apparently!), but no change for the umlauts: still no results.

my question now is (apart from general help regarding this problem) how i could have a look into the mg(pp)-index, to see what mg(pp) actually has in there. there's db2txt for the text db, but this doesn't seem to work for the index db. (besides, the Queryer shows the same behaviour and isn't of much help here - at least not that i know of)

again, any help would be greatly appreciated ;-)

cheers
jens