[greenstone-users] Phind classifier and Arabic phrase

From kamal mustafa
Date Thu Apr 29 23:10:16 2010
Subject [greenstone-users] Phind classifier and Arabic phrase
Hi Katherine
Thank you very much. Yes, I would like to test the new Phind stuff.
I have good news: Arabic Greenstone EmeraldView is working now. Arabic text copied from PDF files into Open Office documents is now fully searchable with associate_ext pdf, which also solved the problem of linking to the PDF only (the AFLI Conference proceedings Greenstone CD-ROM, which contains 75 papers). Arabic MARC files are working now as well.
Kamal Salih Mustafa Khalafala
University of Khartoum
Institute of Environmental Studies Library

Date: Wed, 28 Apr 2010 12:10:42 +1200
From: kjdon@cs.waikato.ac.nz
To: ksm1960@live.com
CC: greenstone-users@list.scms.waikato.ac.nz; john.rose1@free.fr
Subject: Re: [greenstone-users] Phind classifier and Arabic phrase

Hi Kamal

Yes, I have looked at this, and Phind was broken for non-Latin

I have almost fixed it, I think. I have tried it with Arabic and it's
working, although I think there may be problems in the display to do
with the order the words go in when displaying the phrases.

Would you like to test it for us? I'll have to give you a nightly
release which has the new Phind stuff in it, so you'll need to install
another greenstone.

Otherwise you can wait for the next release.

If you would like to test it, please let me know and I'll get you a



kamal mustafa wrote:

Greenstone Team,

The search function is not working well with Arabic phrases
when using the Greenstone Phind classifier. It is only browsable when you

a search for an English term, some Arabic phrases appear and lead to
the display of the document. The next two parts are the pop-up box of "Java

and the second part is the "build_log.".


Java console


Java vendor: Sun Microsystems Inc.

Java version: 1.6.0_16

Phind collection: gedoo

Phind classifier: 1

Phind classifier: 1

Phind phindcgi: http://localhost:1025/gsdl?a=phind&

Phind library:

Phind backdrop: web/images/phindbg1.jpg

Phind backdrop URL: http://localhost:1025/web/images/phindbg1.jpg

Loaded image: http://localhost:1025/web/images/phindbg1.jpg

Phind orientation: vertical

Phind depth: 2

Phind resultorder: L,l,E,e,D,d

link: 20

exps: 18

docs: 16

Phind blocksize: 10

Phind fontsize: 10





buildcol.pl> Extracting vocabulary and statistics

buildcol.pl> Calculating vocabulary

buildcol.pl> Saving vocabulary in F:Program

buildcol.pl> Saving statistics in F:Program

buildcol.pl> Saving text as numbers in F:Program

buildcol.pl> Extracting phrases from processed text (with suffix)

buildcol.pl> suffix: the phrase extraction program

buildcol.pl> Stopwords mode: no phrase may begin or end with a

buildcol.pl> Reading numbers file: F:Program

buildcol.pl> Allocating symbol arrays for 750 symbols

buildcol.pl> Allocating document arrays for 5 documents

buildcol.pl> generating the phrase hierarchy

buildcol.pl> Initialising hashTable: F:Program

buildcol.pl> Initialising list of hashtable entries: F:Program

buildcol.pl> Starting pass 1

buildcol.pl> Starting pass 2

buildcol.pl> Starting pass 3

buildcol.pl> Sorting and renumbering phrases for input to mgpp

buildcol.pl> Translate phrases: suffix-ids become phind-id's

buildcol.pl> Translate phrases.2: no thesaurus data

buildcol.pl> Translate phrases.3: restore vocabulary

buildcol.pl> Creating phrase databases

buildcol.pl> Done: 203 phrases in F:Program

buildcol.pl> Creating word-level search indexes

buildcol.pl> Creating document information databases

buildcol.pl> *** outputting information for classifier: CL3

buildcol.pl> *** outputting information for classifier: oai

buildcol.pl> *** creating auxiliary files

buildcol.pl> ?????? ??????? ???? ???????? .


Also, is it possible to display the text index so as to pick a search
term from the full text?

Could you please help?

Best Regards

Kamal Salih Mustafa

University of Khartoum,

Institute of Environmental Studies Library
