[greenstone-users] Re: Report WordPlugin problems of GSDL version 2.62

From Katherine Don
Date Wed, 04 Jan 2006 14:56:38 +1300
Subject [greenstone-users] Re: Report WordPlugin problems of GSDL version 2.62
In-Reply-To (000701c60cd9$ee29ef40$9702a8c0-nacestid)
Hi

There was a bug in WordPlug: if input_encoding was set to anything other
than "auto", the -input_encoding option was not passed on to HTMLPlug.

Please find the following code (line 155) in perllib/plugins/WordPlug.pm:

    # wvWare will always produce html files encoded as utf-8
    if ($self->{'input_encoding'} eq "auto") {
        $self->{'input_encoding'} = "utf8";
        $self->{'extract_language'} = 1;
        push(@$html_options,"-input_encoding", "utf8");
        push(@$html_options,"-extract_language");
    }

Move the line

    push(@$html_options,"-input_encoding", "utf8");

to outside the if statement, i.e.

    # wvWare will always produce html files encoded as utf-8
    if ($self->{'input_encoding'} eq "auto") {
        $self->{'input_encoding'} = "utf8";
        $self->{'extract_language'} = 1;
        push(@$html_options,"-extract_language");
    }
    push(@$html_options,"-input_encoding", "utf8");

And hopefully this will work. You'll need to reimport and rebuild the
collection. If it doesn't work, please let me know.
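
To see the effect of the change in isolation, here is a minimal standalone sketch. It is not Greenstone code: build_html_options is a hypothetical helper that only mimics the corrected block above, and "iso_8859_1" is just an arbitrary non-auto value chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Greenstone: builds the option list
# that would be handed to HTMLPlug, using the corrected logic.
sub build_html_options {
    my ($input_encoding) = @_;
    my $self = { 'input_encoding' => $input_encoding };
    my $html_options = [];

    # wvWare will always produce html files encoded as utf-8
    if ($self->{'input_encoding'} eq "auto") {
        $self->{'input_encoding'} = "utf8";
        $self->{'extract_language'} = 1;
        push(@$html_options, "-extract_language");
    }
    # Passed unconditionally now, whatever input_encoding was set to
    push(@$html_options, "-input_encoding", "utf8");

    return $html_options;
}

# Print the options produced for an "auto" and a non-auto setting
for my $enc ("auto", "iso_8859_1") {
    print "$enc: @{ build_html_options($enc) }\n";
}
```

With the push left inside the if statement, the second case would produce an empty option list, which matches the behaviour described in the report.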

Regards,
Katherine

Cao Minh Kiem wrote:
> Dear Michael, Katherine and Greenstone Developer Team,
>
> First of all, I wish you a Merry Christmas, a Happy New Year and more success
> for the GSDL project.
>
> Meanwhile, I would like to report a problem with the WordPlugin of GSDL
> version 2.62.
>
> I have just installed GSDL version 2.62 and found that WordPlugin does not
> seem to work properly. It does not correctly convert the characters of an MS
> Word .doc file into HTML. In my case, the Word DOC file is in Unicode.
> WordPlugin of version 2.60 works fine.
> If the DOC file is saved in HTML format, it is OK (because it is processed
> by the HTML plugin).
> Could you give me some tips and advice to solve the problem?
>
> I am sending you a Word file and its HTML file (created by Word) for testing.
> Thank you for your wonderful work.
> Best regards
> Cao Minh Kiem
> Deputy-Director
> National Center for S&T Information
> 24 Ly Thuong Kiet, Hanoi. VIETNAM
> Tel: (84-4)-9349491. Fax (84-4)-9349127
> Email: kiemcm@vista.gov.vn