I've tested your Word files on Ubuntu Feisty (we just happened to have
it floating around the office) with the Linux version
(gsdl-2.73-unix.tar.gz) of Greenstone downloaded from
http://www.greenstone.org/download.
When importing Word documents, WordPlug uses ConvertToPlug, which uses
gsConvert.pl, which in turn calls wvWare to convert the Word documents
to HTML.
Interestingly, I found that wvWare (version 0.7.4) appears to behave
differently on Ubuntu 6.06 (Dapper, I believe) than on Ubuntu Feisty. It
successfully converts your particular Word file on Dapper, but produces
strings of ?????????????? instead of words on Feisty. The wvWare I used
was exactly the same binary (1152016 bytes) on both machines, and the
version of Greenstone was from the gsdl-2.73-unix.tar.gz file downloaded
from www.greenstone.org and installed using the Java InstallShield
installer. I checked the intermediate HTML files generated by wvWare to
establish that wvWare was the cause of the problem.
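A check of the intermediate HTML like this could be scripted roughly as
follows. This is only a sketch: the function name garbled_runs, the
threshold of six question marks, and the crude tag stripping are
illustrative assumptions, not part of Greenstone or wvWare.

```python
import re

# Hypothetical threshold: a run of '?' this long rarely occurs in real prose.
MIN_RUN = 6

def garbled_runs(html, min_run=MIN_RUN):
    """Return every run of '?' at least min_run long in the visible text."""
    # Crude tag stripping; good enough for a spot check, not a real HTML parser.
    text = re.sub(r"<[^>]*>", " ", html)
    return re.findall(r"\?{%d,}" % min_run, text)
```

An empty list suggests the file converted cleanly; a non-empty list
points at the kind of ?????? output described above. A genuine question
such as "Perché no?" is not flagged because it falls below the
threshold.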
It is possible to fix the problem by installing a newer version of wv
(version 1.2.4) and changing <gsdl>/bin/script/gsConvert.pl (line 497)
to use the newer version. However, it is unclear which set of Word
documents this version of wvWare will work with; I would hope it is a
larger set than the previous version handles.
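One way to find out which documents a given wvWare build handles would
be to run it over a sample of the collection and list the files whose
output is garbled. The sketch below assumes the converter writes its
HTML to standard output when given a document path; the function names,
the threshold of six question marks and the example paths are all
hypothetical.

```python
import subprocess

def convert(binary, doc_path):
    """Run a wvWare-style converter and return whatever it prints on stdout.

    Assumption: the converter emits HTML on standard output.
    """
    result = subprocess.run([binary, doc_path], capture_output=True, check=False)
    return result.stdout.decode("utf-8", errors="replace")

def looks_garbled(html, min_run=6):
    """True if the output contains a run of at least min_run question marks."""
    return "?" * min_run in html

def failing_docs(binary, doc_paths, convert_fn=convert):
    """Return the documents whose converted output looks garbled."""
    return [p for p in doc_paths if looks_garbled(convert_fn(binary, p))]
```

Calling failing_docs once with the path of the old binary and once with
the path of the new one (for example "/usr/local/bin/wvWare", a made-up
location) and comparing the two lists would show whether the newer
version covers a larger set of documents.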
Greenstone Digital Library and Digitisation Specialists
Julian Fox wrote:
> I need help please, if someone can come to my rescue.
> Using GS 2.73 on Ubuntu Feisty with a LAMP server installation.
> I do not seem to be succeeding in getting GS to convert a number of
> Word files. The language of the content is Italian, hence has
> accented letters. The plugin in various positions only seems to
> produce ?????? for content in the final html conversion. Why? I
> presumed UTF8 would succeed, but I am letting the Word plug select its
> encoding automatically.
> I had these files in 2.72 (before that server crashed) and did not
> have the same problem. I thought it might have been the transfer
> process, but I tried a small new collection with a couple of original
> files and I'm getting no good result. Could someone lead me to some
> possible causes and solutions please? I have a rather large collection
> of several thousand items, and obviously wish to resolve it as soon as
> I can.