Date: Mon, 22 Sep 2003 12:40:02 +1200
Subject: Re: [greenstone-devel] Images not imported in collection constructed with GLI
Greenstone relies on the external program "wv" (http://wvware.sourceforge.net) to convert Word documents into HTML for indexing. It usually does a pretty good job, but occasionally has problems.
These are unlikely, but you should check that:
- You have "plugin WordPlug" in your collect.cfg file (I'm pretty sure
Near the start of the year, Stefan Boddie had this to say regarding problems importing Word documents:
You might notice broken images in the HTML output Greenstone produces after converting and importing your MS-Word file. This is most likely because the image in the Word file is a WMF image. Greenstone doesn't include support for extracting these images, so they end up broken. Since a popular way to use Greenstone is to use the extracted text for indexing but retain the source Word document for display, I haven't considered this a huge problem.

The wvWare converter itself is, however, quite capable of extracting these images. Doing so requires libwmf and various other components, the inclusion of which would make Greenstone even bigger and slower to download than it already is. Those who need this feature can download the required components themselves. The latest versions of everything required can be found at http://www.wvware.com. You simply need to install the new binary files into your gsdl\bin\windows (or gsdl/bin/linux or whatever) directory, replacing the wvWare binary that's already there if required. For Windows users there are pre-compiled binaries of wvWare.exe, libwmf, and everything else you need at http://sourceforge.net/projects/gnuwin32.
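As a rough illustration of the install step Stefan describes, here is a minimal Python sketch that copies downloaded binaries into the Greenstone bin directory and keeps a backup of anything it replaces. It is not part of Greenstone or wvWare: the function name install_binaries, the ".bak" backup suffix, and the directory arguments are all assumptions made for the example, and doing the same thing by hand with your file manager works just as well.

```python
import shutil
from pathlib import Path


def install_binaries(download_dir, gsdl_bin_dir):
    """Copy every file in download_dir into gsdl_bin_dir.

    Hypothetical helper, not a Greenstone tool. An existing file with
    the same name is first copied aside with a ".bak" suffix (an
    arbitrary choice) so the original binary can be restored later.
    Returns the names of the files that were replaced.
    """
    download_dir = Path(download_dir)
    gsdl_bin_dir = Path(gsdl_bin_dir)
    if not gsdl_bin_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {gsdl_bin_dir}")

    replaced = []
    for src in sorted(download_dir.iterdir()):
        if not src.is_file():
            continue  # skip subdirectories in the download folder
        dest = gsdl_bin_dir / src.name
        if dest.exists():
            # Keep the binary that shipped with Greenstone as a backup.
            shutil.copy2(dest, dest.with_name(dest.name + ".bak"))
            replaced.append(src.name)
        shutil.copy2(src, dest)
    return replaced
```

You would call it with wherever you unpacked the download and your own bin directory, for example install_binaries("downloads/wv", "gsdl/bin/linux"); both paths here are placeholders.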
Hope this helps,
Mauricio Garcia wrote:
Hi: We are constructing a collection from word documents as source