Re: [greenstone-users] RE:Greenstone 2.5: Images are not seen after importing from MS Word files

From John R. McPherson
Date Tue, 06 Jul 2004 11:40:22 +1200
Subject Re: [greenstone-users] RE:Greenstone 2.5: Images are not seen after importing from MS Word files
In-Reply-To (BAY15-F37EQJi9AJpn800026682-hotmail-com)
Raitis Brodezhonok wrote:
> Hello Greenstone users and designers!
>
> This might be the useful information for those who make collection from
> MS Word doc. files with images inside and maybe not only in this case.
>
> I tried Greenstone 2.51 under OS MS windows installed in directory
> C:\Program Files\gsdl.
>
> When collection was made I could not see images there.
>
> 1) if in source doc. there was hyperlink to an image file, then link
> was not accessible;

Hi, it depends on the image format and how it is stored. We use a 3rd
party program called "wvWare" to convert MS Word files into .html. It
can extract some images but not others.

> 2) if in source doc. there was inserted picture from a file, then during
> importing in Log file was observed double paths like C:\Program
> Files\gsdl\collect\docfiles\tmp\C:\Program
> Files\gsdl\collect\docfiles\tmp\D10.jpg and as result images were not
> accessible from collection.
>
> When I installed Greenstone into directory like C:\gsdl, then the
> second problem mentioned above
> disappeared (the 1st one still stays).
>
> It looks the problem is about Paths with space inside...(C:\Program
> Files) !!

Thanks for reporting this. Looking at the code, it appears that HTMLPlug
checks that links consist only of numbers, letters, "."s and "/"s.
Could you look in your collection's etc/fail.log? I think it prints
out "HTMLPlug: ERROR - badly formatted tag ignored" for any link that
doesn't match that rule.
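
As a rough illustration of the rule described above (a Python sketch, not
the actual HTMLPlug code; the pattern and the function name are
assumptions made up for this example), a check like this would reject
any link containing a space or a drive letter colon:

```python
import re

# Hypothetical pattern: only letters, digits, "." and "/" are allowed.
# This mirrors the rule described in the message, not Greenstone's real code.
LINK_RE = re.compile(r"[A-Za-z0-9./]+")

def link_is_accepted(link: str) -> bool:
    """Return True if every character in the link is a letter, digit, '.' or '/'."""
    return LINK_RE.fullmatch(link) is not None
```

Under this sketch, "tmp/D10.jpg" passes, while a link that includes
"Program Files" fails because of the space.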

According to the standards (e.g. http://www.ietf.org/rfc/rfc2396.txt),
spaces aren't allowed in URIs/URLs (they should be escaped as %20), so
we need to track down whether Greenstone or wvWare is at fault here, but
either way this will definitely be fixed for the next release of Greenstone.
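
For reference, escaping a space as %20 can be sketched in Python with the
standard library's urllib.parse.quote. This only illustrates the escaping
itself; the helper name is hypothetical and it says nothing about where
the fix will go in Greenstone or wvWare:

```python
from urllib.parse import quote

def escape_link(path: str) -> str:
    """Percent-encode characters that are not allowed in a URI path."""
    # quote() leaves "/" unescaped by default and encodes " " as "%20".
    return quote(path)
```

For example, escape_link("Program Files/gsdl/tmp/D10.jpg") gives
"Program%20Files/gsdl/tmp/D10.jpg".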

Thanks
John McPherson
