Re: [greenstone-devel] How to use rtftohtml program ?

From John R. McPherson
Date Fri, 18 Mar 2005 10:35:03 +1300
Subject Re: [greenstone-devel] How to use rtftohtml program ?
In-Reply-To (c98955b90503161958c50e55b-mail-gmail-com)
On Wed, 2005-03-16 at 19:58 -0800, Thanh Quy wrote:
> Hi,
>
> I want to convert Word documents to HTML, so I am using the rtftohtml
> program in the package folder of Greenstone. Can you guide me through
> compiling it? I have no documentation for that.
>
> I took the files rtftohtml.exe and cygwin1.dll from the bin folder to
> convert a file, but it does not work: it just makes an empty HTML file.
> Can you help me?
>
> In Greenstone, when a user inputs Word documents, you have functions
> that call an executable file to convert them to HTML. Is that right?

We use the program 'wvWare' to convert Microsoft Word documents into
html. The 'rtftohtml' program is for pure rtf files (although
occasionally an rtf file will have the extension .DOC). The rtftohtml
converter cannot understand all rtf files, only some of them. The most
likely reason for no text being extracted from an rtf file is that it
uses an encoding not supported by rtftohtml. The other main reason is
that the rtf file contains undocumented rtf codes that are not in the
specification, which confuse the rtf converter. (Microsoft tools for
generating rtf files often use undocumented codes.)
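
Because a file named .DOC may really be rtf (or the other way around), it
can help to look at the first few bytes before choosing a converter. The
following is only a rough sketch in Python, not part of Greenstone: the
function name sniff_format and its return labels are made up for
illustration. It relies on two facts: rtf files start with "{\rtf", and
binary Word documents are stored in an OLE compound file, which starts
with the bytes D0 CF 11 E0 A1 B1 1A E1 (other Office formats share that
container, so a match does not prove the file is Word).

```python
import sys

# rtf files begin with this ASCII sequence.
RTF_MAGIC = b"{\\rtf"
# Signature of an OLE compound file, the container used by binary .doc
# files (and by some other Office formats).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(path):
    """Hypothetical helper: guess 'rtf', 'ole' or 'unknown' from the header."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(RTF_MAGIC):
        return "rtf"
    if head.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"


if __name__ == "__main__":
    # Print the guessed format for each file given on the command line.
    for name in sys.argv[1:]:
        print(name, sniff_format(name))
```

A file reported as "rtf" is a candidate for rtftohtml, while one reported
as "ole" is more likely a real Word document and a job for wvWare. A file
that is correctly detected as rtf can still fail to convert for the two
reasons given above.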

John McPherson