Re: [greenstone-devel] doc.xml plugin errors

From Michael Dewsnip
Date Fri, 16 Jan 2004 10:15:26 +1300
Subject Re: [greenstone-devel] doc.xml plugin errors
In-Reply-To (20040115210230-GA2513-mercycorps-org)
Hi Doug,

Yes, the error message isn't very helpful in determining which files are bad, is
it?
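
If you want to see which documents are affected, one rough approach is to
try parsing every doc.xml yourself and list the ones that fail. Below is a
minimal sketch in Python, not part of Greenstone. It assumes the doc.xml
files sit somewhere under your collection's archives directory; the function
name find_unparseable is just an illustrative choice.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_unparseable(archives_dir):
    """Return (path, error message) for each doc.xml that fails to parse.

    Assumption: archives_dir is the directory that holds the doc.xml
    files written by the import step (searched recursively).
    """
    bad = []
    for path in sorted(Path(archives_dir).rglob("doc.xml")):
        try:
            # Only checks that the XML is well-formed; the metadata
            # itself is not inspected.
            ET.parse(path)
        except ET.ParseError as err:
            bad.append((path, str(err)))
    return bad
```

Calling find_unparseable("archives") from the collection directory and
printing each returned pair should show the failing doc.xml files, along
with the line and column where the parser gave up, which usually points
at the offending Title metadata.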

Attached is the patch (zipped) - just unzip it and replace your existing
bin/script/pdftohtml.pl with the new one, then re-import and re-build your
collection.

Hope this works,

Michael

Doug Carter wrote:

> Michael,
>
> Yes I am building collections with PDF files, but I have no idea if
> these strange characters are causing the problem, because I'm not
> sure which files they are referring to.
>
> So yes, could you please send me the patch?
>
> Best,
>
> Doug
>
> On Fri, Jan 16, 2004 at 09:35:19AM +1300, Michael Dewsnip wrote:
> > Hi Doug,
> >
> > Are you by any chance building collections containing PDF files? There is a
> > problem with the handling of certain PDF files which causes two strange
> > characters to be added into the Title metadata extracted from these files.
> > This means the doc.xml files fail to parse correctly, and during building you
> > get the error you report.
> >
> > There is a simple patch to this problem, so if this sounds like you, let me
> > know and I'll send it to you and the list.
> >
> > All the best,
> >
> > Michael
> >
> >
> >
> > Doug Carter wrote:
> >
> > > Hi all,
> > >
> > > Since moving to 2.41, I'm seeing an error that I've not seen before in
> > > the .../etc/fail.log:
> > >
> > > doc.xml: no plugin could process this file
> > >
> > > In the past, I only saw recognizable file names, with "failed to convert"
> > > messages. I have multiple collections, and each has from 5-20 of these
> > > lines in their fail.log. I don't know where this message is coming from,
> > > if it's a real problem, or how I can get more information about the error.
> > >
> > > Any ideas?
> > >
> > > TIA,
> > >
> > > Doug Carter
> > > Mercy Corps


<<attachment>>
Type: application/x-zip-compressed
Filename: pdftohtml.pl.zip
