Re: [greenstone-users] Collector PDF Encoding Error

From Katherine Don
Date Fri, 25 Jun 2004 12:28:26 +1200
Subject Re: [greenstone-users] Collector PDF Encoding Error
In-Reply-To (009d01c457fa$ac1081e0$6112010a-danpc)
Hi

It sounds like you have specified an empty directory in your source files.

I have tried using the Collector to build a collection of PDF files, and
it worked fine.
The warnings about PDF files not being encoded in UTF-8 do not affect
the Collector. If it can find the files but can't process them, the
build still completes, and the build log contains messages like the
following:

Build summary for collector2 collection

* 5 documents were considered for processing
* 0 were processed and included in the collection
* 5 were unrecognised
See /research/kjdon/home/gsdl/collect/collecv1/etc/fail.log for a list
of unrecognised and/or rejected documents

Fail log for collector2 collection

beatles_georgeob.pdf: no plugin could recognise this file
beatles-1.pdf: no plugin could recognise this file
beatles_tab.pdf: no plugin could recognise this file
beatles.pdf: no plugin could recognise this file
beatles_review.pdf: no plugin could recognise this file

So please check your source URLs. Please note that you can only retrieve
files using file:// from the computer that Greenstone is installed on.
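
One way to rule this out before going back to the Collector is to check the
source URL on the Greenstone machine itself. The following is a minimal
Python sketch, not part of Greenstone: check_source_url is a hypothetical
helper name, and it only tests that a file:// URL resolves to an existing
local path that is not an empty directory. It says nothing about whether
any plugin can actually process the files it finds.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def check_source_url(url):
    """Return (ok, message) for a file:// source URL.

    Hypothetical helper for illustration only; it is not a Greenstone API.
    """
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return False, "not a file:// URL"

    # Convert the URL path (possibly percent-encoded) to a local path.
    path = Path(url2pathname(parsed.path))
    if not path.exists():
        return False, f"{path} does not exist on this machine"
    if path.is_file():
        return True, f"{path} is a single file"

    # Walk the directory tree and count regular files only.
    files = [p for p in path.rglob("*") if p.is_file()]
    if not files:
        return False, f"{path} is an empty directory"
    return True, f"{path} contains {len(files)} file(s)"
```

Run it on the machine where Greenstone is installed, passing the same URL
you typed into the source data page. A False result for a directory you
believe holds your PDF files suggests the URL points somewhere other than
where you think it does.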

Regards,
Katherine Don


Pesserl Dan wrote:
> Hey Guys,
>
> I've been trying to create a collection using the collector but keep
> getting this message:
>
> The collection could not be built as it contains no data. Make sure
> that at least one of the directories or files you specified on the
> /source data/ page exists and is of a type or (in the case of a
> directory) contains files of a type, that Greenstone can process.
>
> When I try to create the collection through the command line I am able
> to, but get encoding warnings saying that the PDF files I'm using are
> not encoded in UTF-8, however they are fine to be processed.
>
> I've been trying to find out how I can get things to work through the
> Collector by using switches for the plugins but there's really nothing
> of help out there.
>
> The directory has PDF files in it and GSDL can read it without a
> problem, since I can see the files being read on the status page.
>
> I have to get the Collector working to let my clients update collections.
>
> Any ideas? Thanks!
>
> -Dan