Re: [greenstone-users] Object not found!

From Chuck Amadi Systems Administrator
Date Thu, 24 Aug 2006 11:53:14 +0100
Subject Re: [greenstone-users] Object not found!
In-Reply-To (1156414909-21765-61-camel-sevenofnine-smtl-co-uk)
Hi Again

I googled around and found this about pdftohtml: Greenstone uses an
external program called "pdftohtml" to extract text from PDF files.
Sometimes there is no text that can be extracted.

I have downloaded and installed pdftohtml.

PDFTOHTML 0.39 for Linux - pdftohtml project is a tool which converts
PDF files into HTML and XML formats.

I ran the tar command and then make:

# tar -xzvf pdftohtml-0.39.tar.gz
# make
# make DEBUG="-g -DDEBUG_MEM"

It is now installed in /usr/local/bin, which is in my system path.
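
To double-check that the binary really is on the path, something like the
following could be used. This is only a sketch: find_tool is a hypothetical
helper name, and it reports what the current shell sees, which may differ from
the environment the Greenstone import scripts run in.

```shell
#!/bin/sh
# find_tool NAME: print where NAME resolves on PATH, or fail if it does not.
# (find_tool is a hypothetical helper, not part of Greenstone or pdftohtml.)
find_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        printf '%s found at %s\n' "$1" "$tool_path"
    else
        printf '%s is not on PATH\n' "$1"
        return 1
    fi
}

# Check the converter discussed above; "|| true" keeps the script going
# even when the tool is missing.
find_tool pdftohtml || true
```

If this prints a location other than /usr/local/bin, an older copy earlier in
the path is being picked up instead of the new build.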

I still get the same errors, but pdftohtml.pl does reside in
my /local/sw/gsdl/gsdl-2.70/bin/script

So what is happening? I will go back to the manual and have another read.
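
Since the error in my earlier log was "Copying of text from this document is
not allowed", one thing worth checking is whether the rejected PDFs carry
security settings. The sketch below is a rough heuristic, not a definitive
test: it assumes that a protected PDF contains an /Encrypt entry (the
dictionary where copy restrictions are recorded) and that grep supports the
-a option, as GNU grep does. The function names are hypothetical.

```shell
#!/bin/sh
# pdf_looks_protected FILE: succeed if FILE contains an /Encrypt entry.
# Heuristic only: it does not decode which permissions are actually set.
pdf_looks_protected() {
    # -a treats the binary PDF as text; -q only sets the exit status
    grep -a -q '/Encrypt' "$1"
}

# report_pdfs FILE...: print one line per file with the heuristic's verdict.
report_pdfs() {
    for pdf in "$@"; do
        if pdf_looks_protected "$pdf"; then
            printf 'protected: %s\n' "$pdf"
        else
            printf 'no /Encrypt entry: %s\n' "$pdf"
        fi
    done
}
```

Running report_pdfs Standards/*.pdf from the import directory would show
whether the five rejected files are the ones flagged as protected. If they are,
installing another copy of pdftohtml would not be expected to change the result.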

Cheers

On Thu, 2006-08-24 at 11:21 +0100, Chuck Amadi Systems Administrator
wrote:
> Hi Again
>
>
> I had a play, changing from Librarian mode to Expert mode, setting
> the import option verbosity to 4, and rebuilding.
>
> Here are the interesting log messages:
>
>
> Error executing pdftohtml.pl
> import.pl> pdftohtml error log:
> import.pl> Error: Copying of text from this document is not allowed.
> import.pl> Could not convert
> BS_ISO_15378_2006packagingmaterialsformedicinalproducts.pdf to HTML
> format
> import.pl> Error: Copying of text from this document is not allowed.
> import.pl> WARNING: No plugin could process
> Standards/BS_ISO_15378_2006-packaging-materials-for-medicinal-products.pdf
>
> So is there something I need, such as pdftohtml or a plugin for
> PDFs?
>
> Cheers
>
> On Thu, 2006-08-24 at 11:12 +0100, Chuck Amadi Systems Administrator
> wrote:
> > Hi List
> >
> > I have checked my permissions and they are
> >
> > When I recreate my SMTL collection, I get this after I build it:
> >
> > The collection has been built and is ready for previewing. But when I
> > view the Import pane I see the following:
> >
> > ************** Import Started **************
> > The file local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/ASTMD41298a2002e1.html
> > is being processed by HTMLPlug.
> > The file
> > Standards/BS_ISO_15378_2006-packaging-materials-for-medicinal-products.pdf was recognised but could not be processed by any plugin.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_840017_06draftISOstdforinsulinsyringes.html is being processed by HTMLPlug.
> > The file Standards/BS_EN_ISO_21649_2006-needle-free-injectors.pdf was
> > recognised but could not be processed by any plugin.
> > The file Standards/BS_EN_ISO_21647_2004-respiratory-gas-monitors.pdf was
> > recognised but could not be processed by any plugin.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_205_30011_06draften455part3.html is being processed by HTMLPlug.
> > The file Standards/BS_EN_ISO_22870_2006-point-of-care-testing.pdf was
> > recognised but could not be processed by any plugin.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_205_010019_06DRAFT140792gauzetests.html is being processed by HTMLPlug.
> > The file local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/ISO109931.html is
> > being processed by HTMLPlug.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/prEN155461smallboreconnectors.html is being processed by HTMLPlug.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_84__10003_0618thdraftsmallboreconnectorsnonluerJan2006.html is being processed by HTMLPlug.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_2120072_06colourcodingforbloodcontainers.html is being processed by HTMLPlug.
> > The file Standards/BS_EN_ISO_21171_2006-glove-powder-testing.pdf was
> > recognised but could not be processed by any plugin.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_205_30012_06ISOFDIS111932draftformedicalglovesmadefromvinylPVC.html is being processed by HTMLPlug.
> > The file
> > local/sw/gsdl/gsdl-2.70/collect/smtl/tmp/CH_205_010018_06DRAFT140791gauze+cotton.html is being processed by HTMLPlug.
> > ************** Import Finished **************
> > 15 documents were considered for processing:
> > 10 documents were processed and included in the collection.
> > 5 were rejected.
> >
> > ************** Build Started **************
> > Compressing text...
> > Creating an index based on document:text...
> > Creating an index based on document:Title...
> > Creating information database...
> > Creating auxiliary files and tidying up...
> > ************** Build Finished **************
> >
> > Cheers List.
> >
> > Chuck
> >
> > On Thu, 2006-08-24 at 10:53 +0100, Chuck Amadi Systems Administrator
> > wrote:
> > > Hi List
> > >
> > > I have one last problem. My collection, called SMTL, has been built.
> > > When I click on Preview and enter in the search box a keyword for a
> > > PDF file that I know exists, I get the following error message.
> > >
> > > Object not found!
> > > The requested URL was not found on this server. The link on the
> > > referring page seems to be wrong or outdated. Please inform the author
> > > of that page about the error.
> > >
> > > If you think this is a server error, please contact the webmaster.
> > >
> > > The URL:
> > > http://intranet.my.co.uk/collect/smtl/index/assoc/HASH.dir/doc.pdf
> > >
> > > Cheers
> > >
> > >
> > > On Tue, 2006-08-22 at 11:23 +0100, Chuck Amadi Systems Administrator
> > > wrote:
> > > > I am running Greenstone 2.7.0 on SuSE SLES 9. The message "was
> > > > recognised but could not be processed by any plugin" appears in the
> > > > left-hand pane when it is building the collection.
> > > >
> > > > Also when I click on a pdf via the Browser I get the following error.
> > > >
> > > > Object not found!
> > > > The requested URL was not found on this server. The link on the
> > > > referring page seems to be wrong or outdated. Please inform the author
> > > > of that page about the error.
> > > >
> > > > If you think this is a server error, please contact the webmaster.
> > > > Error 404
> > > >
> > > > Also, can anyone please explain why I don't get any Greenstone CSS
> > > > or images? My gsdlsite.cfg file is below:
> > > >
> > > >
> > > > # this file should be placed in the same directory as your library
> > > > # executable file. it defines parameters that are particular to a
> > > > # given site, and therefore should be edited to suit your site.
> > > >
> > > > # points to the GSDLHOME directory
> > > > #gsdlhome **GSDLHOME**
> > > > gsdlhome /local/sw/gsdl/gsdl-2.70
> > > > #$GSDLHOME=/local/sw/gsdl/gsdl-2.70
> > > >
> > > >
> > > > # this is the http address of GSDLHOME
> > > > # if your webservers DocumentRoot is set to $GSDLHOME
> > > > # then httpprefix can remain commented out
> > > > #httpprefix /gsdl
> > > > #httpprefix /gsdl/gsdl-2.70
> > > >
> > > >
> > > > # this is the http address of the directory which
> > > > # contains the images for the interface.
> > > > # if your webservers DocumentRoot is set to $GSDLHOME
> > > > # then httpimg will be /images
> > > > #httpimg /images
> > > > httpimg /local/sw/gsdl/gsdl-2.70/images
> > > >
> > > > # should contain the http address of this cgi script. This
> > > > # is not needed if the http server sets the environment variable
> > > > # SCRIPT_NAME
> > > > #gwcgi /cgi-bin/library
> > > >
> > > > # maxrequests is the most requests a fastcgi process
> > > > # will serve before it exits. This can be set to a
> > > > # low figure (like 1) while debugging and then set
> > > > # to a high figure (like 10000) when everything is
> > > > # working well.
> > > > maxrequests 10000
> > > >
> > > > Cheers
> > > >
--
Unix/ Linux Systems Administrator
Chuck Amadi
The Surgical Material Testing Laboratory (SMTL),
Princess of Wales Hospital
Coity Road
Bridgend,
United Kingdom, CF31 1RQ.
Email chuck.smtl.co.uk
Tel: +44 1656 752820
Fax: +44 1656 752830