[greenstone-users] RE: 1. PDFPlug (Yachnes, Paul)

From: Tran
Date: Wed, 20 Apr 2005 09:08:39 +0700
Subject: [greenstone-users] RE: 1. PDFPlug (Yachnes, Paul)
In-Reply-To: (ISP-GOYcr62voXNztpN0001dc02-isp-go-FPT-NET)

Hi,

I had a similar problem a month ago when I started to use GS. GS hung at
different points: 15%, 38%...

The following are just my thoughts and my experience. There is no definite
answer to your problem, for at least two reasons. First, PDFPlug uses a
third-party program (pdftohtml) to process the files. Second, PDF files
come from many kinds of sources and in many versions.

In my case, three kinds of PDF files were problematic:
- Scanned-image PDF files
- PDF files with security options
- Big PDF files (>5MB)
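
To sort a folder of PDFs into these three groups before building, something
like the following Python sketch might help. It is only a heuristic and not
part of Greenstone: the function names are made up for illustration, the
"/Encrypt" and "/Font" byte searches are crude checks (newer PDFs can hide
font entries inside compressed streams, so "maybe-scanned" can be a false
alarm), and the 5 MB limit is simply the figure from the list above.

```python
from pathlib import Path

SIZE_LIMIT = 5 * 1024 * 1024  # the ">5MB" figure from the list above


def triage_pdf(data: bytes, size_limit: int = SIZE_LIMIT) -> list[str]:
    """Return heuristic warning labels for the raw bytes of one PDF."""
    warnings = []
    # An /Encrypt entry in the trailer usually means security options are set.
    if b"/Encrypt" in data:
        warnings.append("security")
    # No visible /Font entry suggests page images without a text layer.
    # Assumption: fonts are not hidden inside compressed object streams.
    if b"/Font" not in data:
        warnings.append("maybe-scanned")
    if len(data) > size_limit:
        warnings.append("big")
    return warnings


def triage_folder(folder: str) -> dict[str, list[str]]:
    """Map each suspicious PDF under `folder` to its warning labels."""
    report = {}
    for path in sorted(Path(folder).rglob("*.pdf")):
        flags = triage_pdf(path.read_bytes())
        if flags:
            report[str(path)] = flags
    return report
```

Calling triage_folder() on the collection's import folder gives a short list
of files worth opening by hand before the build is started.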

With the first two kinds of PDF, if you have the full Adobe Acrobat (the
editor, not Reader) and there is no copyright problem, you can do two things:
OCR the scanned files and remove the security options.

The third case depends much more on the hardware. I'm testing GS on two PCs,
one with a P4 1.4 GHz and 378 MB RAM, the other with a P4 3 GHz HT and
512 MB DDR RAM. The second PC builds 4-5 times faster.

My suggestions are:
1. Identify the file where GS hangs.
2. Check this PDF file and process it to eliminate problems (OCR it,
remove security options...).
3. Use the command line to import and build your collection. This
suggestion is not mine but the GS developers', and it really helps to cut
building time.
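
For suggestion 1, one possible way to find the file that hangs is to run the
converter on each PDF separately with a time limit. The Python sketch below
does that. It rests on assumptions: that a pdftohtml program is on your PATH
and accepts "input.pdf output-base" as arguments (the copy bundled with GS
may be called differently), and that 120 seconds is a sensible limit. The
function names are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def pdftohtml_command(pdf, out_dir):
    # Assumption: a pdftohtml binary on PATH taking "input output-base".
    return ["pdftohtml", pdf, str(Path(out_dir) / "out")]


def find_hanging_files(pdf_paths, make_command, timeout_s=120):
    """Convert each file on its own; return those that time out or fail."""
    problems = []
    for pdf in pdf_paths:
        # Write converter output to a throwaway folder, removed afterwards.
        with tempfile.TemporaryDirectory() as out_dir:
            cmd = make_command(str(pdf), out_dir)
            try:
                result = subprocess.run(
                    cmd,
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.DEVNULL,
                    timeout=timeout_s,  # the child is killed when this expires
                )
            except subprocess.TimeoutExpired:
                problems.append((str(pdf), "timeout"))
                continue
            if result.returncode != 0:
                problems.append((str(pdf), f"exit {result.returncode}"))
    return problems


if __name__ == "__main__":
    # Usage: python find_hang.py file1.pdf file2.pdf ...
    for pdf, reason in find_hanging_files(sys.argv[1:], pdftohtml_command):
        print(f"{pdf}: {reason}")
```

Any file reported as "timeout" is a candidate for step 2. A file that passes
here can still be slow inside a full build, so this only narrows the search.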

Regards,

----------------------------------------------------------------------

Message: 1
Date: Tue, 19 Apr 2005 13:25:52 -0400
From: "Yachnes, Paul" <paul.yachnes@naa.org>
Subject: [greenstone-users] PDFPlug
To: "'greenstone-users@list.scms.waikato.ac.nz'"
<greenstone-users@list.scms.waikato.ac.nz>
Message-ID: <D78253F54207174DBCCA4C64533CC4400671B014@webmail.naa.org>
Content-Type: text/plain; charset="us-ascii"

I am having a problem with the PDFPlug. I have selected the "complex" option
but when I attempt to build my collection, after completing the import, it
hangs at 65% of the build and I have to abort. I installed GhostScript and
included the folder in which it is installed in my Path. Any suggestions?

Paul A. Yachnes, MLS
Senior Manager
Information Resource Center
Newspaper Association of America
(703) 902-1694
fax: (703) 902-1691
yachp@naa.org

