RE: [greenstone-devel] Re: PagedImgPlug

From Emanuel Dejanu
Date Thu, 30 Dec 2004 11:34:54 +0200
Subject RE: [greenstone-devel] Re: PagedImgPlug
In-Reply-To (35813418-58B0-11D9-BEF6-000D93B1CDEE-bigpond-com)
Try not to use hash for generating OIDs. From my experience this takes a lot of time.
Try -OIDtype incremental instead.
These are all the values that -OIDtype can take (I use dirname):
hash: Hashes the contents of the file. Document
  identifier will be the same every time the collection
  is imported.
incremental: A simple document count that is
  significantly faster than "hash". It is not
  guaranteed to always assign the same identifier to a
  given document though and does not allow further
  documents to be added to existing xml archives.
assigned: Uses 'D' plus the value of dc.Identifier as
  the document identifier. dc.Identifiers should be
  unique. If no dc.Identifier is assigned to the
  document, a hash id will be used instead.
dirname: Uses 'J' plus the parent directory name as the
  identifier. This relies on there being only one
  document per directory, and all directory names being
  unique. E.g. import/b13as/h15ef/page.html will get an
  identifier of Jh15ef.
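
As a rough illustration of why the four schemes differ in cost and stability, here is a minimal Python sketch. It is not Greenstone's own code: the function names are hypothetical, MD5 is only a stand-in for whatever digest Greenstone really uses, and the bare count returned for "incremental" is an assumption about the format. Only the 'D' and 'J' prefixes and the example path come from the option descriptions above.

```python
import hashlib
from itertools import count
from pathlib import PurePosixPath


def hash_oid(content: bytes) -> str:
    """hash: same content gives the same identifier, but every byte must be read."""
    # MD5 is only a stand-in; the digest Greenstone actually uses is not shown here.
    return hashlib.md5(content).hexdigest()


def make_incremental_oid():
    """incremental: a running document count, cheap but dependent on import order."""
    counter = count(1)
    # The bare number is an assumed format, chosen for illustration only.
    return lambda: str(next(counter))


def assigned_oid(dc_identifier, content: bytes) -> str:
    """assigned: 'D' plus dc.Identifier, falling back to a hash id if none is set."""
    return "D" + dc_identifier if dc_identifier else hash_oid(content)


def dirname_oid(path: str) -> str:
    """dirname: 'J' plus the name of the parent directory."""
    return "J" + PurePosixPath(path).parent.name


if __name__ == "__main__":
    print(dirname_oid("import/b13as/h15ef/page.html"))  # Jh15ef
```

The point of the sketch is that "hash" has to read the whole file (slow for many large TIFF pages), while "incremental" and "dirname" only need a counter or the path.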
Best regards,
Emanuel Dejanu

From: [] On Behalf Of
Sent: Tuesday, December 28, 2004 11:10 AM
Subject: [greenstone-devel] Re: PagedImgPlug

Try looking at the output from the import to see which stages are taking the most time. You may be able to fiddle with PagedImgPlug.

If disk access is slowing you down, and assuming you have heaps of RAM, you could try doing the imports from a RAM disk. Just make small batches and do two or three at a time.

Try importing by hand to see if that works better, or gives you a better idea of what's going wrong.

Remember to set -verbosity to 3 for the most feedback.


On 28 Dec 2004, at 5:56 PM, 绮君 wrote:

Hello all,

I'm making a library with PagedImgPlug. The imported files are TIFF pictures of about 250 KB each. My problem is that the processing time is too long. A book with five pages may take more than five minutes, and error messages sometimes occur when I work on more files. I wonder if there is anything wrong.
Can anyone give me some advice?
Thank you.


