Re: [greenstone-users] Very slow import process using PagedImgPlug

From Katherine Don
Date Mon, 12 Mar 2007 15:39:23 +1300
Subject Re: [greenstone-users] Very slow import process using PagedImgPlug
In-Reply-To (5p80tv$10jod-ironport-jus-gov-ar)
Hi Diego

The main change is that PagedImgPlug now accepts an XML format for the item file as well as the original simple format. I tried commenting out the bit where we test the file to see which format it is, but it made no difference to the import time.

I think the difference is probably due to changes in the import process as a whole, rather than to PagedImgPlug itself.
Do you have other collections, and do they take much longer to import now too?

I am not sure what changes would make it go so much slower. We have changed the way arguments are parsed, but that's only done once, not for each file. There is an extra metadata_read pass through the documents, but it looks like that was already there in 2.60.

So no idea sorry :-(

I'll add it to our list of things to look at, but I don't know if or when anyone will get time to look at it. It would be helpful to know if it's just a problem with paged image collections, or with all collections.
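
One way to check whether the slowdown is specific to paged image collections would be to time the same import command for each kind of collection under both versions. The following is only a rough sketch in Python, not part of Greenstone: the helper names time_command and compare are made up for illustration, and the commands shown are placeholders that simply start and exit a Python interpreter. Replace them with whatever import command you normally run for each collection.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of strings) and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)
    return time.perf_counter() - start, result.returncode


def compare(commands):
    """Time each labelled command; returns {label: (seconds, exit_code)}."""
    return {label: time_command(cmd) for label, cmd in commands.items()}


if __name__ == "__main__":
    # Placeholder commands (assumption): swap in the real import command
    # for each collection you want to compare.
    commands = {
        "paged image collection": [sys.executable, "-c", "pass"],
        "ordinary collection": [sys.executable, "-c", "pass"],
    }
    for label, (seconds, code) in compare(commands).items():
        print(f"{label}: {seconds:.1f} s (exit code {code})")
```

Running this once under each Greenstone version and comparing the numbers should show whether every collection slowed down by a similar factor or only the paged image ones.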


Diego Spano wrote:

Hi list, I find that the import process (e.g. a folder with 400 tif and 400 txt files) takes 2 or 3 times longer in GS 2.72 than in GS 2.6. It seems that GS 2.72 spends more time preprocessing the item file. The machine is the same for both versions: Pentium IV, 1 GB RAM, Windows XP. I don't do any image conversion; I just read the item file and index the text.
What are the major changes in PagedImgPlug between 2.6 and 2.72?
Diego Spano
Digital Archive Office
Human Rights Secretary
Buenos Aires . Argentina
Tel.: (5411) 5167-6550
