The process that's taking roughly 2 days is the import process. I ran
the buildcol.pl script this morning and it took less than an hour to
complete. For the import I have 40200+ JPEGs with no text. I've created
a series of .item files for the PagedImagePlugin, each of which contains
two lines of metadata used to organize the images. The PagedImagePlugin
does create a smaller image for the web pages though. Other than that,
I'm not doing anything complicated.
May I ask you another question? In the .item files that are being
processed, the metadata is written like this:
<Title>La gioventù perduta - Rifacimento definitivo del testo (C/cassetto)
<Subject and Keywords>Narrativa e diari di viaggio|La gioventù perduta -
Rifacimento definitivo del testo (C/cassetto)
My problem is that where there is something like '(C/cassetto)' the text
is being truncated and becomes '(C'. Is there an escape sequence to
allow the use of special characters (and accented ones also)?
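
To see how many .item files are affected, something like the following could list the metadata values that contain a '/' or an accented character. This is only a sketch in Python and not part of Greenstone: the function names are made up for illustration, and it assumes every metadata line has the form <Name>value shown above.

```python
import re

# Assumption: a metadata line in a .item file looks like <Name>value,
# as in the two example lines above. Other lines are ignored.
META_LINE = re.compile(r"^<([^>]+)>(.*)$")


def parse_item_metadata(text):
    """Return a list of (name, value) pairs found in <Name>value lines."""
    pairs = []
    for line in text.splitlines():
        match = META_LINE.match(line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def flag_suspect_values(pairs):
    """Return the pairs whose value contains '/' or a non-ASCII character."""
    return [
        (name, value)
        for name, value in pairs
        if "/" in value or any(ord(ch) > 127 for ch in value)
    ]


if __name__ == "__main__":
    sample = (
        "<Title>Rifacimento definitivo del testo (C/cassetto)\n"
        "<Subject and Keywords>Narrativa e diari di viaggio\n"
    )
    for name, value in flag_suspect_values(parse_item_metadata(sample)):
        print(name, "->", value)
```

To run it over real files, the text of each .item file would be read and passed to parse_item_metadata; the encoding to use when opening them depends on how the files were saved.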
Thanks for the time you've dedicated to me; I am very grateful.
Rag. William Mann
Comune di Belluno
Servizio Sistemi Informativi
Piazza Castello, 14
On 14/10/2010 15:41, Diego Spano wrote:
> The build process cannot resume from where it stopped because the last run
> ended abnormally. But can you clarify what kind of collection you are
> creating? It is very strange that the build process takes 2 days... The
> import process is more time-consuming, but the build process is faster. What
> kind of images are you managing? Are there TIFF files with OCR?
> I have a big collection with more than 700,000 TIFFs, and each one has a text
> file from OCR. I can't remember how long the import takes because I did
> it in many steps, but the build process takes only a few hours (no more than
> 5 or 6 hours).
> If you like, you can send me (off list) your collect.cfg and some sample
> images and I will take a look at them.
> -----Original Message-----
> From: William T. Mann [mailto:email@example.com]
> Sent: Wednesday, 13 October 2010 07:08 p.m.
> To: Diego Spano
> CC: firstname.lastname@example.org
> Subject: Re: [greenstone-users] Disk space
> Thanks for the quick reply! This is a big help to setting up my build
> Just one more thing: now that I've gone through the import process and my
> archives directory is populated (and I've freed up the necessary space), is
> it possible to start the build process where it left off? That is, can the
> process be started with the creation of the indexes (the building folder)
> without having to go through another 2 days of processing? I'm using the
> PagedImage component (without the cache for space reasons) and as I stated
> before there are over 40200 images!
> Thanks again!