[greenstone-devel] Problem with hash and archives.inf

From Diego Spano
Date Wed Feb 20 06:32:43 2008
Subject [greenstone-devel] Problem with hash and archives.inf
Hi list,

I can't understand how GS manages the import process when it finds the same
input filename. Let me explain with an example:

I have doc1.pdf in the import folder. I run "perl -S import.pl demo", and
afterwards the archives folder has a subfolder named HASH01d6.dir with 2
files, doc.xml and doc.pdf, and the file archives.inf has the following line:


HASH01d6d949f2fdf0194131046a HASH01d6.dir\doc.xml I

If I run the import process again with the -keepold option and the same input
file, the contents change as follows:

The archives folder has a subfolder named HASH01d6.dir with 2 files, doc.xml
and doc.pdf, and inside it there is another folder named .dir, also with two
files, doc.xml and doc.pdf. The file archives.inf has the following line:

HASH01d6d949f2fdf0194131046a HASH01d6.dir\.dir\doc.xml I


So there is no longer any reference to the first imported file. The archives
folder has 2 doc.xml and 2 doc.pdf files, but only one pair is referenced in
archives.inf. This behaviour makes me think that whenever a file in the import
folder has the same filename as another imported file (no matter when it was
imported), the original file will be lost. It is impossible to ensure that
every input file will have a unique filename. GS should index both of them,
so both files should be in archives.inf. Am I wrong? Is it a bug?
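
To make the mismatch easier to check, here is a minimal Python sketch (not
part of Greenstone) that lists every doc.xml under the archives folder that
archives.inf does not reference. It assumes each archives.inf line is
whitespace-separated with the doc.xml path as the second field, as in the two
lines quoted above, and that paths contain no spaces. The function names
parse_archives_inf and find_unreferenced are made up for illustration.

```python
import os


def parse_archives_inf(text):
    """Return the set of doc.xml paths referenced in archives.inf text.

    Assumption: each non-empty line looks like "<OID> <path> <status>",
    as in the lines quoted in this message.
    """
    referenced = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Normalise Windows-style separators so paths compare equal.
        referenced.add(fields[1].replace("\\", "/"))
    return referenced


def find_unreferenced(archives_dir):
    """List doc.xml files under archives_dir that archives.inf omits."""
    inf_path = os.path.join(archives_dir, "archives.inf")
    with open(inf_path, encoding="utf-8") as f:
        referenced = parse_archives_inf(f.read())

    orphans = []
    for root, _dirs, files in os.walk(archives_dir):
        if "doc.xml" in files:
            rel = os.path.relpath(os.path.join(root, "doc.xml"), archives_dir)
            rel = rel.replace(os.sep, "/")
            if rel not in referenced:
                orphans.append(rel)
    return sorted(orphans)
```

With the layout described above, calling find_unreferenced on the archives
folder should report HASH01d6.dir/doc.xml, the copy from the first import.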

TIA

Diego Spano


Diego J. Spano
Dirección General de Gestión Informática
Ministerio de Justicia, Seg. y DD. HH.
Tel.: 4328.3015 (int.1404)
4322.6122 (directo)
