Re: [greenstone-users] Importing CDS/ISIS db failure ...arcinfo::save_info couldn't write

From ruben pandolfi
Date Thu, 16 Feb 2006 08:53:16 +0100
Subject Re: [greenstone-users] Importing CDS/ISIS db failure ...arcinfo::save_info couldn't write
In-Reply-To (43F39C3E-60602-cs-waikato-ac-nz)
Thank you Michael, Thank you guys!

Now I can import the DB correctly :-), and setting dos-850 does show
the correct charset, great.
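
For reference, "dos-850" and "ibm 850" name the same code page (CP850).
Below is a minimal Python sketch, not part of Greenstone, of what
converting CP850 data to UTF-8 amounts to. The function name
cp850_to_utf8 and the sample string are made up for illustration.

```python
import codecs

# "ibm850" is an alias of Python's "cp850" codec, i.e. the same code page.
assert codecs.lookup("ibm850").name == "cp850"


def cp850_to_utf8(raw: bytes) -> bytes:
    """Decode bytes stored in code page 850 and re-encode them as UTF-8."""
    return raw.decode("cp850").encode("utf-8")


# Hypothetical sample: accented Italian, French and Portuguese characters
# as they would be stored in a CP850 database.
sample = "città, français, coração".encode("cp850")
converted = cp850_to_utf8(sample)
print(converted.decode("utf-8"))
```

The raw CP850 bytes are not valid UTF-8 on their own, which is why data
that skips this conversion step ends up as an invalid token in the
archive .xml file.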

The import worked correctly, and now I want to explode the .mst so that
I can use gsdl to add/edit metadata and associate full-text docs, when
available, with the relevant record.

Unfortunately I get the same encoding error; perhaps there is another
fix for this?


import.pl> NULPlug processing
"/var/www/gsdl/collect/babel/import/EM20/0019.nul"
import.pl> NULPlug processing
"/var/www/gsdl/collect/babel/import/EM20/0020.nul"
import.pl> *********************************************
import.pl> Import complete
import.pl> *********************************************
import.pl> * 20 documents were considered for processing
import.pl> * 20 were processed and included in the collection
import.pl> Command complete.
import.pl> Extracting new metadata from archive files.
import.pl> Archived metadata extraction complete.
Command: /var/www/gsdl/bin/script/buildcol.pl -gli -language en
-collectdir /var/www/gsdl/collect/ -removeold babel
buildcol.pl> *** creating the compressed text
buildcol.pl> collecting text statistics
buildcol.pl> ArcPlug: processing
/var/www/gsdl/collect/babel/archives/archives.inf
buildcol.pl> GAPlug: processing HASHedda.dir/doc.xml
buildcol.pl> **** Error is:
buildcol.pl> not well-formed (invalid token) at line 12, column 23, byte
509 at /usr/lib/perl5/XML/Parser.pm line 187
buildcol.pl> WARNING: No plugin could process HASHedda.dir/doc.xml
buildcol.pl> GAPlug: processing HASH01c2.dir/doc.xml
buildcol.pl> **** Error is:


and finally

buildcol.pl> WARNING: No plugin could process HASH73fe.dir/doc.xml
buildcol.pl> *** creating auxiliary files
buildcol.pl> arcinfo::save_info couldn't write
/var/www/gsdl/collect/babel/archives/HASH73fe.dir/doc.xml/archives.inf
buildcol.pl> Command failed.
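
One way to narrow down a "not well-formed (invalid token)" error like the
one above is to check each doc.xml for bytes that are not valid UTF-8 and
then for XML well-formedness. Below is a minimal Python sketch, not a
Greenstone tool. The names check_doc_xml and scan_archives are
hypothetical, and it assumes the archives directory holds one doc.xml per
HASH*.dir subdirectory, as the log suggests.

```python
import xml.parsers.expat
from pathlib import Path


def check_doc_xml(path):
    """Return None if the file is valid UTF-8 and well-formed XML,
    otherwise a short description of the first problem found."""
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first undecodable byte.
        bad = raw[err.start:err.start + 1]
        return f"invalid UTF-8 at byte {err.start}: {bad!r}"
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(raw, True)
    except xml.parsers.expat.ExpatError as err:
        return f"not well-formed: {err}"
    return None


def scan_archives(archives_dir):
    """Map each problematic <archives_dir>/*/doc.xml to its problem."""
    problems = {}
    for doc in sorted(Path(archives_dir).glob("*/doc.xml")):
        problem = check_doc_xml(doc)
        if problem is not None:
            problems[str(doc)] = problem
    return problems
```

Calling scan_archives("/var/www/gsdl/collect/babel/archives") would list
the files that fail and the byte offset to inspect. If the offending byte
turns out to be a lone CP850 character, the data was written without
being converted to UTF-8.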


thank you again

Ruben


Michael Dewsnip wrote:
> Hi Ruben,
>
> It turns out your problem is caused by a bug in ISISPlug -- obviously
> you're the first person to try it on a database with non-ASCII
> characters in the field names! (The .fdt file wasn't being read using
> the encoding provided).
>
> I've fixed this; you can download a new version of ISISPlug.pm from
> http://www.cs.waikato.ac.nz/~mdewsnip/greenstone/temp-2.63/ISISPlug.pm
> (this should overwrite your existing ISISPlug.pm file in Greenstone's
> "perllib/plugins" directory).
>
> Regards,
>
> Michael
>
> PS Your database seems to be a bit inconsistent: it contains data for
> tags that are not defined in the .fdt file. For example, the .mst file
> seems to have two Date tags: 45 and 50, but only 50 is defined in the
> .fdt file.
>
>
>
> ruben pandolfi wrote:
>
>
>>Hi,
>>
>>John R. McPherson wrote:
>>
>>
>>>
>>>Normally, a "not well-formed" error in the XML Parser means that a
>>>source file has badly encoded data, and the plugin has not detected
>>>this and has made a non-utf8 archive .xml file. It might also mean
>>>that the plugin has used or passed in an invalid xml tag.
>>
>>
>>yes, I can see there is an encoding problem.
>>
>>Anyway, I have set GAPlug, ArcPlug, RecPlug and ISISPlug to dos 850
>>
>>(I'm 50% sure this is the correct code, although I thought it was
>>called ibm 850)
>>
>>It contains Italian, French and Portuguese characters.
>>
>>
>>
>>>Most of the plugins are careful enough to convert any wrongly encoded
>>>metadata/text into the correct encoding, so perhaps the ISIS plugin
>>>doesn't. Are you able to make your input documents available for
>>>testing? That might be the quickest way for a developer to work out
>>>where the problem is.
>>
>>
>>If someone has time and wants to check ;-), you can temporarily
>>download the complete ISIS db files here:
>>
>>http://www.evk2cnr.org/ruben/Babel809.zip
>>
>>
>>thank you for your help!
>>
>>ruben
>>
>>John R. McPherson wrote:
>>
>>
>>>On Sat, Feb 11, 2006 at 02:54:15PM +0100, ruben pandolfi wrote:
>>>
>>>
>>>>Jonathan Gorman wrote:
>>>>
>>>>
>>>>>Check "How do I fix XML::Parser errors during import.pl?" in the FAQ.
>>>>>
>>>>>Jon Gorman
>>>>>
>>>>
>>>>
>>>>
>>>>Thank you Jon,
>>>>
>>>>I do not think the error is due to perl.
>>>>
>>>>In fact I only have warnings from perl:
>>>>
>>>>buildcol.pl> not well-formed (invalid token) at line 31, column 34,
>>>>byte 1572 at /usr/lib/perl5/XML/Parser.pm line 187
>>>>buildcol.pl> WARNING: No plugin could process
>>>>HASH7bca/b456434f/1d719200/0bs809.dir/doc.xml
>>>>
>>>
>
>

--
..................

Ruben Pandolfi

-------------------------------------------------------------
"...I Think This is the Beginning of a Beautiful Friendship."
-------------------------------------------------------------