Re: [greenstone-users] building collection failure with OAI harvesting

From Katherine Don
Date Fri, 02 Feb 2007 11:11:51 +1300
Subject Re: [greenstone-users] building collection failure with OAI harvesting
In-Reply-To (45C23416-3020803-uiuc-edu)
Hi Hong

You need to run 'import.pl oaitest' before running buildcol.pl.
importfrom.pl downloads the OAI documents into the import directory,
import.pl converts them into XML form in the archives directory, and then
buildcol.pl uses the archive files to create the indexes.
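
As a rough sketch, the full sequence could be wrapped in a small script like
the one below. It assumes 'source setup.bash' has already been run so the
Greenstone scripts are on the PATH. The run helper and the DRY_RUN variable
are hypothetical conveniences for this example, not part of Greenstone: by
default the script only prints the commands, and DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Sketch only: runs the three Greenstone stages in the order described above.
# Assumes `source setup.bash` was run first so the .pl scripts are on PATH.
set -e

collection="${1:-oaitest}"   # collection name used in this thread

# Hypothetical helper: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

run importfrom.pl "$collection"   # harvest OAI records into import/
run import.pl "$collection"       # convert import/ files to XML in archives/
run buildcol.pl "$collection"     # build the indexes from archives/
```

If archives/ is still empty after the import.pl step, buildcol.pl will have
nothing to index, which would match the "Total bytes in collection: 0" lines
in your output.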

Regards,
Katherine

Hong Zhang wrote:
> Hello,
>
> I'm trying to use OAI harvesting to import and build a DL on a Unix
> server. But the same problem happened when I ran buildcol.pl in
> both GSDL V2.70 and V2.72, even though importfrom.pl worked very well. I
> used the example in
> http://nzdl.sadl.uleth.ca/cgi-bin/library?a=p&p=about&c=oai-e
> and the commands I've been using are:
> 0) source setup.bash
> 1) mkcol.pl -creator hzhang1@uiuc.edu oaitest
> 2) add a line:
> acquire OAI -src rocky.dlib.vt.edu/~jcdlpix/cgi-bin/OAI/jcdlpix.pl
> -getdoc
> into collect/oaitest/etc/collect.cfg. I also modified the
> collectionname.
> 3) importfrom.pl oaitest
> 4) buildcol.pl oaitest
>
> After running importfrom.pl, I checked the import/ directory and found
> the .oai files under import/oai/JCDLPICS/ and the downloaded .jpg files
> under import/oai/JCDLPICS/srcdocs/. But buildcol.pl returned the
> error message I posted below. What could be the problem? I tried wget
> and it works very well, so I'm assuming it is not a network problem.
>
> Btw, are the instructions in
> http://greenstone.sourceforge.net/wiki/gsdoc/tutorial/en/OAI_downloading.htm
> still applicable to V2.72? I know the link above
> (http://nzdl.sadl.uleth.ca/cgi-bin/library?a=p&p=about&c=oai-e) is
> apparently for an older version of Greenstone, since RecPlug has to be
> replaced with MetadataXMLPlug. These two sites are the only information I
> can find on OAI harvesting to build a GSDL collection. Any suggestions?
>
>
> Thanks a lot,
> Hong
>
>
> ============= Message returned by buildcol.pl ========
> *** creating the compressed text
>
> collecting text statistics
> /homeb/hzhang1/usr/htdocs/gsdl/perllib/strings.properties
> WARNING: No plugin could recognise
> Stats (Compressing text from section:text)
> Total bytes in collection: 0
> Total bytes in section:text: 0
> ***************
> WARNING: There is very little or no text to compress
> Was this your intention?
> ***************
>
> creating the compression dictionary
>
> compressing the text
> WARNING: No plugin could recognise
> Stats (Compressing text from section:text)
> Total bytes in collection: 0
> Total bytes in section:text: 0
> ***************
> WARNING: There is very little or no text to compress
> Was this your intention?
> ***************
>
> *** building index document:Description in subdirectory dde
>
> creating index dictionary
> WARNING: No plugin could recognise
> ivf.pass1 : Error during done of "ivf.pass1"
> Stats (Creating index document:Description)
> Total bytes in collection: 0
> Total bytes in document:Description: 0
> ***************
> WARNING: There is very little or no text to process for
> document:Description
> Was this your intention?
> ***************
> mgbuilder::build_index - Couldn't create index document:Description
> BuildDir: /homeb/hzhang1/usr/htdocs/gsdl/collect/oaitest/building
>
> *** creating the info database and processing associated files
> WARNING: No plugin could recognise
> *** outputting information for classifier: CL1
> *** outputting information for classifier: CL2
> *** outputting information for classifier: oai
>
> *** creating auxiliary files
>
>
>