Re: [greenstone-devel] build problem

From Katherine Don
Date Wed, 21 Jul 2004 11:37:33 +1200
Subject Re: [greenstone-devel] build problem
In-Reply-To (5-2-0-9-2-20040720153950-019db3e8-tofa-pobox-stanford-edu)
Hi Tom

Are you working with Stuart Snydman? Or are there two of you at Stanford
with the same problem??

I don't know what the problem is, so here are some questions for you :-)
(when I say entire collection in the following, I guess I mean more than
23 documents)

* What version of Greenstone are you using?
* Can you build the entire 200 documents using the default 'new
collection' settings?
* When you were changing the configuration, did you try the entire
collection at any stage during that process?
* Please try turning off 'enable advanced searches' in the search types
pane - can you build the entire collection then? Different search
engines are used when this is on/off.
* Have you tried building from the command line? Do you still get the
same error?
(cd to the gsdl directory, run setup, then perl -S buildcol.pl <collname>)
* Do you get any errors in the import stage?
* Try setting the verbosity to 3 to see if you get any more information.
* Is there a file fail.log in gsdl/collect/<collname>/etc, and is it
writeable?
* What error messages do you get if you build, in GLI, without one of
the necessary plugins (Word, PDF or HTML plugs) - does it still fail the
first time an unprocessable document is reached?
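
The fail.log question above can be checked with a few lines of script. This is only a sketch, assuming Python is available on the machine; `check_fail_log` and its two arguments are hypothetical names, not part of Greenstone, and the path layout is simply the gsdl/collect/<collname>/etc one mentioned above.

```python
import os


def check_fail_log(gsdl_dir, collname):
    """Report whether etc/fail.log exists and looks writable.

    gsdl_dir and collname are assumptions for illustration: the Greenstone
    install directory and the collection's directory name.
    """
    path = os.path.join(gsdl_dir, "collect", collname, "etc", "fail.log")
    exists = os.path.isfile(path)
    # os.access is only a hint; on Windows it mainly reflects the
    # read-only attribute, so True here does not rule out permission problems.
    writable = exists and os.access(path, os.W_OK)
    return {"path": path, "exists": exists, "writable": writable}
```

Calling `check_fail_log` with the gsdl directory and the collection name returns a small dictionary; if "exists" or "writable" comes back False, that is worth mentioning alongside the build log.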

That's all I can think of at this stage.
I hope we can sort this out for you.

Regards,
Katherine Don

Tom Farrell wrote:
> No response from the user's list, so I thought I'd try here:
>
> Hi all,
>
> We are attempting to build a collection of around 200 documents through
> the GLI - mostly Word docs, with some PDFs and a few HTML pages as well.
> We've added external dc metadata to the files.
>
> Systems are Dells running W2000, with the Greenstone installation of
> Perl, Norton disabled.
>
> The collection builds with a default "new collection" setting. We then add
> browsing CLV lists for Author, Title, and Date drawn from the metadata, and
> search indexes for those fields, as well as the default text index.
>
> We do a build at each step of the way, using a maxdocs of 20, to make sure
> it works. It does, and the collection looks and behaves well. The problem
> is that when we increase the number of docs in the build to anything over
> 22, the build fails with the error:
>
> "buildcol.pl> GAPLug: processing HASH47f5.dir\doc.xml
> buildcol.pl> WARNING: No plugin could process HASH47f5.dir\doc.xml
> buildcol.pl> Not a GLOB reference at C:\Program
> Files\gsdl/perllib/gsprintf.pm line 61.
> buildcol.pl> Command failed."
>
> It doesn't matter which actual document is processed as number 23; the
> error always appears at that point.
>
> Anyone have any ideas - it's a bit frustrating.
>
> Thanks,
>
> Tom
>
> Here is the .cfg, and the full build log from one of the failures:
>
> creator greenstone@cs.waikato.ac.nz.waikato.ac.nz
> maintainer greenstone@cs.waikato.ac.nz.waikato.ac.nz
> public true
>
> searchtype form plain
>
> #indexes document:text document:Title document:Source
> indexes text dc.Contributor dc.Title dc.Date
> levels document
> #defaultindex document:text
>
> plugin ZIPPlug
> plugin GAPlug
> plugin TEXTPlug
> plugin HTMLPlug -nolinks
> plugin EMAILPlug
> plugin PDFPlug -convert_to html
> plugin RTFPlug
> plugin WordPlug
> plugin PSPlug
> plugin ArcPlug
> plugin RecPlug -use_metadata_files
>
> classify AZList -metadata dc.Title -buttonname Title
>
> classify AZList -metadata dc.Contributor -buttonname Creator
>
> classify DateList -metadata dc.Date
>
> format DateList "<td>[link][icon][/link]</td>
> <td>[highlight]{Or}{[dls.Title],[dc.Title],[Title],Untitled}[/highlight]
> </td>
> <td>[Date]</td>"
>
> format HList
> "[link][highlight]{Or}{[dls.Title],[dc.Title],[Title],Untitled}
> [/highlight][/link]"
>
> format VList "<td valign=top>[link][icon][/link]</td>
> <td valign=top>[srclink]{Or}{[thumbicon],[srcicon]}[/srclink]</td>
> <td valign=top>[highlight]
> {Or}{[dls.Title],[dc.Title],[Title],Untitled}
> [/highlight]{If}{[Source],<br><i>([Source])</i>}</td>"
>
> format CL1VList "<td valign=top>[link][icon][/link]</td>
> <td valign=top>[srclink]{Or}{[thumbicon],[srcicon]}[/srclink]</td>
> <td valign=top>[highlight]
> {Or}{[dc.Title],[Title],Untitled}
> [/highlight]{If}{[Source],<br><i>([Source])</i>}</td>"
>
> format CL2VList "<td valign=top>[link][icon][/link]</td>
> <td valign=top>[srclink]{Or}{[thumbicon],[srcicon]}[/srclink]</td>
> <td valign=top>[highlight]
> {Or}{,[dc.Contributor],[Title],Untitled}
> [/highlight]{If}{[Source],<br><i>([Source])</i>}</td>"
>
> format CL3VList "<td valign=top>[link][icon][/link]</td>
> <td valign=top>[srclink]{Or}{[thumbicon],[srcicon]}[/srclink]</td>
> <td valign=top>[highlight]
> {Or}{[dc.Date],[Title],Untitled}
> [/highlight]{If}{[Source],<br><i>([Source])</i>}</td>"
>
> collectionmeta collectionname [l=en] "computer games"
> collectionmeta collectionextra [l=en] "Papers by students in STS 145, the
> history of computer games, taught by Henry Lowood."
> collectionmeta .document:text [l=en] "text"
> collectionmeta .document:Title [l=en] "titles"
> collectionmeta .document:Source [l=en] "filenames"
> collectionmeta .text [l=en] "text"
> collectionmeta iconcollection
> [l=en] "/gsdl/collect/computer/images/softline383.jpg"
> collectionmeta .dc.Contributor [l=en] "Author"
> collectionmeta .dc.Title [l=en] "Title"
> collectionmeta .dc.Date [l=en] "Date"
>
> --
> ..
> Extracted 9 pieces of metadata for HASHf7a3.dir.
> import.pl> Archived metadata extraction complete.
> Command: C:\Program Files\gsdl\bin\windows\perl\bin\Perl.exe -S C:\Program
> Files\gsdl\bin\script\buildcol.pl -gli -language en -collectdir C:\Program
> Files\gsdl\collect sts145co
> buildcol.pl> doclevel = document
> buildcol.pl> *** creating the compressed text
> buildcol.pl> collecting text statistics (mgpp_passes -T1)
> buildcol.pl> ArcPlug: processing C:\Program
> Files\gsdl\collect\sts145co\archives\archives.inf
> buildcol.pl> GAPLug: processing HASH0199.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01b2.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0108.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0104.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01b20524a893.dir\doc.xml
> buildcol.pl> GAPLug: processing HASHa55c.dir\doc.xml
> buildcol.pl> GAPLug: processing HASHdfed.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0135.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH8b21.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH07b6.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01af.dir\doc.xml
> buildcol.pl> GAPLug: processing HASHf7a3.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH80fc.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01d1.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0135bcce6cd0.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0170.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01e2.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH0172.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01e4.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH01ab.dir\doc.xml
> buildcol.pl> GAPLug: processing HASHb260.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH5ea2.dir\doc.xml
> buildcol.pl> GAPLug: processing HASH47f5.dir\doc.xml
> buildcol.pl> WARNING: No plugin could process HASH47f5.dir\doc.xml
> buildcol.pl> Not a GLOB reference at C:\Program
> Files\gsdl/perllib/gsprintf.pm line 61.
> buildcol.pl> Command failed.
>
>
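
When comparing several failed builds, it can help to pull the relevant lines out of a saved build log rather than counting by eye. The following is a minimal sketch in Python, assuming the log has been saved as plain text in the format quoted above; `summarize_build_log` is a hypothetical helper, not part of Greenstone, and it only recognises the two message shapes that appear in this log.

```python
import re

# Patterns match the buildcol.pl lines as they appear in the quoted log.
# The plugin name is matched case-insensitively because the log prints "GAPLug".
PROCESSING = re.compile(r"GAPlug: processing (\S+)", re.IGNORECASE)
UNPROCESSED = re.compile(r"WARNING: No plugin could process (\S+)")


def summarize_build_log(log_text):
    """Return (processed, unprocessed) document names found in a build log."""
    processed = []
    unprocessed = []
    for line in log_text.splitlines():
        match = PROCESSING.search(line)
        if match:
            processed.append(match.group(1))
            continue
        match = UNPROCESSED.search(line)
        if match:
            unprocessed.append(match.group(1))
    return processed, unprocessed
```

Run over the build log quoted above (from the `Command:` line down), this should report 23 "processing" lines and one unprocessed document, the same one as the last "processing" line, which matches the description that the build always stops at document number 23.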