[greenstone-devel] AZCompactList.pm error

From: Gregory S. Williamson
Date: Tue, 30 Nov 2004 19:18:28 -0800
Subject: [greenstone-devel] AZCompactList.pm error
I solved some earlier problems by upgrading to the most recent stable release of Greenstone (and then upgrading Perl to v5.8.5 by building it on this Red Hat Linux box, kernel version 2.4.20-8). The current version of Greenstone appears to be:
[root@localhost gsdl]# pwd
[root@localhost gsdl]# more VERSION
Version: 2.52
Media: Web
OS: linux
Language: en
Option: Source

I ran mkcol.pl and import.pl happily (both with minimum parameters), but buildcol.pl ran for a while and then died with:
GAPlug: processing HASH013e/5d0cd055.dir/doc.xml
GAPlug: processing HASH01ac/28bc34f5.dir/doc.xml
GAPlug: processing HASH02b1.dir/doc.xml
GAPlug: processing HASH0149.dir/doc.xml
Use of uninitialized value in array dereference at /usr/local/gsdl/perllib/classify/AZCompactList.pm line 852.
Use of uninitialized value in concatenation (.) or string at /usr/local/gsdl/perllib/classify/AZCompactList.pm line 854.

*** creating auxiliary files


The last indicated directory/document seems OK to my untutored eye.

The specific lines referred to in the .pm file are:
if (scalar (@currentOIDs) < $min) {
    my ($newkey) = $lastkey =~ /^(.)/;
    @currentOIDs = (@{$compactedhash->{$lastkey}}, @currentOIDs); # <-- this (line 852)
    delete $compactedhash->{$lastkey};
    @{$compactedhash->{"$newkey-$currentlastletter"}} = @currentOIDs; # <-- and this (line 854)
} else {
    if ($currentfirstletter eq $currentlastletter) {
        @{$compactedhash->{$currentfirstletter}} = @currentOIDs;
    } else {
        @{$compactedhash->{"$currentfirstletter-$currentlastletter"}} = @currentOIDs;
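
For what it is worth, the same pair of warnings can be produced outside Greenstone. The sketch below is a standalone approximation of the first branch above, not the Greenstone code itself. It assumes (unconfirmed) that $lastkey is an empty string when the branch runs, and that strict refs is switched off at that point, since with it on Perl would die on the dereference rather than warn. The sample OID and the letter "W" are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
no strict 'refs';    # assumption: strict refs is off where the warning fires

# Collect warnings instead of printing them immediately.
my @seen;
$SIG{__WARN__} = sub { push @seen, $_[0] };

# Hypothetical state: nothing has been compacted yet, so there is no
# previous group and $lastkey is the empty string (an assumption).
my $compactedhash     = {};
my @currentOIDs       = ('HASH0149');
my $lastkey           = '';
my $currentlastletter = 'W';

# The pattern needs at least one character, so $newkey stays undefined.
my ($newkey) = $lastkey =~ /^(.)/;

# The hash has no entry for '', so undef is dereferenced as an array.
@currentOIDs = (@{$compactedhash->{$lastkey}}, @currentOIDs);
delete $compactedhash->{$lastkey};

# The undefined $newkey is interpolated into the new key.
@{$compactedhash->{"$newkey-$currentlastletter"}} = @currentOIDs;

print STDERR $_ for @seen;
print "keys: ", join(', ', map { "'$_'" } sort keys %$compactedhash), "\n";
```

If something like this is what happens in the real run, the group would end up stored under a key with no leading letter (here '-W'), which might explain why the build carries on after the warnings.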


The collection config file (basically taken from an earlier version of this project and gsdl) looks like this:

creator ccarlsson@shapingsf.org
maintainer ccarlsson@shapingsf.org
public true

indexes document:text document:Title document:Source document:Subject document:Author document:Period
defaultindex document:text

plugin ZIPPlug
plugin GAPlug
plugin TEXTPlug
plugin HTMLPlug -metadata_fields Subject,Title,Author,Period,BannerImage
plugin EMAILPlug
plugin PDFPlug
plugin RTFPlug
plugin WordPlug
plugin PSPlug
plugin ArcPlug
plugin RecPlug

classify AZList -metadata Title
classify AZList -metadata Source
classify AZCompactList -metadata Subject -mingroup 1
classify AZCompactList -metadata Author -mingroup 1 -buttonname "Contributors"
classify AZCompactList -metadata Period

format DocumentImages false
format DocumentContents false
format DocumentHeading '<img src="/gsdl/images/[BannerImage].jpg">'
format DocumentText '[Text]'

collectionmeta collectionname "Beta Version of Shaping San Francisco on Linux"
collectionmeta iconcollection "/gsdl/images/top_banner-2.gif"
collectionmeta collectionextra ""
collectionmeta .document:text "text"
collectionmeta .document:Title "titles"
collectionmeta .document:Source "filenames"
collectionmeta .document:Subject "subjects"
collectionmeta .document:Period "periods"


The offending XML file (if that is indeed part of the problem) is attached.

Any help or advice would be most welcome!

Thanks for your patience,

Greg Williamson

Type: text/xml
Filename: doc.xml

1101265258 import/italian1$t_prnt_pgnm$italflyer_itm.html indexed_doc en iso_8859_1 HTMLPlug 935 italian1$t&#095;prnt&#095;pgnm$italflyer&#095;itm.html italflyer_itm italflyer_itm WORKERS! Procreate Only When You Like! ... HTML http://italian1$t_prnt_pgnm$italflyer_itm.html HASH01491789cb2383d9909f228d
<HTML> <HEAD> </HEAD> <HTML> <TITLE>italflyer_itm</TITLE>italflyer_itm
<P> <B>WORKERS! Procreate Only When You Like!
<P> Numerous families increase the misery that is great already among the poor masses of workers. The capitalist vampires, by means of the priest, morally condemn the use of scientific means in order not to have children. This they do by threatening "hell" to those who intelligently refuse to put into the world numerous "unlucky" (unfortunate) ones. And by means of politicians, judges and jailers they diffuse among the people scientific knowledge. and indeed they tried, a short time ago, Margherita Sanger. They convicted Anderlini in the state of Illinois. A few days ago they arrested Emma Goldman in New York, and they threaten trouble to all those who have the courage to tell you the truth and let you know this practical means to prevent conception.</B></P>