[greenstone-devel] sorting leaves in AZCompactList classifier (and patch?)

From: Stephen.DeGabrielle@ntu.edu.au
Date: Thu, 10 Jul 2003 12:07:25 +0930
Subject: [greenstone-devel] sorting leaves in AZCompactList classifier (and patch?)

Hi, we are using AZCompactList to create a two-level listing with our
ereserve items sorted by unitcode and lecturer. We discovered that the
leaves of the AZCompactList in this case were unsorted.
In our case it is desirable to have the leaves sorted by title for easy
retrieval by students, so after hunting around the classifier code we have
made a small modification of our own (though probably not for everybody):
push @args, ("-sort", "Title");

This allows us to sort on the Title field in the leaves.
In most cases this works fine, but one group of documents is behaving
strangely: they have extra metadata, 'AltTitle', in addition to 'Title'
(not DC standard).
The 'AltTitle' metadata is used to supply an English translation of a
foreign-language 'Title'. The other documents don't have
'AltTitle' and sort fine, but the 'AltTitle' ones end up being put
first in the sorted list, followed by the items without an 'AltTitle'
(example follows).

Has anyone encountered this problem? Know how to resolve it?
(copies of output,code,and metadata follow)

Regards,

Stephen

PS How do the maintainers feel about me changing AZCompactList to take a
'-sortleaves <metadataname>' option instead of just this little patch?

s.
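
A minimal sketch of how the proposed option might feed the leaf classifier's
arguments, for discussion only. The helper name leaf_classifier_args and the
$sortleaves parameter are hypothetical; nothing like them exists in
AZCompactList.pm as posted. The sketch only builds the argument list and falls
back to the existing "Date" sort when no value is given.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list handed to the leaf
# List/SectionList classifier. $sortleaves stands in for the value of
# the proposed -sortleaves option (an assumption, not an existing option).
sub leaf_classifier_args {
    my ($metaname, $metavalue, $sortleaves) = @_;

    # Keep the current behaviour (sort on Date) when the option is unset.
    my $sortmeta = (defined $sortleaves && $sortleaves ne '')
        ? $sortleaves
        : 'Date';

    my @args;
    push @args, ('-metadata',   $metaname);
    push @args, ('-buttonname', $metavalue);   # also used for the node's title
    push @args, ('-sort',       $sortmeta);    # passed exactly once
    return @args;
}

print join(' ', leaf_classifier_args('Lecturer', 'Hutton, Ian', 'Title')), "\n";
```

Passing '-sort' once, rather than appending a second '-sort' after the
existing one, avoids depending on how the receiving classifier treats a
repeated option.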


-Greenstone browser output showing the problem,
copied from screen; my comments in brackets-

THTFTG03B (branch)
(leaves follow..)
Kakadu National Park: ihr ferlenpianer
Lecturer: Hutton, Ian
Parc National de Kakadu: plans de vacances
Lecturer: Hutton, Ian
Parque Nacional de Kakadu: planificador de vacaciones
Lecturer: Hutton, Ian
Piano per le vacanze nel parco nazionale Kakadu
Lecturer: Hutton, Ian (end of 'AltTitle' leaves/start of others)
Kakadu: a guide for all seasons
Lecturer: Hutton, Ian
Kakadu National Park entry fees leaflet
Lecturer: Hutton, Ian
Kakadu National Park: holiday planner
Lecturer: Hutton, Ian
Kakadu National Park: holiday planner
Lecturer: Hutton, Ian
Park notes: Bowali visitor centre
Lecturer: Hutton, Ian
Welcome to the Aboriginal lands of Kakadu National Park: visitor guide and
maps
Lecturer: Hutton, Ian
(...end of leaves)
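
For comparison, the following sketch computes the order a single
case-insensitive sort over all ten titles would give. The comparison used
here (lc plus cmp) is an assumption made for illustration and is not
necessarily what the List classifier applies internally; the titles are
copied from the listing above.

```perl
use strict;
use warnings;

# Titles exactly as they appear in the browser output above.
my @titles = (
    'Kakadu National Park: ihr ferlenpianer',
    'Parc National de Kakadu: plans de vacances',
    'Parque Nacional de Kakadu: planificador de vacaciones',
    'Piano per le vacanze nel parco nazionale Kakadu',
    'Kakadu: a guide for all seasons',
    'Kakadu National Park entry fees leaflet',
    'Kakadu National Park: holiday planner',
    'Kakadu National Park: holiday planner',
    'Park notes: Bowali visitor centre',
    'Welcome to the Aboriginal lands of Kakadu National Park: visitor guide and maps',
);

# Plain case-insensitive string comparison (an assumption; Greenstone
# may normalise titles differently before comparing them).
sub sort_titles {
    my @in = @_;
    return sort { lc($a) cmp lc($b) } @in;
}

print "$_\n" for sort_titles(@titles);
```

With one pass over the whole list, the foreign-language titles are
interleaved with the English ones rather than grouped ahead of them.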

---


-sub reinit from AZCompactList.pm-
sub reinit
{
my ($self,$classlist_ref) = @_;
my $outhandle = $self->{'outhandle'};

my %mtfreq = ();
my @single_classlist = ();
my @multiple_classlist = ();

# find out how often each metavalue occurs
map
{
my $mv;
foreach $mv (@{$self->{'listmetavalue'}->{$_}} )
{
$mtfreq{$mv}++;
}
} @$classlist_ref;

# use this information to split the list: single metavalue/repeated value
map
{
my $i = 1;
my $metavalue;
foreach $metavalue (@{$self->{'listmetavalue'}->{$_}})
{
if ($mtfreq{$metavalue} >= $self->{'mingroup'})
{
push(@multiple_classlist,[$_,$i,$metavalue]);
}
else
{
push(@single_classlist,[$_,$metavalue]);
$metavalue =~ tr/[A-Z]/[a-z]/;
$self->{'reclassifylist'}->{"Metavalue_$i.$_"} = $metavalue;
}
$i++;
}
} @$classlist_ref;


# Setup sub-classifiers for multiple list

$self->{'classifiers'} = {};

my $pm;
foreach $pm ("List", "SectionList")
{
my $listname
= &util::filename_cat($ENV{'GSDLHOME'},"perllib/classify/$pm.pm");
if (-e $listname) { require $listname; }
else
{
print $outhandle "AZCompactList ERROR - couldn't find classifier \"$listname\"\n";
die "\n";
}
}

# Create classifiers objects for each entry >= mingroup
my $metavalue;
foreach $metavalue (keys %mtfreq)
{
if ($mtfreq{$metavalue} >= $self->{'mingroup'})
{
my $listclassobj;
my $doclevel = $self->{'doclevel'};
my $metaname = $self->{'metaname'};
my @metaname_list = split('/',$metaname);
$metaname = shift(@metaname_list);
if (@metaname_list==0)
{
my @args;
push @args, ("-metadata", "$metaname");
# buttonname is also used for the node's title
push @args, ("-buttonname", "$metavalue");
push @args, ("-sort", "Date");
###############################################################################
## SORT LEAVES (s.degabrielle/I.Rohoza 2003)
push @args, ("-sort", "Title");
################################################################
if ($doclevel =~ m/^top(level)?/i)
{
eval ("\$listclassobj = new List(\@args)"); warn $@ if $@;
}
else
{
eval ("\$listclassobj = new SectionList(\@args)");
}
}
else
{
$metaname = join('/',@metaname_list);

my @args;
push @args, ("-metadata", "$metaname");
# buttonname is also used for the node's title
push @args, ("-buttonname", "$metavalue");
push @args, ("-doclevel", "$doclevel");
push @args, "-recopt";


eval ("\$listclassobj = new AZCompactList(\@args)");
}
if ($@) {
print $outhandle "$@";
die "\n";
}

$listclassobj->init();

if (defined $metavalue && $metavalue =~ /\w/)
{
my $formatted_node = $metavalue;

if ($self->{'removeprefix'}) {
$formatted_node =~ s/^$self->{'removeprefix'}//;
}

if ($self->{'metaname'} =~ m/^Creator(:.*)?$/)
{
&sorttools::format_string_name_english($formatted_node);
}
else
{
&sorttools::format_string_english($formatted_node);
}

# In case our formatted string is empty...
if (! defined($formatted_node)) {
print $outhandle "Warning: AZCompactList: metavalue is ";
print $outhandle "empty\n";
$formatted_node="";
}

$self->{'classifiers'}->{$metavalue}
= { 'classifyobj' => $listclassobj,
'formattednode' => $formatted_node };
}
}
}


return (@single_classlist,@multiple_classlist);
}
---------
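
The first half of reinit (counting how often each metavalue occurs, then
splitting on 'mingroup') can be tried in isolation with a sketch like the one
below. The sub name split_by_frequency, the %values_for hash and the sample
unit code 'ABC101' are made up for illustration; only the counting and
threshold logic mirrors the listing above.

```perl
use strict;
use warnings;

# Stand-alone sketch of the split done in reinit(): values that occur at
# least $mingroup times go to the "multiple" list, the rest stay "single".
sub split_by_frequency {
    my ($values_for, $mingroup) = @_;

    # Count how often each metavalue occurs across all documents.
    my %freq;
    for my $doc (keys %$values_for) {
        $freq{$_}++ for @{ $values_for->{$doc} };
    }

    # Split the (document, value) pairs on the frequency threshold.
    my (@single, @multiple);
    for my $doc (sort keys %$values_for) {
        for my $value (@{ $values_for->{$doc} }) {
            if ($freq{$value} >= $mingroup) { push @multiple, [$doc, $value]; }
            else                            { push @single,   [$doc, $value]; }
        }
    }
    return (\@single, \@multiple);
}

# Hypothetical sample data: two documents share a unit code, one does not.
my %values_for = (
    'doc1' => ['THTFTG03B'],
    'doc2' => ['THTFTG03B'],
    'doc3' => ['ABC101'],
);

my ($single, $multiple) = split_by_frequency(\%values_for, 2);
printf "%d single, %d grouped\n", scalar @$single, scalar @$multiple;
```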

-my metadata file...-
<?xml version="1.0" ?>
<!DOCTYPE GreenstoneDirectoryMetadata SYSTEM
"http://greenstone.org/dtd/GreenstoneDirectoryMetadata/1.0/GreenstoneDirectoryMetadata.dtd">
<DirectoryMetadata>
<FileSet>
<FileName>03B-001.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2001</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">English</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-002.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Piano per le vacanze nel
parco nazionale Kakadu</Metadata>
<Metadata name="AltTitle" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2000</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">Italian</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-003.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2000</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">Japanese</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-004.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu National Park: ihr
ferlenpianer</Metadata>
<Metadata name="AltTitle" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2001</Metadata>
<Metadata name="DocumentLanguage" mode="accumulate">German</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-005.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Parc National de Kakadu:
plans de vacances</Metadata>
<Metadata name="AltTitle" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2000</Metadata>
<Metadata name="DocumentLanguage" mode="accumulate">French</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-006.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Parque Nacional de Kakadu:
planificador de vacaciones</Metadata>
<Metadata name="AltTitle" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2002</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">Spanish</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-007.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Park notes: Bowali visitor
centre</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2001</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">English</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-008.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu: a guide for all
seasons</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2002</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">English</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-009.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Welcome to the Aboriginal
lands of Kakadu National Park: visitor guide and maps</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2002</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">English</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-010.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu National Park entry
fees leaflet</Metadata>
<Metadata name="AltTitle" mode="accumulate">Changes to park entry
fees, Kakadu National Park, valid from 1 Jan 2002</Metadata>
<Metadata name="Creator" mode="accumulate">Kakadu National
Park</Metadata>
<Metadata name="AllCreators">Kakadu National Park</Metadata>
<Metadata name="Year" mode="accumulate">2002</Metadata>
<Metadata name="DocumentLanguage" mode
="accumulate">English</Metadata>
<Metadata name="Unitcode" mode="accumulate">THTFTG03B</Metadata>
<Metadata name="AllUnitcodes">THTFTG03B</Metadata>
<Metadata name="Lecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="AllLecturer" mode="accumulate">Hutton, Ian</Metadata>
<Metadata name="BarcodeNo"></Metadata>
<Metadata name="CreationDate"></Metadata>
</Description>
</FileSet>
</DirectoryMetadata>
---
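
As a quick diagnostic, a sketch like the following lists each file's Title
next to its AltTitle (if any), which makes it easier to compare the metadata
with the order shown in the browser. It is regex-based and only meant for
files shaped like the one above, not a general XML parser; the sub name
titles_by_file is made up, and the sample is the first two FileSet entries,
abbreviated to the Title and AltTitle elements.

```perl
use strict;
use warnings;

# Return a hash of FileName => [Title, AltTitle or undef] from the text
# of a metadata file. Quick diagnostic only; not a general XML parser.
sub titles_by_file {
    my ($xml) = @_;
    my %found;
    while ($xml =~ m{<FileSet>(.*?)</FileSet>}sg) {
        my $set = $1;
        my ($file)  = $set =~ m{<FileName>\s*(.*?)\s*</FileName>}s;
        my ($title) = $set =~ m{<Metadata\s+name="Title"[^>]*>(.*?)</Metadata>}s;
        my ($alt)   = $set =~ m{<Metadata\s+name="AltTitle"[^>]*>(.*?)</Metadata>}s;
        next unless defined $file;
        for ($title, $alt) {
            next unless defined;
            s/\s+/ /g;   # undo line wrapping inside the value
        }
        $found{$file} = [$title, $alt];
    }
    return %found;
}

# First two FileSet entries from the file above, abbreviated.
my $sample = <<'XML';
<DirectoryMetadata>
<FileSet>
<FileName>03B-001.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
</Description>
</FileSet>
<FileSet>
<FileName>03B-002.pdf</FileName>
<Description>
<Metadata name="Title" mode="accumulate">Piano per le vacanze nel
parco nazionale Kakadu</Metadata>
<Metadata name="AltTitle" mode="accumulate">Kakadu National Park:
holiday planner</Metadata>
</Description>
</FileSet>
</DirectoryMetadata>
XML

my %titles = titles_by_file($sample);
for my $file (sort keys %titles) {
    my ($title, $alt) = @{ $titles{$file} };
    printf "%s | %s | %s\n", $file, $title,
        defined $alt ? $alt : '(no AltTitle)';
}
```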