[greenstone-devel] Hierarchy Classifier format

From Nicolás Rucks
Date Sat Oct 2 04:01:56 2010
Subject [greenstone-devel] Hierarchy Classifier format
Hi everybody.
I am working with GSDL 2.81 on a Linux server from the shell (import.pl, buildcol.pl, etc.).

Right now I am experimenting with the Hierarchy Classifier.

I would like to know how/where I could control the Hierarchy Classifier format.

This is how far I have got:
- On one hand I work with a hierarchy file. Here is my "classify" line in collect.cfg:
classify Hierarchy -hfile jerarquia.txt -metadata Subject.hier -sort dc.Title -allvalues -documents_last

This is because I want to be able to create links to specific categories of the hierarchy, hence the need to know in advance which class number (e.g. 2.1.3) will be assigned to any given category.
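
To illustrate what I mean, here is a minimal Python sketch (not part of Greenstone) of how the links could be derived from the hierarchy file. It assumes that each line of jerarquia.txt has the form: value, class number, quoted display name; that the Hierarchy classifier is the fourth classifier (hence CL4); and that a classifier node can be addressed with a URL argument such as cl=CL4.2.1.3. The base URL and collection name below are hypothetical.

```python
def parse_hierarchy(text):
    """Map each metadata value to its class number, e.g. 'bio.cell' -> '1.1'.

    Assumed line format (an assumption, check your own file):
        value class-number "display name"
    """
    classes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 2)  # value, class number, rest of line
        if len(parts) < 2:
            continue  # not enough fields to be a hierarchy entry
        value, number = parts[0], parts[1]
        classes[value] = number
    return classes


def classifier_link(base_url, collection, classifier_index, number):
    """Build a link to one classifier node (URL arguments are an assumption)."""
    return f"{base_url}?a=d&c={collection}&cl=CL{classifier_index}.{number}"
```

Because the class numbers come straight from the file, a link built this way only stays valid as long as the build does not renumber the classes.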

On the other hand, I have some empty categories (no documents assigned to them yet).
Using "buildcol.pl -remove_empty_classifications" does not help: with that option, the class numbers from the hierarchy file (jerarquia.txt) are ignored and numbers are assigned sequentially instead, skipping the empty classes. That defeats the whole point of being able to link to specific categories.

So what is my problem with the Hierarchical list?
I would like the format (format CL4VList "...") to ignore empty classes, that is, not to show them when they do not have documents.

From what I have been able to see, the format behavior of "classify Hierarchy" is as follows:
A table <table>...</table> is created by GSDL from "outside" the format.
For each category, empty or not, a row <tr>...</tr> is created by GSDL from "outside" the format.
The format CL4VList "..." is, by default, the same as for any VList:

<td valign="top">[link][icon][/link]</td><td valign="top">[srclink]{Or}{[thumbicon],[srcicon]}[/srclink]</td><td valign="top">[highlight]{Or}{[dls.Title],[dc.Title],[Title],Untitled}[/highlight]{If}{[Source],<br><i>([Source])</i>}</td>

So I tried to write a format that would only appear for non-empty classes, but GSDL's behavior is somewhat odd:

a) If the format of a hierarchy list starts with {If}, then the <table></table> and <tr></tr> are NOT generated!
But that only happens at the first level of the hierarchy. From the second level on, the <table></table> and <tr></tr> appear again.

format CL4VList "
{If}{[numleafdocs] > 0,
blablabla (format for non-empty classes)

b) If the format of a hierarchy list starts with <td>, etc., and I put {If}{[numleafdocs] > 0,} inside it, then the format will be OK, but I will not be able to make empty classes disappear, since the <tr></tr> is generated from outside the format. Note that empty rows of an HTML table are still rendered and show up as a little space before the next row. The more empty rows there are, the more space you get.
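
To make the empty-row problem concrete, here is a rough Python sketch of removing such rows after the page has been generated. This is a workaround outside Greenstone, not a format option, and it assumes that an empty class produces a <tr> whose <td> cells contain nothing but whitespace. The real GSDL output may differ.

```python
import re

# A row counts as "empty" if it has no cells, or if every <td> cell
# contains only whitespace.
EMPTY_ROW = re.compile(
    r"<tr\b[^>]*>\s*(?:<td\b[^>]*>\s*</td>\s*)*</tr>\s*",
    re.IGNORECASE,
)


def strip_empty_rows(html):
    """Return the HTML with empty table rows removed."""
    return EMPTY_ROW.sub("", html)
```

Rows with any text or markup inside a cell are left alone, so non-empty classes are not affected.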

As I said, I am working on GSDL 2.81, but I have been told that the same happens on GSDL 2.83.

By the way, the tables generated by GSDL control the indent of each level. Suppose I wanted to control that too.

So, what can I do if I want to control the Hierarchy Classifier format?

Thanks a lot in advance.

Nicolas Rucks
Biblioteca Cardini
Fundación Instituto Leloir