[greenstone-users] Re: Hierarchy classifier - sorting

From Katherine Don
Date Sun Nov 21 06:49:55 2010
Subject [greenstone-users] Re: Hierarchy classifier - sorting
In-Reply-To (4CE6DEAD-50708-comune-belluno-it)
Hi William

Did you not get the hfile working for the Hierarchy classifier?
If you don't use an hfile, and just use the | separator, then it will sort
alphabetically. It can't do custom sorting without an hfile, because it has
no way of knowing what order you want.

You can do custom sorting using the hfile.

e.g. an hfile like:

pets 1 Pets
dog 1.1 Dogs
cat 1.2 Cats
wild 2 Wild Animals
tiger 2.1 Tigers
aardvark 2.2 Aardvarks

Would give a structure like

Pets
- Dogs
- Cats
Wild Animals
- Tigers
- Aardvarks
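
To make the ordering rule concrete, here is a minimal Python sketch of how
the position column, rather than the alphabet, decides the order. It is
only an illustration and not Greenstone's own code: the function names
parse_hfile and render are made up for this example, and it assumes the
simple unquoted line format shown above (descriptor, position, title, with
no spaces in the descriptor).

```python
def parse_hfile(text):
    """Parse lines of the form: <descriptor> <position> <title>."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: the descriptor and position contain no spaces,
        # so everything after the second field is the title.
        descriptor, position, title = line.split(None, 2)
        # Turn "2.1" into (2, 1) so that 10 sorts after 9, not after 1.
        key = tuple(int(part) for part in position.split("."))
        entries.append((key, descriptor, title.strip('"')))
    return sorted(entries)


def render(entries):
    """Return the hierarchy as indented lines, ordered by position."""
    lines = []
    for key, _descriptor, title in entries:
        prefix = "- " * (len(key) - 1)
        lines.append(prefix + title)
    return lines


HFILE = """\
pets 1 Pets
dog 1.1 Dogs
cat 1.2 Cats
wild 2 Wild Animals
tiger 2.1 Tigers
aardvark 2.2 Aardvarks
"""

if __name__ == "__main__":
    print("\n".join(render(parse_hfile(HFILE))))
```

Note that Aardvarks stays after Tigers because its position is 2.2, even
though it would come first alphabetically.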

You need to give your documents metadata values like 'pets' or 'dog' etc.

The first entry in each line of the hfile is the descriptor, which is the
value that needs to be set as metadata in the documents. Depending on how
you want to add metadata, you can change what you use for the descriptor.

e.g. you could use:

1 1 Pets
1.1 1.1 Dogs
1.2 1.2 Cats

Then you will be assigning metadata 1, 1.1, 1.2 etc.

Alternatively, you can encode the whole structure in the metadata.

Pets 1 Pets
Pets|Dogs 1.1 Dogs
Pets|Cats 1.2 Cats

Then you will be assigning metadata like Pets, Pets|Dogs etc.
This may be what you already have? If so, create an hfile that gives the
correct ordering and you should be fine.
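
If you have many values, you may not want to write the hfile by hand. The
following Python sketch generates hfile lines from a list of
pipe-separated metadata values, numbering each entry in the order it
first appears. It is a hypothetical helper (build_hfile is not part of
Greenstone), and it assumes the unquoted line format used in the examples
above, so values containing spaces would need extra handling.

```python
def build_hfile(paths):
    """Build hfile lines from pipe-separated values, in first-seen order."""
    positions = {}      # full path -> position string such as "1.2"
    child_counts = {}   # parent path ("" for top level) -> children so far
    entries = []
    for path in paths:
        parts = path.split("|")
        for depth in range(1, len(parts) + 1):
            current = "|".join(parts[:depth])
            if current in positions:
                continue  # already numbered by an earlier value
            parent = "|".join(parts[:depth - 1])
            child_counts[parent] = child_counts.get(parent, 0) + 1
            number = str(child_counts[parent])
            position = positions[parent] + "." + number if parent else number
            positions[current] = position
            key = tuple(int(p) for p in position.split("."))
            # Descriptor is the full path; title is the last component.
            entries.append((key, f"{current} {position} {parts[depth - 1]}"))
    return [line for _key, line in sorted(entries)]


if __name__ == "__main__":
    # List the values in the order you want them to appear.
    for line in build_hfile(["Test|About|Zesty", "About|Test|Testy"]):
        print(line)
```

Listing the values in the order you want (for example, the order of your
directories on disk) then gives positions that keep that order.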

I hope this helps,
Katherine

> Hi Katherine,
>
> Thanks for all the help you've given me. I've started refreshing my Perl
> and C++ so I can take a look at the source code and maybe start
> contributing to the project.
>
> That said I have one last question (at least for now): where would I
> need to intervene to make my hierarchy classifier output its values in
> an order specified by me (keep in mind that I'm using this with the
> PagedImage plugin). Let's say I've set up my .item file like this:
>
> <Title>Whatever
> <Subject and Keywords>01_test|02_about|03_zesty
> 1:file.jpg
> 2:file2.jpg
> ....
>
> and
>
> <Title>Try Again
> <Subject and Keywords>02_about|01_test|02_testy
> 1:file.jpg
> 2:file2.jpg
> ...
>
> Now when I look at the classifier on the site I get this:
>
> 01_test
> 02_about
>
> which is the order that I need. The problem is that this corresponds to
> the directory structure and is not very user friendly. However if I put
> things the way I need them:
>
> <Title>Whatever
> <Subject and Keywords>Test|About|Zesty
> 1:file.jpg
> 2:file2.jpg
> ....
>
> and
>
> <Title>Try Again
> <Subject and Keywords>About|Test|Testy
> 1:file.jpg
> 2:file2.jpg
> ...
>
> Now with the classifier I get this:
>
> About
> Test
>
> which is not the order I want. What I'd like is the same order as before
> (like my directory structure) and that is:
>
> Test
> About
>
> It seems like the hierarchy classifier puts everything in alphabetical
> order no matter what I do. Is it possible for me to change some code or
> something to keep the order the same as the structure I have on disk?
>
> Thanks again for all the help and have a nice weekend.
>
> --
> William
>
>
>