Re: [greenstone-users] problem with building "large" collections [SOLVED! - not really]

From Michael Dewsnip
Date Wed, 16 Aug 2006 10:49:05 +1200
Subject Re: [greenstone-users] problem with building "large" collections [SOLVED! - not really]
In-Reply-To (44E16F4C-4030204-gmx-net)
Hi Jens,

What type of classifiers do you have in the collection? I've had a
report in the past that the Hierarchy classifier uses an unreasonable
amount of memory, and other classifiers might too.

A couple of things to try:

- Run buildcol.pl with -mode infodb on its own, just to make sure
nothing is being left in memory from the previous build phases

- Comment out the classifiers in the collection -- does it build
successfully now?
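
For anyone who wants to try the second suggestion without editing by
hand, here is a minimal sketch in Python. It assumes the classifiers sit
in the collection's collect.cfg as lines beginning with "classify", each
on a single line, and that "#" starts a comment there. The function name
is made up for illustration; it is not part of Greenstone.

```python
def comment_out_classifiers(cfg_text: str) -> str:
    """Return cfg_text with every 'classify' line commented out.

    Assumption: each classifier is declared on one line whose first
    word is 'classify'. Lines that are already commented are left alone.
    """
    out = []
    for line in cfg_text.splitlines(keepends=True):
        first_word = line.split(None, 1)[:1]  # [] for blank lines
        if first_word == ["classify"]:
            out.append("#" + line)
        else:
            out.append(line)
    return "".join(out)
```

Keep a backup of the original collect.cfg, write the result back, and
rebuild. If the build now succeeds, re-enable the classifiers one at a
time to find the one that needs the memory.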

Also, what is the size of the index/text/<collection>.ldb file in the
largest collection you've been able to build successfully?
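
A quick way to report that size, as a rough sketch: the Python helper
below lists every .ldb file under index/text of a collection directory
with its size in bytes. The function name and the example path in the
comment are assumptions for illustration only.

```python
from pathlib import Path


def ldb_sizes(collection_dir):
    """Map each index/text/*.ldb file name to its size in bytes."""
    text_dir = Path(collection_dir) / "index" / "text"
    return {p.name: p.stat().st_size for p in sorted(text_dir.glob("*.ldb"))}


# Example call (hypothetical path; adjust to your installation):
# print(ldb_sizes("/path/to/greenstone/collect/mycollection"))
```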

All the best,


jens wille wrote:

>hi michael!
>Michael Dewsnip [15.08.2006 00:46]:
>>This sounds like a format statement problem. When formatting a
>>classifier VList you need to treat classifier nodes differently
>>from leaf (document) nodes, since classifier nodes don't have any
>>metadata except Title (eg. they don't have DocOID metadata
>>because they aren't documents!).
>>Typically you use a "{If}{[numleafdocs],<Format code for
>>classifier nodes>,<Format code for document nodes>}" structure
>>for hierarchical classifiers, since "[numleafdocs]" is only
>>defined for classifier nodes.
>well, i know that, i always use it that way ;-) but this led me to
>check whether there are nodes or leaves on the first level: and those
>are nodes! reason: the default for -mingroup changed in v2.70w to 1
>instead of 2 in v2.62. since 2 was what i wanted i didn't specify a
>-mingroup parameter in my collect.cfg for this particular classifier
>- and that has made all the difference ;-)
>having solved this "minor" problem, i need to return to the original
>one: my declaring it as solved was a bit premature :-( providing
>more virtual memory did a good job for 6 volumes, but even 11 GB
>(1+10) weren't enough for 9 volumes (at least with v2.70w i now get
>an error message "Out of memory!" from this raises the
>question: how much memory does greenstone require at all?!? frankly,
>i can't believe that 11 GB shouldn't suffice for /any/ greenstone
>job ;-) however, maybe 1 GB of real RAM just isn't enough, so that
>adding any amount of swap space won't make any further difference?
>is it then possible to reduce the amount of memory required? or do
>you/does anybody have any other suggestions?
>thanks a lot so far!