The collection has 2283 documents encoded in UTF-8.
Total size of all doc.xml files is 235.5 MB.
Total size of the archives directory (which contains PDF files
and images) is 1.97 GB.
I do not have this problem when building a smaller collection
of about 300 to 400 documents.
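For anyone who wants to report comparable figures for their own collection, here is a rough Python sketch that walks a directory tree and totals the sizes. The function name collection_stats is made up for illustration, and it assumes each document contributes exactly one file named doc.xml somewhere under the archives directory. Treat it as a starting point, not as a Greenstone tool.

```python
import os


def collection_stats(archives_dir):
    """Return (doc_count, doc_xml_bytes, total_bytes) for a directory tree.

    Assumption: one doc.xml per document, so counting doc.xml files
    approximates the number of documents in the collection.
    """
    doc_count = 0
    doc_xml_bytes = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(archives_dir):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total_bytes += size  # every file counts toward the directory total
            if name == "doc.xml":
                doc_count += 1
                doc_xml_bytes += size
    return doc_count, doc_xml_bytes, total_bytes


if __name__ == "__main__":
    import sys

    # Usage: python collection_stats.py <path to the archives directory>
    docs, xml_bytes, all_bytes = collection_stats(sys.argv[1])
    print("documents:", docs)
    print("doc.xml total: %.1f MB" % (xml_bytes / (1024.0 * 1024.0)))
    print("archives total: %.2f GB" % (all_bytes / (1024.0 ** 3)))
```

Pass it the path to the collection's archives directory; the path is taken from the command line and nothing is written to disk.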
- Windows XP SP2 (32bit)
- Pentium IV (HT enabled) 2.80GHz, 1.5GB of RAM
- 5 GB of free space on a SATA 150 HDD.
I can put the archives directory online (with the PDF files trimmed
to reduce size), but I must have your word that the files will not
be made public or used for any purpose other than debugging this
problem.
indexes Title Language text allfields
levels document section
From: Michael Dewsnip [mailto:firstname.lastname@example.org]
Sent: Thursday, January 11, 2007 2:07 AM
To: Emanuel Dejanu; email@example.com
Subject: Re: [greenstone-devel] MGPP (2.71 build problem)
Hi Emanuel, Jens,
I've tried to reproduce this problem here but have been unable to do so.
How big are your collections, and what type of documents do they contain?
Any unusual encodings? Does the problem go away if you make the collection
All the best,
Emanuel Dejanu wrote:
>After upgrading Greenstone from 2.53 to 2.71 I get the following error:
>mgpp_passes.exe : Bit buffer overrun
>and after that:
> create the weights file
>mgpp_weights_build.exe : The invf file contains skips. Unable to create
> creating 'on-disk' stemmed dictionary
>mgpp_invf_dict.exe : Unable to open
>"C:\Program Files\Greenstone\collect\unhcr\building\idx\unhcr.ii"
>At first I thought this was a problem with my modifications to
>Greenstone, but I get the same error when I build with Greenstone
>So there is a problem with the changes made to mgpp between 2.53
>and 2.71.
>I build on Windows XP SP2 with ActivePerl 5.8.8.
>Can somebody take a look at my problem?