[greenstone-users] Searching problem

From Diego Spano
Date Thu Jun 10 01:06:10 2010
Subject [greenstone-users] Searching problem
In-Reply-To (4C0EB2A8-1000607-iway-na)
Hi Renate,

1- If you make a new collection with no partitions and index only the 2010 PDFs there, can you search without any problem?

2- Perhaps the regular expression you use to make the partitions didn't match 2010 files?
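
A quick way to check point 2 outside Greenstone is to run the partition patterns against a few file paths and see which ones match nothing. The following is only a minimal sketch in Python: the per-year patterns, the file paths, and the helper name `partition_for` are all hypothetical, not taken from the actual collection, and Greenstone's own filter syntax may differ.

```python
import re

def partition_for(filename, partition_patterns):
    """Return the name of the first partition whose regex matches, else None."""
    for name, pattern in partition_patterns.items():
        if re.search(pattern, filename):
            return name
    return None

# Hypothetical filters: one per year (1985-2010), each expecting a
# year-named directory somewhere in the path.
partition_patterns = {str(y): rf"(^|/){y}/" for y in range(1985, 2011)}

# Hypothetical file paths; the last one uses a different naming scheme.
files = [
    "1985/paper_0101.pdf",
    "2009/paper_0612.pdf",
    "2010-paper_0105.pdf",
]

for f in files:
    # A result of None means the file falls into no partition at all.
    print(f, "->", partition_for(f, partition_patterns))
```

Any file reported as `None` would be left out of every partition, which would explain searches returning nothing for that year even though text extraction succeeded.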

Regards

Diego


Diego Spano
Prodigio Consultores
Bernardo de Irigoyen N° 1114 2°B
Capital Federal - Argentina
Tel: (54 11) 5093-5313
www.prodigioconsultores.com


-----Original Message-----
From: greenstone-users-bounces@list.scms.waikato.ac.nz [mailto:greenstone-users-bounces@list.scms.waikato.ac.nz] On behalf of Renate Morgenstern
Sent: Tuesday, June 8, 2010 06:14 p.m.
To: undisclosed-recipients:
CC: greenstone-users@list.scms.waikato.ac.nz
Subject: [greenstone-users] Searching problem

Hi,

We have a newspaper archive in PDF files on Greenstone. To make searching the extracted text easier, we have partitioned the search indexes by year from 1985 to 2010 (26 partitions). However, only PDF files for 1985-1990, 2009 and 2010 have been added so far. The problem is that searching does not work for the 2010 files, while all the others give results. We use Lucene as the indexer.
Could there be a limit on the number of partitions that is causing this problem? When text is extracted with pdftohtml for the 2010 files, there is no error message saying that a file could not be processed.
Any idea what the problem could be?
Regards and thanks
Renate

--
Renate Morgenstern
P O Box 30664, Windhoek, Namibia
Tel/Fax: 242124
Email: rmorgenstern@iway.na