[greenstone-users] Missing documents

From John Rose
Date Wed, 03 Jan 2007 23:02:35 +0100
Subject [greenstone-users] Missing documents
Hello,

I am trying to build a simple collection with
v.2.72 under Windows XP Home Edition. I have
dragged in 16 documents (pdf, doc and rm).

When I built the collection, I found that two of
the pdf documents were rejected (see below), and
the others seemed to be processed normally. I
believe that searching worked for the processed
documents, but when I tried to display them in
browsing classifiers, those with filenames of
more than 36 characters (but which were handled
without problems by Windows) would not display
(at least with the default VList). When I
shortened the filenames and tried again, I found
that the documents with filenames with French
accented characters would not display with the
browsing classifiers (although they apparently
did display when found by search). When I took
out the accents, all 14 are displayed normally.
Is this a bug or is there a way to get around it?
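
Pending an answer, the manual workaround described above (shortening the filenames and removing the accents before importing) could be automated. The following is only a minimal sketch in Python, not part of Greenstone. The helper names safe_name and plan_renames are hypothetical, and it assumes that the 36-character limit observed above applies to the whole filename including its extension, which has not been confirmed.

```python
import unicodedata
from pathlib import Path

# Assumption: the limit seen in the browsing classifiers covers the
# whole filename, extension included. Adjust if that proves wrong.
MAX_LEN = 36


def safe_name(filename: str, max_len: int = MAX_LEN) -> str:
    """Return an ASCII-only version of filename, at most max_len characters."""
    path = Path(filename)
    stem, suffix = path.stem, path.suffix
    # Split accented letters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", stem)
    ascii_stem = decomposed.encode("ascii", "ignore").decode("ascii")
    # Fall back to a placeholder if nothing survives the ASCII filter.
    ascii_stem = ascii_stem or "document"
    # Truncate the stem so that stem + extension fits within max_len.
    keep = max(1, max_len - len(suffix))
    return ascii_stem[:keep] + suffix


def plan_renames(folder: str) -> list[tuple[str, str]]:
    """List (old, new) name pairs for files whose names would change.

    Nothing is renamed here; review the list before acting on it.
    """
    pairs = []
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file():
            new_name = safe_name(entry.name)
            if new_name != entry.name:
                pairs.append((entry.name, new_name))
    return pairs
```

Calling plan_renames on the folder holding the source documents only reports the proposed names, so name collisions after truncation can be spotted before any file is actually renamed.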

Concerning the two rejected pdf documents, one
consists only of images, whereas the other seems
to have been prepared in some sort of secure mode
(for example, one cannot cut and paste from
selected text). [Another pdf document consisting
of images, but with an active table of contents,
was processed normally.] Does anyone know how I
could easily get the missed files into the
collection with their metadata, and if possible
with the text of the rejected textual pdf file?

Thanks and best regards,


John B. Rose
1 Bis, Rue des Châtre-Sacs
92310 Sèvres
France
Email: <johnrose@alumni.caltech.edu>