Re: Re: Re: [greenstone-users] Missing documents

From John Rose
Date Thu, 04 Jan 2007 20:48:52 +0100
Subject Re: Re: Re: [greenstone-users] Missing documents

I have found another problem with my sample
collection. For Real Media files which are either
gathered into subdirectories, or which have
spaces in the filenames, GLI seems to mix up the
filename (the extracted filename is a
concatenation of the path and filename, but
without backslash separators, and with spaces
represented as %20) so that when one clicks on
the icon for the file in the browsing display
there is a message that the file cannot be found.
This problem does not occur with pdf or Word
files. If I place my rm files in the main
directory for the collection, the problem goes
away. Any comments? Thanks and best regards, John
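
To make the symptom concrete, here is a minimal Python sketch of the mangling described above. It is not Greenstone or GLI code. The function names and the sample path are hypothetical, and the assumption is only that backslashes are dropped and spaces are percent-encoded somewhere along the way. The second function shows one way a relative path could be encoded while keeping its separators.

```python
from urllib.parse import quote

def mangle(rel_path: str) -> str:
    """Reproduce the reported symptom (hypothetical helper):
    backslash separators are lost and spaces become %20."""
    return quote(rel_path.replace("\\", ""), safe="")

def encode_keeping_separators(rel_path: str) -> str:
    """Percent-encode each path component on its own (hypothetical helper),
    then rejoin with forward slashes so the directory structure survives."""
    parts = rel_path.split("\\")
    return "/".join(quote(part, safe="") for part in parts)

# Hypothetical Real Media file in a subdirectory, with a space in its name.
example = "audio\\my interview.rm"

print(mangle(example))                     # audiomy%20interview.rm
print(encode_keeping_separators(example))  # audio/my%20interview.rm
```

The first result matches what I see in the extracted filename. The second is what a link would presumably need to look like for the file to be found.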

>Date: Thu, 04 Jan 2007 09:59:49 +0100
>From: John Rose
>Subject: Re: Re: [greenstone-users] Missing documents
>Dear Shaoqun,
>All of the long filenames also had accented
>French characters (à, é, è, ù, ç), and when I
>shortened the names, some ended up without
>accented characters and these were displayed.
>Thus I think it is a problem with accented
>characters rather than with long filenames. I am
>sending separately one of the files with
>accented characters in the filename, as well as
>the collect.cfg file. When I browse on title,
>for example, only the documents without accented
>characters in the filename appear in the
>document list. I am wondering whether it is a
>problem of using Greenstone in English on a
>French version of Windows (there were similar
>problems with an earlier version of Greenstone -
>I believe with the installation procedure, but I
>don't remember exactly - which the Greenstone team fixed). Best regards, John
>At 03:30 04/01/2007, you wrote:
>>Message: 8
>>Date: Thu, 4 Jan 2007 15:28:18 +1300 (NZDT)
>>Subject: Re: [greenstone-users] Missing documents
>>Cc: John Rose
>>Content-Type: text/plain;charset=iso-8859-1
>> >> When I built the collection, I found that two of
>> >> the pdf documents were rejected (see below), and
>> >> the others seemed to be processed normally. I
>> >> believe that searching worked for the processed
>> >> documents, but when I tried to display them in
>> >> browsing classifiers, those with filenames of
>> >> more than 36 characters (but which were handled
>> >> without problems by Windows) would not display
>> >> (at least with the default VList). When I
>> >> shortened the filenames and tried again, I found
>> >> that the documents with filenames with French
>> >> accented characters would not display with the
>> >> browsing classifiers (although they apparently
>> >> did display when found by search). When I took
>> >> out the accents, all 14 are displayed normally.
>> >> Is this a bug or is there a way to get around it?
>>I tried it on our Windows machine using version 2.72, making the filename
>>longer than 36 chars and with French accented chars, and it seemed to work
>>fine, so could you send me one of those files (if possible)?
> John B. Rose
> 1 Bis, Rue des Châtre-Sacs
> 92310 Sèvres
> France

John B. Rose
1 Bis, Rue des Châtre-Sacs
92310 Sèvres