[greenstone-users] Unicode and dead links

From Jim Elmborg
Date Sat, 20 Mar 2004 09:12:27 -0600
Subject [greenstone-users] Unicode and dead links
Hello All,
I'm relatively new to Greenstone and have built several libraries with
success. I'm now doing a more complex (and interesting) project and am
encountering a couple of problems I'm having trouble resolving.

I have a folder full of web pages I'm trying to make into a Greenstone
library. The pages contain information about international writers, so
many languages are represented on the pages, with all the attendant
diacritics. The pages are created with a generic template. Each has some
text, a photograph, and a link to a PDF file. The HTML files are in the main
folder, along with one folder that contains all the images and one folder
that contains the PDFs. All links to PDFs and images are relative.
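
To rule out a problem in the source files themselves, the relative links
can be checked on disk before importing. What follows is only a rough
sketch, not anything Greenstone provides: the names LinkCollector and
broken_relative_links are made up for illustration, and it assumes the
links are plain href/src attributes resolved against the folder holding
the HTML file.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class LinkCollector(HTMLParser):
    """Collects every href and src attribute value found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def broken_relative_links(html_text, base_dir):
    """Return the relative links in html_text that do not exist under base_dir.

    base_dir is assumed to be the folder that contains the HTML file.
    """
    parser = LinkCollector()
    parser.feed(html_text)
    broken = []
    for link in parser.links:
        parts = urlparse(link)
        # Skip absolute URLs, site-rooted paths, and fragment-only links.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        if parts.path.startswith("/"):
            continue
        target = os.path.join(base_dir, unquote(parts.path))
        if not os.path.exists(target):
            broken.append(link)
    return broken
```

Running this over each page in the main folder (with base_dir set to that
folder) should return an empty list if every relative link resolves in the
original files, which would suggest the links are being lost at import or
build time rather than in the source.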

I've built the collection multiple times on two machines (one Mac OS X and
one Red Hat Linux 9). I've used both the GLI and the web
collector. In every case the pages import fine and the library builds
without errors. When I go to view the collection, I have two problems:

1. None of the links to the PDF files work. The PDF files were processed
and show up as browseable and searchable in the collection, but all the
links to them from the HTML pages are broken and retrieve the standard
"Internal Link Missing" message. Am I right to assume that if the links
work in the original HTML, they should work in the final Greenstone library?
Am I right to assume links to PDF files should be retained in the final
library? If I am right, can anyone point me to my problem?

2. Diacritics display correctly in the original web pages, where they are
created using the standard HTML codes for producing umlauts, accent marks, etc.
Once processed by Greenstone, the special diacritics are all missing from
the pages along with any characters attached to them. I'm not very
experienced with Unicode and may need to learn more to make best use of

If someone could help troubleshoot these problems and/or point me to useful
resources for using Unicode in Greenstone, I'd be most appreciative.
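
For the diacritics, one way to see exactly what the pages contain is to
list the HTML codes and decode them to Unicode characters. This is a
hedged sketch that assumes the codes are character entities such as
&ouml; or &#233;. The function names are hypothetical, and it uses only
Python's standard html module, not anything from Greenstone.

```python
import html
import re

# Matches named entities (&ouml;), decimal (&#246;) and hex (&#xF6;) forms.
ENTITY = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def list_entities(text):
    """Return the distinct entities used in text, sorted alphabetically."""
    return sorted({match.group(0) for match in ENTITY.finditer(text)})


def decode_entities(text):
    """Replace entities with the Unicode characters they stand for.

    Note: html.unescape also decodes &amp;, &lt; and &gt;, so this is meant
    for inspecting text content, not for rewriting whole HTML files.
    """
    return html.unescape(text)
```

Comparing the decoded text of a source page with what the built
collection displays should show whether the characters are dropped during
import or only at display time.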

All best,
Jim Elmborg
Assistant Professor
School of Library and Information Science
3070 Main Library
The University of Iowa 52242-1420