About this collection

This is a basic image collection that contains no text and no explicit metadata. Creating it requires nothing more than placing several JPEG files in the import directory before importing and building the collection.

The images in this collection have been produced by members of the Department of Computer Science, University of Waikato. The University of Waikato holds copyright. The images may be distributed freely, without any restrictions.

How the collection works

Here is a sample document in the collection. The configuration file specifies no indexes, so the search button is suppressed.

There is only one plugin, ImagePlug, aside from the three that are always present (GAPlug, ArcPlug, RecPlug). ImagePlug relies on the existence of two programs from the ImageMagick suite (http://www.imagemagick.org): convert and identify. Greenstone will not be able to build the collection correctly unless ImageMagick is installed on your computer.
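
Because the build depends on convert and identify, it can be worth checking that both are reachable before importing. The following is a minimal sketch, not part of Greenstone: the function name missing_imagemagick_tools is hypothetical, and it assumes the programs are located through the PATH environment variable.

```python
import shutil


def missing_imagemagick_tools(tools=("convert", "identify")):
    """Return the names of the given programs that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_imagemagick_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("convert and identify were both found on PATH")
```

A PATH hit only shows that a program with that name exists. On Windows, for instance, an unrelated system utility is also called convert, so the check is a hint rather than proof that ImageMagick is installed.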

ImagePlug automatically creates a thumbnail and generates the following metadata for each image in the collection:

Image          Name of file containing the image
ImageWidth     Width of image (in pixels)
ImageHeight    Height of image (in pixels)
Thumb          Name of GIF file containing thumbnail of image
ThumbWidth     Width of thumbnail image (in pixels)
ThumbHeight    Height of thumbnail image (in pixels)
thumbicon      Full pathname specification of thumbnail image
assocfilepath  Pathname of image directory in the collection's assoc directory
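
To make the role of the two ImageMagick programs concrete, here is a rough sketch of how the size metadata and a thumbnail could be obtained from them. It is not ImagePlug's actual code: the function names and the 100-pixel thumbnail bound are assumptions made for illustration. The command-line options used (identify -format with %w and %h, and convert -resize) are standard ImageMagick options.

```python
import subprocess


def identify_command(image_path):
    """Command that prints an image's width and height in pixels."""
    return ["identify", "-format", "%w %h", image_path]


def thumbnail_command(image_path, thumb_path, max_size=100):
    """Command that scales an image to fit in a max_size x max_size box.

    max_size=100 is an assumed value, not taken from ImagePlug.
    """
    return ["convert", image_path, "-resize", f"{max_size}x{max_size}", thumb_path]


def parse_dimensions(identify_output):
    """Turn identify output such as '640 480' into (640, 480)."""
    width, height = identify_output.split()[:2]
    return int(width), int(height)


def image_size(image_path):
    """Run identify on a file (requires ImageMagick to be installed)."""
    result = subprocess.run(identify_command(image_path),
                            capture_output=True, text=True, check=True)
    return parse_dimensions(result.stdout)
```

Under these assumptions, the result of image_size would supply ImageWidth and ImageHeight, and running it again on the file written by thumbnail_command would supply ThumbWidth and ThumbHeight.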

The image is stored as an "associated file" in the assoc subdirectory of the collection's index directory. (Index is where all files necessary to serve the collection are placed, to make it self-contained.) The pathname _httpcollimg_, which is the same as _httpcollection_/index/assoc, refers to this directory. For any document, its thumbnail and image are both in a subdirectory whose filename is given by assocfilepath. The metadata element thumbicon is set to the full pathname specification of the thumbnail image, and can be used in the same way as srcicon (see the MSWord and PDF demonstration collection).
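
The way these pieces combine into a location can be sketched as follows. This only illustrates the path composition described above; the helper names are hypothetical, and the assocfilepath and file names in the example are invented, since the real values are assigned when the collection is built.

```python
def join_url(base, *parts):
    """Join URL path segments with single slashes, keeping base's leading slash."""
    pieces = [base.rstrip("/")] + [p.strip("/") for p in parts if p]
    return "/".join(pieces)


def assoc_locations(httpcollection, assocfilepath, image, thumb):
    """Return (image URL, thumbnail URL) for one document."""
    # _httpcollimg_ is the same as _httpcollection_/index/assoc
    httpcollimg = join_url(httpcollection, "index", "assoc")
    doc_dir = join_url(httpcollimg, assocfilepath)
    return join_url(doc_dir, image), join_url(doc_dir, thumb)


if __name__ == "__main__":
    # "HASH0123.dir", "Bear.jpg" and "Bear.gif" are made-up example values.
    image_url, thumb_url = assoc_locations(
        "/gsdl/collect/image-e", "HASH0123.dir", "Bear.jpg", "Bear.gif")
    print(image_url)
    print(thumb_url)
```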

The second format statement in the configuration file, DocumentText, dictates how the document will appear, and this is the result. There is no document text (if there were, it could be produced by [Text]). What is shown is the image itself, along with some metadata extracted from it.

The configuration file specifies one classifier, an AZList based on Image metadata, shown here (Greenstone has suppressed the alphabetic selector because this collection has only a few images). The format statement shows the thumbnail image along with some metadata. (Any other classifiers would have the same format, since this statement does not name the classifier.)

You may wonder why the thumbnail image is generated and stored explicitly, when the same effect could be obtained by using the original image and scaling it:

   <td>[link]<img src='/gsdl/collect/image-e/index/assoc/[assocfilepath]/[Image]'
       width=[ThumbWidth] height=[ThumbHeight]>[/link]</td>
   <td valign=middle><i>[Title]</i></td>

The reason is to save communication bandwidth by not sending large images when small ones would do.

For a more comprehensive image collection, see the kiwi aircraft images in the New Zealand Digital Library. The structure of this collection is quite different, however: it is a collection of web pages that include many images along with the text. The HTML plugin HTMLPlug also processes image files, but it does so in a different way from ImagePlug (for example, it does not produce the metadata described above). In fact, this is one of the few situations where the ordering of plugins in the collection configuration file makes a difference. If both plugins were included, images would be processed by whichever came first in the configuration file.
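
The effect of plugin ordering can be illustrated with a simplified model in which each file is offered to the plugins in configuration order and the first one that recognises it takes it. This is a sketch of the idea only, not Greenstone's implementation: the Plugin class and the filename patterns below are assumptions chosen for the example.

```python
import re


class Plugin:
    """A toy plugin that claims files whose names match a pattern."""

    def __init__(self, name, pattern):
        self.name = name
        self.pattern = re.compile(pattern, re.IGNORECASE)

    def claims(self, filename):
        return self.pattern.search(filename) is not None


def first_claiming_plugin(plugins, filename):
    """Return the name of the first plugin, in list order, that claims the file."""
    for plugin in plugins:
        if plugin.claims(filename):
            return plugin.name
    return None


# Hypothetical patterns: both plugins are willing to take image files.
image_plug = Plugin("ImagePlug", r"\.(jpe?g|gif|png)$")
html_plug = Plugin("HTMLPlug", r"\.(html?|jpe?g|gif|png)$")

if __name__ == "__main__":
    print(first_claiming_plugin([image_plug, html_plug], "kiwi.jpg"))
    print(first_claiming_plugin([html_plug, image_plug], "kiwi.jpg"))
```

In this model the same image file is handled by ImagePlug under the first ordering and by HTMLPlug under the second, while an HTML page goes to HTMLPlug either way.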

Another example of a more comprehensive image collection is Gordon Paynter's Pictures of the world. This is like the present collection in that the target documents are images rather than HTML files, but more extensive metadata is associated with each image (using metadata.xml files).

How to find information in the Simple image collection

  • browse documents