How to create Image collections

From Gordon Paynter
Date Thu, 23 May 2002 10:28:22 -0700
Subject How to create Image collections
Hi all, here's an explanation of how I created my image collections.

Overview:

I use ImagePlug, RecPlug and metadata.xml files to generate image
collections (called "album" and "pictures") from photographs I have
taken with my digital camera.

ImagePlug imports the pictures and makes thumbnail images and
"screenview" images automatically from each image; it also calculates
"implicit" metadata (e.g. Image Size, Image type, Thumbnail size &
type, and so on).
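
As a rough illustration of the size arithmetic behind a "screenview"
image, here is a minimal Python sketch. It assumes the picture is scaled
to fit inside a square box while keeping its aspect ratio, and that
pictures which already fit are left alone. The function name fit_within
is made up for this example; it is not part of ImagePlug, which hands
the real resizing work to ImageMagick.

```python
def fit_within(width, height, max_side):
    """Return (new_width, new_height) scaled to fit a max_side x max_side box.

    Assumption: the aspect ratio is preserved and images that already fit
    are returned unchanged. This is an illustration, not ImagePlug code.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to whole pixels, never dropping below one pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # A 1600x1200 photo with a 512-pixel screenview box.
    print(fit_within(1600, 1200, 512))
```

The same function could be used with a smaller box to estimate
thumbnail dimensions.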

The "explicit" metadata (including Title, Photographer, Copyright,
Date, People) is provided in metadata.xml files, which I create with a
hacked-together Perl CGI program (simple and dangerous: it requires
write access to the import directory, and can mess up your data). I
can distribute this if anyone needs it, but it is Linux-only, and I
suspect there are better metadata editors out there.


How I made the album collection:

I created the collections by adding new materials in a structured tree
in the import directory, and adding metadata recursively in
metadata.xml files. (Of course, I turn on the RecPlug
-use_metadata_files option.)

Basically, the top-level import directory has a metadata file with a
FileSet element associating default metadata with every file in that
directory (and, recursively, every subdirectory): these include Date,
Place, Photographer, Rights (the copyright holder) and Title, as
follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE GreenstoneDirectoryMetadata SYSTEM
"http://greenstone.org/dtd/GreenstoneDirectoryMetadata/1.0/GreenstoneDirectoryMetadata.dtd">
<DirectoryMetadata>
<FileSet>
<FileName>.*</FileName>
<Description>
<Metadata name="Date" >11111111</Metadata>
<Metadata name="Place" >Unknown</Metadata>
<Metadata name="Photographer" >Gordon Paynter</Metadata>
<Metadata name="Rights" >Copyright 1995-2002 Gordon Paynter</Metadata>
<Metadata name="Title" >Untitled Photograph</Metadata>
</Description>
</FileSet>
</DirectoryMetadata>
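
For anyone who would rather script these files than edit them by hand,
the following Python sketch writes a file of the same shape using only
the standard library. It is not the Perl CGI editor mentioned earlier;
the helper names build_directory_metadata and to_xml_text are invented
for this example, and the element names are simply copied from the
listing above.

```python
import xml.etree.ElementTree as ET


def build_directory_metadata(pattern, fields):
    """Build a DirectoryMetadata tree containing one FileSet.

    pattern: regular expression placed in <FileName>
    fields:  list of (name, value) pairs, one <Metadata> element each
    """
    root = ET.Element("DirectoryMetadata")
    fileset = ET.SubElement(root, "FileSet")
    ET.SubElement(fileset, "FileName").text = pattern
    description = ET.SubElement(fileset, "Description")
    for name, value in fields:
        ET.SubElement(description, "Metadata", {"name": name}).text = value
    return root


def to_xml_text(root):
    """Serialise the tree, prefixed by the declaration and DOCTYPE."""
    header = (
        '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
        "<!DOCTYPE GreenstoneDirectoryMetadata SYSTEM\n"
        '"http://greenstone.org/dtd/GreenstoneDirectoryMetadata/1.0/'
        'GreenstoneDirectoryMetadata.dtd">\n'
    )
    return header + ET.tostring(root, encoding="unicode") + "\n"


if __name__ == "__main__":
    defaults = [
        ("Date", "11111111"),
        ("Place", "Unknown"),
        ("Photographer", "Gordon Paynter"),
        ("Rights", "Copyright 1995-2002 Gordon Paynter"),
        ("Title", "Untitled Photograph"),
    ]
    print(to_xml_text(build_directory_metadata(".*", defaults)))
```

The output is written on one line rather than indented, which should
make no difference to an XML parser.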

Photographs are stored in subdirectories of import. In the
subdirectories, I add metadata (overwriting the defaults) as
appropriate. I also add "People" metadata in "accumulate" mode so I
have a record of who is in each picture, and I make sure that every
image has "Date" and "Place" metadata (as much as I like Dublin Core,
"Place" is easier to type than "Coverage.Location").

Generally, I have all the files about a particular topic, place or day
in a new directory tree, and I take advantage of recursion and regular
expressions to minimise the amount of metadata I have to add.
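
The combination of defaults, overrides and "accumulate" mode can be
pictured with a small Python model. This is only a sketch of the
behaviour described above, not Greenstone's implementation: the
function resolve_metadata, the dictionary layout, the file names and
the people are all hypothetical, and it assumes a FileName pattern has
to match the whole file name.

```python
import re


def resolve_metadata(filename, filesets):
    """Collect metadata for filename from a list of FileSet-like dicts.

    filesets is ordered from the top-level import directory down to the
    file's own directory. Each entry looks like:
        {"pattern": r".*", "metadata": [(name, value, mode), ...]}
    where mode is "override" (replace earlier values) or "accumulate"
    (append to them). Assumption: patterns match the whole file name.
    """
    resolved = {}
    for fileset in filesets:
        if not re.fullmatch(fileset["pattern"], filename):
            continue
        for name, value, mode in fileset["metadata"]:
            if mode == "accumulate":
                resolved.setdefault(name, []).append(value)
            else:
                resolved[name] = [value]
    return resolved


if __name__ == "__main__":
    top_level = {"pattern": r".*", "metadata": [
        ("Title", "Untitled Photograph", "override"),
        ("Place", "Unknown", "override"),
    ]}
    # Hypothetical subdirectory entries; names are made up.
    whole_directory = {"pattern": r".*", "metadata": [
        ("Place", "Rotorua", "override"),
        ("People", "Alice", "accumulate"),
    ]}
    one_picture = {"pattern": r"lake\.jpg", "metadata": [
        ("Title", "At the lake", "override"),
        ("People", "Bob", "accumulate"),
    ]}
    filesets = [top_level, whole_directory, one_picture]
    print(resolve_metadata("lake.jpg", filesets))
    print(resolve_metadata("other.jpg", filesets))
```

In this model a picture with no entry of its own still picks up the
directory-wide Place and People values, which is the saving that
recursion and regular expressions are meant to give.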


The collection configuration file:

I've included my collection configuration file below. In these
collections, searching isn't much good because there isn't much text
available to search, so the classifiers are the main access point.

Note 1: One of the worst things about the collect.cfg is the format
strings: they are unreadable because they have to be on
exactly one line. I have a bit of a hack for getting around
this, which I will explain in another email (that's why
there's a "DO NOT EDIT BEYOND THIS POINT" line in there; you
can delete that line and edit it if you like).

Note 2: The block_exp parameter to ImagePlug is used to skip over
backup files made by my metadata editor. UnknownPlug is used
to import QuickTime movies (though the format strings do not
display them properly yet).

Note 3: I have recently made a few changes to ImagePlug so that it
always gets Image dimensions & type correct (the old version
had trouble because different versions of ImageMagick were
outputting different messages). I can't remember if I've
checked it in to Greenstone's CVS; I'll make sure I do it
before the next Greenstone release.

Note 4: This is the collect.cfg for my private "album" collection.
The file for my public "pictures" collection is less complex
(it has no "People" metadata). (Long story.)
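
On Note 1: one possible way to live with the one-line rule is to keep a
readable, multi-line copy of the format statements and collapse it
mechanically before building. The Python sketch below does that by
treating every blank-line-separated block as a single statement. It is
an assumption-laden illustration and not the hack promised for the
follow-up email; the function names are invented, and it squeezes every
run of whitespace inside a block down to one space.

```python
import re


def collapse_block(block):
    """Join one multi-line block into a single line, squeezing whitespace."""
    return " ".join(block.split())


def collapse_blocks(text):
    """Treat each blank-line-separated block as one statement on one line."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(collapse_block(block) for block in blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical readable source for two format statements.
    readable = """
format DocumentHeading ""

format CL1VList '<td valign="top">
    [link][Title][/link]
</td>'
"""
    print(collapse_blocks(readable), end="")
```

Because whole blocks are joined, this should only be run over the part
of the file that holds format statements, with a blank line between
each one.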


----- begin collect.cfg -----
creator gordon.paynter@ucr.edu
maintainer gordon.paynter@ucr.edu
public false

indexes document:Title document:Title,Place document:Place
defaultindex document:Title,Place

plugin GAPlug
plugin ImagePlug -screenviewsize 512 -block_exp '.*.backup'
plugin UnknownPlug -process_exp '.MOV' -assoc_field 'movie' -file_type 'misc/misc'
plugin ArcPlug
plugin RecPlug -use_metadata_files


classify AZList -metadata Title
classify AZCompactList -metadata People -mingroup 1
classify Hierarchy -hfile Place.txt -metadata Place
classify DateList -bymonth

collectionmeta collectionname "Gordon's Digital Photo Album"
collectionmeta iconcollection "/gsdl/collect/album/logo/album.jpg"
collectionmeta collectionextra "Photographs by Gordon Paynter"

collectionmeta .document:Title "titles"
collectionmeta .document:Source "filenames"
collectionmeta .document:Place "places"
collectionmeta .document:Subject "subjects"
collectionmeta .document:Title,Place "titles or places"

format DocumentButtons ""
format DocumentHeading ""

#### DO NOT EDIT BEYOND THIS POINT ####

format SearchVList '<td valign="top" align="center"> {If}{[Image],
[link]<img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">[/link],
[link][icon][/link] } </td> <td> <font face="Arial">
<b>[link][Title][/link]</b> {If}{[Image], <br /><a
href="/gsdlmod?c=gsarch&a=q&h=dtp&q=[Place]">[Place]</a>, [Date] <br
/>[ImageWidth]x[ImageHeight] pixels } {If}{[movie], <br />Quicktime
movie! <br />[movie] } </font> </td> <td align="right"> {If}{[Date],
<font face="Arial"><b>[Date]</b></font>, &nbsp; } </td>'

format CL2VList '<td valign="top" align="center"> {If}{[Image],
[link]<img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">[/link],
[link][icon][/link] } </td> <td> <font face="Arial">
<b>[link][Title][/link]</b> {If}{[Image], <br /><a
href="/gsdlmod?c=gsarch&a=q&h=dtp&q=[Place]">[Place]</a>, [Date] <br
/>[ImageWidth]x[ImageHeight] pixels } {If}{[Movie], <br />Quicktime
Movie } </font> </td> {If}{[Date], <td align="right"> <font
face="Arial"> <b>[Date]</b> </font> </td> }'

format DocumentText '<font face="Arial"> <table width="100%"
border="0"> <tr> <td align="center"> {If}{[movie], <b>Movie:</b>
/gsdl/collect/album/index/assoc/[assocfilepath]/[movie] }
{If}{[Image], <table border="0"> <tr> <td align="center" colspan="2"
background="/gsdl/collect/album/logo/top.gif"> <b>[Title]</b>
</td> </tr> <tr> <td width="120" align="right">Photographer:</td>
<td>[Photographer]</td> </tr> <tr> <td align="right">&nbsp;</td> <td
align="left">[Rights]</td> </tr> <tr> <td align="right">Location:</td>
<td><a href="/gsdlmod?c=gsarch&a=q&h=dtp&q=[Place]">[Place]</a></td>
</tr> <tr> <td align="right">Date:</td> <td>[Date]</td> </tr> <tr> <td
align="right">Size:</td> <td>[ImageWidth]x[ImageHeight] pixels</td>
</tr> <tr> <td colspan="2" align="center"> <img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Screen]"
width="[ScreenWidth]" height="[ScreenHeight]" /> </td> </tr> <tr> <td
colspan="2" align="center"
background="/gsdl/collect/album/logo/bottom.gif"> <font
face="Arial" color="black"> <b> <a
href="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">Thumbnail</a>
&nbsp;&nbsp;<a
href="/gsdl/collect/album/index/assoc/[assocfilepath]/[Screen]">Small
Image</a> &nbsp;&nbsp;<a
href="/gsdl/collect/album/index/assoc/[assocfilepath]/[Image]">Full
Image</a> </b> </font> </td> </tr> </table> } </td> </tr> </table>
</font>'

format CL4DateList '<td valign="top" align="center"> {If}{[Image],
[link]<img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">[/link],
[link][icon][/link] } </td> <td> <font face="Arial">
[link][Title][/link] {If}{[Image], <br />[Place], [Date]<br
/>[ImageWidth]x[ImageHeight] pixels } {If}{[Movie], <br />Quicktime
Movie } </font> </td> {If}{[Date], <td valign="top" align="right">
<font face="Arial"><b>[Date]</b></font> </td> }'

format CL3VList '<td valign="top" align="center"> {If}{[Image],
[link]<img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">[/link],
[link][icon][/link] } </td> <td> <font face="Arial">
<b>[link][Title][/link]</b> {If}{[Image], <br /><a
href="/gsdlmod?c=gsarch&a=q&h=dtp&q=[Place]">[Place]</a>, [Date] <br
/>[ImageWidth]x[ImageHeight] pixels } {If}{[Movie], <br />Quicktime
Movie } </font> </td> {If}{[Date], <td align="right"> <font
face="Arial"> <b>[Date]</b> </font> </td> }'

format CL1VList '<td valign="top" align="center"> [link]{If}{[Image],
<img
src="/gsdl/collect/album/index/assoc/[assocfilepath]/[Thumb]">,
movie }[/link] </td> <td> <font face="Arial">
<b>[link][Title][/link]</b> {If}{[Image], <br /><a
href="/gsdlmod?c=gsarch&a=q&h=dtp&q=[Place]">[Place]</a>, [Date] <br
/>[ImageWidth]x[ImageHeight] pixels } {If}{[movie], <br />Quicktime
movie!<br />[movie] } </font> </td> <td align="right"> {If}{[Date],
<font face="Arial"><b>[Date]</b></font>, &nbsp; } </td>'

----- end collect.cfg -----
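
To give a feel for what a by-month date classifier has to do with the
Date metadata, here is a small Python sketch that buckets pictures by
year and month. It assumes Date values are eight digits in yyyymmdd
order, which is what the "11111111" default suggests; the function name
group_by_month and the sample titles are hypothetical, and this is not
the DateList code itself.

```python
from collections import defaultdict


def group_by_month(dated_titles):
    """Group (date, title) pairs by the yyyymm prefix of an 8-digit date."""
    groups = defaultdict(list)
    for date, title in dated_titles:
        if len(date) != 8 or not (date.isascii() and date.isdigit()):
            raise ValueError(f"expected yyyymmdd, got {date!r}")
        groups[date[:6]].append(title)
    # Sort the buckets by month and the titles within each bucket.
    return {month: sorted(titles) for month, titles in sorted(groups.items())}


if __name__ == "__main__":
    # Made-up pictures; only the date format matters here.
    pictures = [
        ("20020523", "Campus"),
        ("20020501", "Garden"),
        ("20011225", "Beach"),
    ]
    for month, titles in group_by_month(pictures).items():
        print(month, titles)
```

A check like this also makes it easy to spot pictures whose Date was
typed in some other format.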

This is what the Place.txt file (used to create the hierarchical
classifier based on Place metadata) looks like:

----- begin Place.txt -----
"New Zealand" 1 "New Zealand"
"Auckland" 1.1 "Auckland"
"Waikato" 1.2 "Waikato"
"Cambridge" 1.2.1 "Cambridge"
"Hamilton" 1.2.2 "Hamilton"
"Bay of Plenty" 1.3 "Bay of Plenty"
"Opotiki" 1.3.1 "Opotiki"
"Rotorua" 1.3.2 "Rotorua"
"Taneatua" 1.3.3 "Taneatua"
"Waimana" 1.3.4 "Waimana"
"East Cape" 1.4 "East Cape"
"Tologa Bay" 1.4.1 "Tologa Bay"
"Hicks Bay" 1.4.2 "Hicks Bay"
"Te Araroa" 1.4.3 "Te Araroa"
"Whanarua Bay" 1.4.4 "Whanarua Bay"
"Motu River" 1.4.5 "Motu River"
"Maraenui" 1.4.6 "Maraenui"
"Torere" 1.4.7 "Torere"
"Opape" 1.4.8 "Opape"
"Hawke Bay" 1.5 "Hawke Bay"
"Napier" 1.5.1 "Napier"
"Hastings" 1.5.2 "Hastings"
"Wairoa" 1.5.3 "Wairoa"
"Mahia Peninsula" 1.5.4 "Mahia Peninsula"
"Taupo" 1.6 "Taupo"
"Otago" 1.7 "Otago"
"Otago Peninsula" 1.7.1 "Otago Peninsula"
"Dunedin" 1.7.2 "Dunedin"
"Southland" 1.8 "Southland"
"The Catlins" 1.8.1 "The Catlins"
"Invercargil" 1.8.2 "Invercargil"
"USA" 2 "United States of America"
"Big Bear Lake" 2.1 "Big Bear Lake, CA"
"Forest Falls" 2.2 "Forest Falls, CA"
"Riverside" 2.3 "Riverside, CA"
"San Francisco" 2.4 "San Francisco, CA"
"San Marino" 2.5 "San Marino, CA"
"Roanoke" 2.6 "Roanoke, VA"
"England" 3 "England"
"Bath" 3.1 "Bath"
"Burgess Hill" 3.2 "Burgess Hill"
"London" 3.3 "London"
----- end Place.txt -----
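
A hierarchy file like this is easy to get subtly wrong by hand, so a
small checker can help. The Python sketch below assumes each line has
exactly three fields (a quoted metadata value, a dotted position and a
quoted display title) and reports positions that are used twice or
whose parent is missing. The function names are invented for this
example, and the checks are guesses at what is worth verifying rather
than rules taken from Greenstone.

```python
import shlex
from collections import Counter


def parse_hfile(text):
    """Parse hierarchy lines into (key, position, title) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = shlex.split(line)  # honours the double-quoted fields
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields: {line!r}")
        key, position, title = parts
        entries.append((key, position, title))
    return entries


def check_positions(entries):
    """Return a list of problems: duplicate positions and missing parents."""
    problems = []
    counts = Counter(position for _, position, _ in entries)
    for position, count in counts.items():
        if count > 1:
            problems.append(f"position {position} used {count} times")
    known = set(counts)
    for _, position, _ in entries:
        parent = position.rpartition(".")[0]
        if parent and parent not in known:
            problems.append(f"position {position} has no parent {parent}")
    return problems


if __name__ == "__main__":
    sample = '''
"New Zealand" 1 "New Zealand"
"Auckland" 1.1 "Auckland"
"Waikato" 1.2 "Waikato"
"Cambridge" 1.2.1 "Cambridge"
'''
    print(check_positions(parse_hfile(sample)))
```

To check a real file, read its contents (without the begin and end
marker lines) and pass the text to parse_hfile.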


hth.
Gordon.