Re: How to start building collection.

Date: Thu, 10 Apr 2003 10:16:20 +0930
Subject: Re: How to start building collection.

Hi Deepak,

I do this all the time now, but it was hard to get started.

See pages 34-36 of the Developer's Guide, 'assigning metadata to a file' (Developer's Guide version 2.39).

The relevant line in collect.cfg is:

plugin         RecPlug -use_metadata_files

I did have trouble getting it to use a title from the metadata.xml file instead of extracting one, but I now know how to fix that.

I have included an example metadata.xml file and a portion of my collect.cfg file.

Good luck and happy collection building,






 Subject: How to start building collection.

Hi list,
I am building a collection for my organization that includes
MS Word documents, PDF documents, some audio files, and images.
I want to organize all of this in a systematic manner. I have read the
Greenstone Developer's Guide thoroughly, but I cannot find out
how to provide metadata about a particular
document (say title, section, and subsection for a PDF document). Can
anyone please guide me a little, or
give me a URL where I can get adequate help?


--metadata.xml- do not include this line--

<?xml version="1.0" ?>
<!DOCTYPE GreenstoneDirectoryMetadata SYSTEM "">
 <Metadata name="Title" mode="accumulate">Portugal, as colonias</Metadata>
 <Metadata name="Creator" mode="accumulate">Vasconcellos, Ernesto de.</Metadata>
 <Metadata name="Year" mode="accumulate">1929</Metadata>
 <Metadata name="Keywords" mode="accumulate">Portugal -- Colonies.</Metadata>
 <Metadata name="Title" mode="accumulate">Estatistica comercial comercio externo, ano de ... ; importacoes e exportacoes</Metadata>
 <Metadata name="Creator" mode="accumulate">Portuguese Timor. Seccao Central de Estatistica.[Corporate Author]</Metadata>
 <Metadata name="Year" mode="accumulate">1939</Metadata>
 <Metadata name="Keywords" mode="accumulate">1. Timor Timur (Indonesia) -- Commerce -- Statistics.</Metadata>

--file ends-do not include this line-
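
The listing above shows only the Metadata lines. What follows is a minimal sketch of how such lines might sit in a complete metadata.xml, assuming the DirectoryMetadata / FileSet / FileName / Description layout described in the Developer's Guide section cited earlier. The two file names are hypothetical placeholders (the real names are not shown in this message), and the DOCTYPE uses the DirectoryMetadata form rather than the GreenstoneDirectoryMetadata name used above, so check it against your Greenstone version.

```xml
<?xml version="1.0" ?>
<!DOCTYPE DirectoryMetadata SYSTEM "http://greenstone.org/dtd/DirectoryMetadata/1.0/DirectoryMetadata.dtd">
<DirectoryMetadata>
  <!-- One FileSet per document. FileName is matched as a regular
       expression against the files in the same directory. -->
  <FileSet>
    <!-- hypothetical file name, for illustration only -->
    <FileName>portugal_colonias\.pdf</FileName>
    <Description>
      <Metadata name="Title" mode="accumulate">Portugal, as colonias</Metadata>
      <Metadata name="Creator" mode="accumulate">Vasconcellos, Ernesto de.</Metadata>
      <Metadata name="Year" mode="accumulate">1929</Metadata>
      <Metadata name="Keywords" mode="accumulate">Portugal -- Colonies.</Metadata>
    </Description>
  </FileSet>
  <FileSet>
    <!-- hypothetical file name, for illustration only -->
    <FileName>estatistica_comercial\.pdf</FileName>
    <Description>
      <Metadata name="Title" mode="accumulate">Estatistica comercial comercio externo, ano de ... ; importacoes e exportacoes</Metadata>
      <Metadata name="Creator" mode="accumulate">Portuguese Timor. Seccao Central de Estatistica.[Corporate Author]</Metadata>
      <Metadata name="Year" mode="accumulate">1939</Metadata>
      <Metadata name="Keywords" mode="accumulate">1. Timor Timur (Indonesia) -- Commerce -- Statistics.</Metadata>
    </Description>
  </FileSet>
</DirectoryMetadata>
```

Grouping each record under its own FileSet is what ties a Title, Creator, Year and Keywords value to one particular document rather than to every file in the directory.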

--A PART OF my collect.cfg- do not include this line-

indexes        document:text document:Title document:Keywords document:Creator document:Year document:text,Title,Keywords,Creator
defaultindex   document:text,Title,Keywords,Creator

plugin         ZIPPlug
plugin         GAPlug
plugin         TEXTPlug
plugin         HTMLPlug
plugin         PDFPlug -get_hidden_text
plugin         ArcPlug
plugin         RecPlug -use_metadata_files

classify       AZList -metadata Title
classify       AZList -metadata Creator

collectionmeta collectionname    "East Timor Collection"
collectionmeta iconcollectionsmall /gsdl/collect/etimor/images/etimorsm.gif
collectionmeta iconcollection /gsdl/collect/etimor/images/etimor.gif
collectionmeta collectionextra   "The East Timor Collection is a part of the Arafura Digital Archive (AraDA) project. <p>This collection includes _about:numdocs_ documents and was built _about:builddate_ days ago.<br/>"

collectionmeta .document:text,Title,Keywords,Creator "all"
collectionmeta .document:text    "text"
collectionmeta .document:Title  "titles"
collectionmeta .document:Creator "authors"
collectionmeta .document:Keywords  "keywords"
collectionmeta .document:Year "Year"

format VList  '<td valign=top>[link][srcicon][/link] [link][Title][/link]</td>'
format CL1VList '<td valign=top>[link][srcicon][/link] [link][Title][/link]</td>'
format CL2VList '<td valign=top>[link][srcicon][/link] [link][Creator][/link]</td>'

--snip-end-do not include this line-