Re: [greenstone-users] dates

From Katherine Don
Date Fri, 19 Dec 2003 15:19:40 +1300
Subject Re: [greenstone-users] dates
In-Reply-To (886EF25AF8BEF64EB89A820EF84064FF41BF8B-UCMAIL4)
Hi Linda

After much experimenting and puzzling over why your collection wasn't working, I realised what was wrong. I'm sorry, but I forgot to tell you one vital step: you need to have an index built on the Coverage metadata to get the searching working.
Add Coverage to the indexes line in the collect.cfg file and rebuild the collection, and it should all work.
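
For example, assuming the collection already indexes the document text plus Title and Date (those existing entries are only an assumption here, so keep whatever your own line says), the indexes line in collect.cfg might end up looking something like this:

    indexes document:text document:Title document:Date document:Coverage

Only the document:Coverage entry at the end is new; the rest of the line stays as it was.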

Sorry about that. Hopefully it will be fine now.
regards,
Katherine Don

"Newman, Linda (newmanld)" wrote:

Katherine -- I hope that we can wrap this up now! But, I added Coverage metadata to my metadata.xml files, and the date range search still doesn't ever find any dates.

I added the Coverage element to my format documentText statement so that I could confirm that Greenstone was picking up the Coverage data, and it is. I also experimented with having one or both the options '-extract_date' and '-extract_historical_years' with the HTMLPlug, again with no change in results.

Below is an example of a metadata.xml file. Any ideas?

*********************************
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE DirectoryMetadata SYSTEM "http://greenstone.org/dtd/DirectoryMetadata/1.0/DirectoryMetadata.dtd">
<DirectoryMetadata>
  <FileSet>
    <FileName>01000030.jpg</FileName>
    <Description>
      <Metadata name="Language" mode="accumulate">English</Metadata>
      <Metadata name="Title" mode="accumulate">Aanstoos, Theodore A.</Metadata>
      <Metadata name="Date" mode="accumulate">18850124</Metadata>
      <Metadata name="Format" mode="accumulate">image/jpeg</Metadata>
      <Metadata name="Type" mode="accumulate">birth</Metadata>
      <Metadata name="CardNum">632</Metadata>
      <Metadata name="Coverage" mode="accumulate">1885</Metadata>
    </Description>
  </FileSet>
  
  <FileSet>
    <FileName>01000040.jpg</FileName>
    <Description>
      <Metadata name="Language" mode="accumulate">English</Metadata>
      <Metadata name="Title" mode="accumulate">Aarns</Metadata>
      <Metadata name="Date" mode="accumulate">18850803</Metadata>
      <Metadata name="Format" mode="accumulate">image/jpeg</Metadata>
      <Metadata name="Type" mode="accumulate">birth</Metadata>
      <Metadata name="CardNum">4731</Metadata>
      <Metadata name="Coverage" mode="accumulate">1885</Metadata>
    </Description>
  </FileSet>
</DirectoryMetadata>
*********************************
-----Original Message-----
From: Katherine Don [mailto:kjdon@cs.waikato.ac.nz]
Sent: Wednesday, December 17, 2003 1:09 PM
To: Newman, Linda (newmanld)
Cc: 'greenstone-users@list.scms.waikato.ac.nz'
Subject: Re: [greenstone-users] dates
 
Hi Linda

Yes, having no textual documents would be a problem. (I never thought to ask what kinds of documents you were working with :-) ) The extract_historical_years option goes through the text and pulls out things that look like dates, e.g. 1999, 16th century etc.
However, the range searching works on Coverage metadata, so if you add that to your images then it should work. I think Coverage should contain just year info, like 1999, 2000 etc, one year per metadata item, and you can have many Coverage metadata elements per document/image.
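
For example, an image that relates to two years might carry something like the following in its metadata.xml entry. This is a sketch only: the file name is hypothetical, and mode="accumulate" is assumed so that the second Coverage value is added alongside the first rather than replacing it.

  <FileSet>
    <FileName>example.jpg</FileName>
    <Description>
      <Metadata name="Coverage" mode="accumulate">1885</Metadata>
      <Metadata name="Coverage" mode="accumulate">1886</Metadata>
    </Description>
  </FileSet>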

regards,
Katherine Don

"Newman, Linda (newmanld)" wrote:

Katherine -- Thank you again for your response!

I had already tried a variation with "-extract_historical_years" as an
option with the HTMLPlug, with no change in results.  However, maybe the
crux of the problem here is that I am not working with documents, but with
images.  For each image (jpgs) I have metadata information that includes a
date field, but there are no documents per se.  The date index and the
datelist classifier are both working from the date field in the metadata
files.  Is it possible to get the date range search to work from the date
field in the metadata files, rather than a date field coming from documents?
Or, can I generate documents from the metadata?