Re: [greenstone-users] Extracted Greenstone Metadata from Adobe

From Katherine Don
Date Thu, 20 Oct 2005 12:46:20 +1300
Subject Re: [greenstone-users] Extracted Greenstone Metadata from Adobe
In-Reply-To (200510141339-17032-robleyd-ozemail-com-au)
Hi David

> I was about to pose the same question; I can say that it works for me now.

> My PDF documents have comma separated lists of both author and keyword and
> as a result all the keywords, or author listings for any particular
> document are grouped together in the listing. Is it possible to explode
> the comma separated list to provide separate keyword and author listings?
We don't do this at the moment. If you know Perl you could add it in.
Otherwise we'll add it to our TODO list.
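
As a rough illustration of what exploding the list would involve, here is a minimal sketch in Python. It is not Greenstone's own Perl plugin code, and the function name explode_list is made up for this example. It assumes values are separated by commas, with an optional " and " before the last entry.

```python
import re


def explode_list(value):
    """Split a comma separated metadata value into separate entries.

    Hypothetical helper for illustration only; not part of Greenstone.
    """
    # Split on commas, and also on a standalone "and" between two entries.
    parts = re.split(r",|\s+and\s+", value)
    # Drop surrounding whitespace and any empty pieces (e.g. from ", and").
    return [part.strip() for part in parts if part.strip()]


authors = explode_list(
    "Alan Ralph, John Winston Toumbourou, Morgen Grigg, "
    "Rhiannon Mulcahy, Michael Carr-Gregg and Matthew R. Sanders"
)
for author in authors:
    print(author)
```

Each resulting entry would then be stored as its own metadata value rather than as one combined string. A split like this only suits names written as 'John Smith'; a value such as 'Smith, John' would be cut in two.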

> I guess the way author names are stored would need to be revised our end,
> as currently we use eg "Alan Ralph, John Winston Toumbourou, Morgen
> Grigg, Rhiannon Mulcahy, Michael Carr-Gregg and Matthew R. Sanders".
> Presumably this would need to be like "Ralph Alan, Toumbourou John
> Winston, ..." ??

You should put the author names the way you want them to be displayed.
The default sorting can handle both 'John Smith' and 'Smith, John'
formats (as long as the documents are in English, or at least aren't
recognised as being non-English). The second format obviously wouldn't
work if you were exploding a comma separated list.

> Another request that has been made to me is to be able to list all the
> documents by the "issue" of the journal that they appear in. The journal
> has issues like 'Vol 1 Issue 1' 'Vol 1 Issue 2' etc with a number of
> articles in each; each article is a separate PDF doc. My first thought
> would be to put the issue in the Title property of the PDF doc, and the
> actual document title in the subject; as I understand it I should then be
> able to use the ex.Title to group by issue number, and still use
> ex.Subject to create a title group.
Personally, I would put the Title of the PDF in Title, and Volume and
Issue numbers in separate fields. This is easy to do if you are using
GLI to add metadata - I guess you may be more restricted if you are
using Adobe fields?
If you had, say, Title, Volume and Issue metadata, you could do a browsing
hierarchy using GenericList, with -metadata Volume/Issue/Title.
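
To picture the kind of grouping a Volume/Issue/Title hierarchy gives, here is a small Python sketch. It only illustrates the nesting idea and is not how GenericList itself is implemented. The function name build_hierarchy and the sample records are invented for the example, and it assumes every document carries all three metadata fields.

```python
def build_hierarchy(docs, levels="Volume/Issue/Title"):
    """Nest documents by each metadata level; the last level is the leaf label.

    Hypothetical helper for illustration only; needs at least two levels.
    """
    *group_fields, leaf_field = levels.split("/")
    tree = {}
    for doc in docs:
        node = tree
        # Walk down (creating as needed) one nested dict per outer level.
        for field in group_fields[:-1]:
            node = node.setdefault(doc[field], {})
        # The innermost grouping level holds a list of leaf values.
        node.setdefault(doc[group_fields[-1]], []).append(doc[leaf_field])
    return tree


# Made-up sample records, one per PDF article.
docs = [
    {"Volume": "Vol 1", "Issue": "Issue 1", "Title": "First article"},
    {"Volume": "Vol 1", "Issue": "Issue 1", "Title": "Second article"},
    {"Volume": "Vol 1", "Issue": "Issue 2", "Title": "Third article"},
]
print(build_hierarchy(docs))
```

With separate Volume and Issue fields, the Title field stays free to hold the actual article title.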

Hope this helps,