Yes, the short answer is that dc:identifier (or identifier) should
contain a URL.
After looking at the code, the long answer appears to be:
We check the identifier for a URL (starts with http, https, ftp). If it
is one, then:
There is an option - include filetype - which defaults to doc,pdf,ppt.
We check the file extension to see if it matches one of these. If so, we
download the file.
If there is no file extension, or if the file extension is html, then we
download the page and scan through it looking for hrefs that match the
specified file extensions, and download those.
Note that I haven't tried the second case. Hopefully it works...
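To make the two cases concrete, here is a rough Python sketch of the decision described above. It is not Greenstone's code, just an illustration of the logic as I read it. The function and class names (classify, linked_documents, LinkCollector) are made up for this example, and skipping identifiers with any other extension is my assumption, since I did not check what happens in that case.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# Default value of the "include filetype" option, as described above.
DEFAULT_FILETYPES = ("doc", "pdf", "ppt")


def extension(url):
    """Return the lowercase file extension of the URL path, without the dot."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1].lstrip(".").lower()


def classify(identifier, filetypes=DEFAULT_FILETYPES):
    """Decide what to do with an identifier: 'download', 'scan' or 'skip'."""
    if not identifier.lower().startswith(("http://", "https://", "ftp://")):
        return "skip"  # not a URL, so there is nothing to fetch
    ext = extension(identifier)
    if ext in filetypes:
        return "download"  # first case: the URL points at the document itself
    if ext in ("", "html"):
        return "scan"  # second case: fetch the page and look for links
    return "skip"  # assumption: other extensions are ignored


class LinkCollector(HTMLParser):
    """Collect href values whose extension is one of the wanted filetypes."""

    def __init__(self, base_url, filetypes):
        super().__init__()
        self.base_url = base_url
        self.filetypes = filetypes
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value and extension(value) in self.filetypes:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def linked_documents(page_url, html, filetypes=DEFAULT_FILETYPES):
    """Return absolute URLs of matching documents linked from an HTML page."""
    collector = LinkCollector(page_url, filetypes)
    collector.feed(html)
    return collector.links
```

For example, classify("http://example.org/report.pdf") gives "download", while classify("http://example.org/docs/") gives "scan", after which linked_documents would be run on the downloaded page to find the files to fetch.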
Brad Rhoads wrote:
> Hi Everyone,
> I'm just getting started with a digital library project. I'm
> developing a system focused on metadata entry, including user security
> and work flow. The intention is to use this system to create
> Greenstone collections. If anyone's interested, a very rough prototype
> is at http://flashlit.serveall.net/flashlit. Now that I've introduced
> myself a bit, here's my question.
> I was excited to see that in the Greenstone OAI import there's a get
> document checkbox. The OAI-PMH standard only addresses metadata, not
> the transfer of the actual document. So my question is: what are
> the requirements for Greenstone to be able to download a document? My
> guess is that it needs a URL to the document in the dc.identifier
> field. Is that correct? Any other details?
> Brad Rhoads
> MAF Learning Technologies
> greenstone-users mailing list