|Date||Fri, 29 Aug 2003 15:58:25 +1200|
|Subject||Re: [greenstone-users] a question|
Yes, that is normal. Fundamentally, Greenstone treats each file as a document. Occasionally, one file is treated as many documents (such as files containing bibliographic records), and sometimes many files are treated as one document (such as an HTML file with some associated images) - but not to the extent you are after. Since HTMLPlug treats each HTML file as a separate document, websites will be split into many documents, and I'm afraid I can't see any easy way of preventing this.
Depending on the effect you are after, there might be other ways of achieving it, but you'll have to tell us a little more about what you're trying to do.
"Héctor Aracena M." wrote:
Dear friends: I've installed Greenstone Digital Library in web mode on a computer running Windows XP with an Apache server. I'm having trouble when I try to import a website into a collection I've created: GSDL doesn't import the website as a single document but as many different ones, treating every HTML file (sections and subsections of the website) as a separate document. Is that normal? Maybe I'm missing something, but I can't find anything about it in the user's guides. Any idea or tip will be much appreciated. Thanks.