[greenstone-devel] RE: [greenstone-users] Re: Parallel version of greenstone

From Stephen De Gabrielle
Date Tue Mar 22 02:46:04 2011
Subject [greenstone-devel] RE: [greenstone-users] Re: Parallel version of greenstone
In-Reply-To (059A58FF9AA04A308D4B4E9D859201CE-DSC46)
What an interesting idea. This is in the literature, but I'm not aware of
*any* digital library system actually doing this (unless you count Google
et al. as digital library systems).

I believe GS3 is using the same collection building code as GS2, so I don't
think you will have any luck there.

I think I have this right:
- As long as you let Greenstone use its hash function to allocate
the unique ID, there is no reason why you couldn't run part of the import on
as many machines as you can, by allocating a subset of the documents needing
import to each machine, then merging the resulting folders and files.
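
A rough sketch of the splitting and merging steps, in Python. Everything
here is an assumption for illustration: the function names are made up, the
documents are just relative file paths, and nothing in it calls Greenstone
itself. It only shows one way to hand each machine a stable subset of
documents and then combine the per-machine output folders.

```python
import hashlib
import shutil
from pathlib import Path


def assign_shard(relative_path, num_machines):
    """Map a document path to a machine index in a stable, repeatable way."""
    digest = hashlib.sha256(relative_path.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_machines


def partition_documents(paths, num_machines):
    """Split document paths into one list per machine (hypothetical helper)."""
    shards = [[] for _ in range(num_machines)]
    for path in sorted(paths):
        shards[assign_shard(path, num_machines)].append(path)
    return shards


def merge_outputs(shard_dirs, merged_dir):
    """Copy each machine's output tree into one folder, refusing to overwrite."""
    merged = Path(merged_dir)
    merged.mkdir(parents=True, exist_ok=True)
    for shard_dir in shard_dirs:
        for src in Path(shard_dir).rglob("*"):
            if src.is_file():
                dest = merged / src.relative_to(shard_dir)
                if dest.exists():
                    # Two machines wrote the same file; this needs a human.
                    raise FileExistsError(f"collision on {dest}")
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)
```

Hashing the path (rather than dealing documents out in turn) means a
document always lands on the same machine if the job is rerun after a
failure. The merge deliberately stops on any filename collision: any
per-run index or database files the import writes would need to be
combined properly rather than copied, and that part is not covered here.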

Building the indexes would still have to happen on a single machine as far
as I know, but the build script can run only specific phases, so you MAY
be able to split up some of the work.

This is all likely to require a fair bit of hands-on work; it may be quicker
to pick a quiet machine and let it run all week.

Stephen


On Thu, Feb 17, 2011 at 12:57 PM, Diego Spano <dspano@anac.gov.ar> wrote:

> Hi Thomas, now I understand.
>
> I think that GS 2 is not a parallel application but GS 3 is modular, so you
> can run different processes on different machines. I'm cc'ing this email to
> the GS devel-list; perhaps they can help you more.
>
> Regards!
>
> Diego
>
> ------------------------------
> *From:* greenstone-users-bounces@list.scms.waikato.ac.nz [mailto:
> greenstone-users-bounces@list.scms.waikato.ac.nz] *On behalf of* Thomas
> Kebede
> *Sent:* Tuesday, 08 February 2011 11:32
> *To:* greenstone-users@list.scms.waikato.ac.nz
> *Subject:* [greenstone-users] Re: Parallel version of greenstone
>
> Dear Diego,
>
> Thank you for your response.
>
> What I wanted to know is whether Greenstone has a parallel version, i.e. as you
> know software can be divided into sequential programs (most prevalent) and
> parallel programs (parallel computer programs).
>
> Here we are building our own digital library using Greenstone and the time
> it takes to build the whole collection is so long that a single power
> failure is causing us lots of problems.
>
> A neighboring faculty has a high performance computational cluster and we
> wish to use the power of the cluster to do our job. But for that to happen,
> we must be able to get the parallel and not sequential type of Greenstone.
>
> This is where I need your assistance.
>
> Regards,
> Thomas
>
> On Tue, Feb 8, 2011 at 12:14 AM, Thomas Kebede <thomas.kebede@gmail.com> wrote:
>
>> Hello,
>>
>> Is it possible to get a parallel version of the greenstone software?
>>
>> Thanks
>>
>


--
Stephen De Gabrielle
stephen.degabrielle@acm.org
Telephone +44 (0)20 85670911
Mobile +44 (0)79 85189045
http://www.degabrielle.name/stephen