RE: [greenstone-users] RE: Problems rebuilding a collection (schild)

From: Tran
Date: Wed, 23 Mar 2005 17:43:37 +0700
Subject: RE: [greenstone-users] RE: Problems rebuilding a collection (schild)
In-Reply-To: (4240952E-2000507-cs-waikato-ac-nz)

Hi Michael,
I did not mean to criticize the GS program. If it came across that way,
please excuse me. What you have done is great and your staff are very
helpful, so many thanks to all of you.

We are preparing a digital library project using GSDL for budget
approval. At this point, only one librarian with IT experience is involved:
me. As a result, some of my questions may seem naive or critical.

What you said in your first point is exactly what we are trying now. We don't
have many problems with small test collections, but problems arise when we
try to put 40/50 files into the collections. The cause is understandable:
our files come from many sources, and some are in formats that are not
currently supported by the existing plug-ins.

Regards,

-----Original Message-----
From: Michael Dewsnip [mailto:mdewsnip@cs.waikato.ac.nz]
Sent: Wednesday, March 23, 2005 4:59 AM
To: Tran
Cc: greenstone-users@list.scms.waikato.ac.nz
Subject: Re: [greenstone-users] RE: Problems rebuilding a collection
(schild)

Hi,

There are a lot of issues involved here. We find that the way the GLI
rebuilds collections is rarely a problem -- you just have to understand what
it is doing and remember a few simple tricks:

- Get the design of your collection right *before* you add all your source
documents. It's tempting to just add all the source documents and then play
around with the collection until it is right, but it's much more efficient
to sit down and think about the collection, get it working with just a few
documents, then add the bulk of the documents afterwards.
- Failing that, use the "maxdocs" option when testing changes to the
collection design to rebuild the collection with just a subset of the
documents.
- As John pointed out, changes to format statements do not require
re-importing or re-building. In the GLI, after changing the format
statements go to the Create pane and click Preview Collection to immediately
see the changes.
- If you need finer control over the build process, run the import.pl and
buildcol.pl scripts manually. As well as giving you much more control over
what is done (e.g. with buildcol.pl's -mode option), the scripts will run
faster without the GLI's overhead. The GLI is just a layer over the Greenstone
scripts -- not a replacement for them. There will always be cases where it
is more efficient to do things outside the GLI (or where that is the only way).
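
The manual route in the last point can be sketched as follows. This is a minimal sketch, not an official recipe: it only assembles the two command lines and runs nothing (Python is used purely for illustration). The helper name greenstone_commands, the collection name "demo" and the "perl -S" invocation style are assumptions; -maxdocs and -mode are the options mentioned above, and the mode value shown should be checked against buildcol.pl's own usage message.

```python
import shlex


def greenstone_commands(collection, maxdocs=None, mode=None):
    """Build the import.pl and buildcol.pl command lines for one collection.

    Hypothetical helper: it only assembles argument lists, it runs nothing.
    """
    import_cmd = ["perl", "-S", "import.pl"]
    build_cmd = ["perl", "-S", "buildcol.pl"]
    if maxdocs is not None:
        # Limit both stages to a subset of the documents while testing.
        limit = ["-maxdocs", str(maxdocs)]
        import_cmd += limit
        build_cmd += limit
    if mode is not None:
        # Assumed value name; verify against buildcol.pl's usage text.
        build_cmd += ["-mode", mode]
    import_cmd.append(collection)
    build_cmd.append(collection)
    return import_cmd, build_cmd


if __name__ == "__main__":
    for cmd in greenstone_commands("demo", maxdocs=10, mode="build_index"):
        print(shlex.join(cmd))
```

Printing the commands first, rather than running them, makes it easy to check the options before a long import starts.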

>First of all, as Axel Schild described: adding or removing files makes the
>GLI import all the old files again. According to the user and developer
>manuals, this should not happen: only the new files should be processed, to
>save time and to let users keep accessing the old collection while the new
>one is being built.
>
>
We have started to make the GLI smarter in terms of only importing and
building when necessary (this is done for the GLI applet, where this is
critical to reduce bandwidth usage), but this has to be perfect otherwise it
is worse than useless.

The old collection is always available to users while the collection is
rebuilding, except for a very short period at the end when the old index
directory is deleted and the new building directory is renamed to index.
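
The swap described above can be sketched as follows. This is only an illustration of the delete-then-rename step under stated assumptions, not the code Greenstone itself runs: the function name swap_in_new_build is hypothetical, and it assumes the collection directory contains subdirectories named index and building as described above.

```python
import shutil
import tempfile
from pathlib import Path


def swap_in_new_build(collection_dir):
    """Replace index/ with the freshly built building/ directory."""
    collection_dir = Path(collection_dir)
    index_dir = collection_dir / "index"
    building_dir = collection_dir / "building"
    if not building_dir.is_dir():
        raise FileNotFoundError(f"no new build found at {building_dir}")
    # The collection is unavailable only between these two steps.
    if index_dir.exists():
        shutil.rmtree(index_dir)
    building_dir.rename(index_dir)
    return index_dir


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real collection.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "index").mkdir()
        (root / "building").mkdir()
        (root / "building" / "new.txt").write_text("new build")
        print(sorted(p.name for p in swap_in_new_build(root).iterdir()))
```

Because everything is built in a separate directory first, the old index stays usable until the final two operations.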

>Secondly, in my case: I don't add or remove any files from my collection. I
>would like to change different options in the Design and Enrich tabs of the
>GLI for different kinds of output. For example, I would like to hide the icon
>linking to the extracted text, and for that I would have to change the VList.

>
>
Changing metadata in the Enrich pane or changing items in the Design pane
requires re-importing and re-building. Changing format statements requires
neither.
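
As a quick reference, the rule above can be written down as a small lookup. This is just a sketch restating the paragraph: the key names and the function steps_required are made-up labels, not anything defined by Greenstone.

```python
# Hypothetical labels for the three kinds of change discussed above.
REQUIRED_STEPS = {
    "enrich_metadata": ("import", "build"),   # Enrich pane
    "design_item": ("import", "build"),       # Design pane
    "format_statement": (),                   # just preview the collection
}


def steps_required(change_kind):
    """Return the stages that must be re-run after a given kind of change."""
    return REQUIRED_STEPS[change_kind]


if __name__ == "__main__":
    for kind in REQUIRED_STEPS:
        print(kind, "->", ", ".join(steps_required(kind)) or "nothing to re-run")
```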

Regards,

Michael

>-----Original Message-----
>From: John R. McPherson [mailto:jrm21@cs.waikato.ac.nz]
>Sent: Tuesday, March 22, 2005 12:00 PM
>To: Tran
>Cc: greenstone-users@list.scms.waikato.ac.nz
>Subject: Re: [greenstone-users] RE: Problems rebuilding a collection
>(schild)
>
>On Tue, Mar 22, 2005 at 10:23:15AM +0700, Tran wrote:
>
>
>>Hi,
>>I have a similar problem as Axel Schild has described. I want to rebuild my
>>collection after I've made some minor changes related to how GS output would
>>look (not add or change any document files in the collection). It seems that
>>GS re-imports all my document files again and again and it takes a lot of
>>time. In Expert mode I've been trying 3 different modes (build index,
>>compress text and info) without any success.
>>
>
>If you just want to change the appearance (either by changing some of the
>format statements in the collection's config file, or by modifying some of
>greenstone's macro files), then you do not have to rebuild or reindex the
>collection - these changes take effect immediately.
>
>I don't know if you can change these from within the GLI without rebuilding
>though - I'm not very familiar with it.
>
>John
>
>--
>No virus found in this incoming message.
>Checked by AVG Anti-Virus.
>Version: 7.0.308 / Virus Database: 266.8.0 - Release Date: 3/21/2005