[greenstone-users] Error 500 in gliserver.pl in relation to one large collection

From Greenstone Team
Date Thu Jul 28 14:46:49 2011
Subject [greenstone-users] Error 500 in gliserver.pl in relation to one large collection
In-Reply-To (CAP7FF=fq0Q0K0LYLck-Qpq+gFA89TubKS9zhNScq+45UuSsaNQ-mail-gmail-com)
Hi Sean,

Sam here suspects that it could be due to too many file handles being
held open. It may be a known issue with GLI that it can't handle
collections that are too large.
Can you try to access the server machine and rebuild the collection from
the command line on there (see below for instructions)? Does that work?
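
While you are on the server, you could also take a quick look at the
file handle suspicion. This is only a sketch for a Linux server running
bash: the function names fd_limit and count_open_fds are made up for
illustration, and whether the limit is really what triggers the 500
error is still just a guess at this point.

```shell
# Print the current shell's limit on open file descriptors (shell builtin).
fd_limit() {
    ulimit -n
}

# Count the file descriptors a running process currently has open.
# Linux only: relies on the /proc filesystem. Pass the process ID as $1.
# You may need to be root (or the process owner) to read this folder.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

echo "open file limit for this shell: $(fd_limit)"
```

If the count for the web server process comes close to the limit while
GLI is trying to open the collection, that would support the theory.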

Instructions on rebuilding the collection from the command line. Read
the following through to the end first, before trying it out.

1. Try to ssh into the remote machine where the GS server lives, or
otherwise try to gain direct access to the machine.

2. Stop your Greenstone 2 web server.

3. Open up a terminal (like an x-term on Linux, DOS prompt on Windows)
and cd into your Greenstone installation folder. Note that the ">" angle
bracket represents a new line of your waiting command prompt (don't type
it in). For example, on Windows:
> cd "C:\Program Files\Greenstone2"

4. Next, run the setup script to set up Greenstone's environment.
On Windows:
> setup.bat

On Linux:
> source setup.bash

5. First, decide on whether you want to try incremental building in an
attempt to save some time, or whether you think your collection may have
become corrupted and you require a proper rebuild. Your collection is
very large, so time-saving measures are worth considering:

(i) If you want to try incremental building, add the flag "-incremental"
(without quotes) just before or after the "-keepold" flag that already
follows each ".pl" in ALL the commands in step 6 below. Make sure to
leave at least one space between -incremental and -keepold.
(ii) If you suspect your index folder is corrupted and incremental
building can't fix the fundamental flaws, but you expect that your
archives folder has survived intact, just leave the "-keepold" flag in
(don't add "-incremental"). There is no need to change any of the
commands in step 6.
(iii) If you think even your collection's archives folder and not just
its index folder may have become corrupted, *replace* the "-keepold"
flag in ALL the commands in step 6 below with "-removeold". (But again,
don't add any "-incremental").
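
The three choices differ only in the flags that go into each command in
step 6. As a rough sketch in bash (the function name build_flags and the
labels incremental/keepold/removeold are made up here for illustration;
the flags themselves are the ones named above):

```shell
# Hypothetical helper: map the choice from step 5 to the flags for step 6.
build_flags() {
    case "$1" in
        incremental) echo "-incremental -keepold" ;;  # option (i)
        keepold)     echo "-keepold" ;;               # option (ii)
        removeold)   echo "-removeold" ;;             # option (iii)
        *)           echo "unknown choice: $1" >&2; return 1 ;;
    esac
}

# Show (but do not run) the import command for option (iii).
# "mycollection" is a placeholder for your collection's name.
echo import.pl $(build_flags removeold) mycollection
```

Whichever choice you make, use the same flags for both import.pl and
buildcol.pl.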

I would go with (iii), but only AFTER moving your collection's current
"index" and "archives" folders out of the way, to keep some sort of
backup of them. (Your collection's "index" and "archives" folders are
located in your GS2 installation's collect/<collection name> directory.)
Moving them elsewhere will also tell you whether your OS is holding a
lock on any index files, since Windows often does this and that can
break the building process.
Option (iii) may take the longest but at least you'd have tried it all
in one go.
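
The backup move could be sketched like this for a Linux server with
bash. The function name backup_build_dirs and the timestamped folder
name are my own inventions, and both paths are ones you have to supply
yourself. On Windows you would do the same move in Explorer or with the
"move" command.

```shell
# Hypothetical helper: move a collection's "index" and "archives" folders
# into a timestamped folder under a backup location of your choosing.
# Usage: backup_build_dirs <path to collect/<collection name>> <backup root>
backup_build_dirs() {
    local coll_dir="$1"
    local backup_dir="$2/backup-$(date +%Y%m%d-%H%M%S)"
    local d
    mkdir -p "$backup_dir" || return 1
    for d in index archives; do
        if [ -d "$coll_dir/$d" ]; then
            # A failure here may mean something still has files open in there.
            mv "$coll_dir/$d" "$backup_dir/" || {
                echo "could not move $coll_dir/$d" >&2
                return 1
            }
        fi
    done
    # Print where the folders ended up.
    echo "$backup_dir"
}
```

Make sure the backup location has enough free space if it sits on a
different disk, since the move then becomes a full copy.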

6. Now you are ready to start the 3-step manual collection building
process, beginning with the import stage:


On Windows:
> perl -S import.pl -keepold <type your collection's name here>

On Linux:
> import.pl -keepold <type your collection's name here>
(If that didn't work, plug the word "perl" in front of the Linux command).

NOTE: If your collect folder is located elsewhere, add in the
-collectdir flag to the command and provide the full path to your
non-standard "collect" directory as follows:
> perl -S import.pl -collectdir "full/path/to/your/external/collect"
-keepold <type your collection's name here>
Or on Linux:
> import.pl -collectdir "full/path/to/your/external/collect" -keepold
<type your collection's name here>

It's likely the above will spend a long time trying to import your 14 GB
worth of documents. Once that's finally done, the prompt will return to
you. At that point you need to perform the next stage:


On Windows:
> perl -S buildcol.pl -keepold <type your collection's name here>

On Linux:
> buildcol.pl -keepold <type your collection's name here>
(If that didn't work, plug the word "perl" in front of the Linux command)

Once again, if your collect directory is different from the standard GS2
"collect" folder, additionally specify the -collectdir <"full path to
your collect folder"> option to the buildcol command.

It may take a very long time again to build your collection. But if it
succeeds, you can move onto the 3rd stage of the rebuilding process:


Rebuilding manually from the command-line generates a folder called
"building" inside your collect/<collection-name> folder. If you see a
folder called "index" in there, then move it well out of the way (or
delete it, if you feel confident). Then rename "building" to "index".
While GLI does this step automatically for you, manual rebuilding does not.
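
As a sketch of this last stage on Linux with bash (the function name
promote_building and the "index.old-<timestamp>" naming are assumptions
of mine, not anything Greenstone provides):

```shell
# Hypothetical helper: make the freshly built "building" folder the live
# "index", keeping any existing "index" under a timestamped name.
# Usage: promote_building <path to collect/<collection-name>> [where to put old index]
promote_building() {
    local coll_dir="$1"
    local old_dest="${2:-$coll_dir}"
    if [ ! -d "$coll_dir/building" ]; then
        echo "no building folder in $coll_dir" >&2
        return 1
    fi
    if [ -e "$coll_dir/index" ]; then
        mkdir -p "$old_dest" || return 1
        # Keep the old index rather than deleting it outright.
        mv "$coll_dir/index" "$old_dest/index.old-$(date +%Y%m%d-%H%M%S)" || return 1
    fi
    mv "$coll_dir/building" "$coll_dir/index"
}
```

Pass a second argument if you want the old index moved right out of the
collection folder. Without it, the old index stays next to the new one
under its new name.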

7. If you saw no errors during any stage of the rebuilding process of
step 6, it's a fair indication that things were okay. But to make fully
sure, restart your GS2 web server and visit its home page and then go to
your rebuilt collection and see if it still works.

Write back if you encounter any error messages during step 6 or anything
that goes visibly wrong in step 7 (or any of the steps).

All the best,

Sean Mitchuson wrote:
> We have been working on a collection that is around 14 GB worth of data
> and is mostly pdf files. Recently after an upload session we can no
> longer access the collection. Every time we try to open it through
> the GLI it sits and waits for minutes (up to 20 at last check) and
> then gives us a 500 error for gliserver.pl.
> Is this collection ruined? Or is there a way to save it?
> Thanks,
> --
> Sean Mitchuson
> Library Tech Coordinator
> Murray State University
> Murray, Ky
> Phone: 270.809.4773