From: Sandton Consulting Ltd
Date: Tue Apr 19 01:33:49 2011
Subject: [greenstone-users] Re: Problem with Greenstone Preview
Hello Anupama,

Thanks for the detailed solution. I'll go through the various steps and get back to you on my progress.
--- On Sat, 4/16/11, Anupama of Greenstone Team <firstname.lastname@example.org> wrote:
> Hello Anupama, Please look at some of the pdf files
1. Two of the three PDFs you sent are processed fine here for me on both Windows and Linux, both with the default setting for PDFPlugin (which uses PDF_to_html) and with the PDFBox plugin extension turned on.
The 3rd PDF "Computer Science 7.pdf" fails to get converted, and Greenstone displays the underlying problem that PDFBox has encountered:
"Exception in thread "main" java.io.IOException: You do not have permission to extract text"
If you can ask the copyright-holder to remove the permission restrictions on the document and re-save it as a PDF, you may be able to convert this last document as well.
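One way to check beforehand whether a PDF carries this kind of restriction is to look at the "Encrypted" line printed by the pdfinfo tool. This is only a rough sketch: pdfinfo (from poppler-utils or xpdf) is not part of Greenstone, so it is an assumption that you have it installed, and the function name below is made up for illustration. It treats "copy:no" on that line as a sign that text extraction is blocked.

```shell
# Hypothetical helper: reads pdfinfo output on stdin and succeeds (exit 0)
# when the "Encrypted" line reports copy:no, i.e. the document most likely
# forbids text extraction.
text_extraction_blocked() {
    grep '^Encrypted:' | grep -q 'copy:no'
}

# Usage (assumes pdfinfo is installed):
#   pdfinfo "Computer Science 7.pdf" | text_extraction_blocked && echo "restricted"
```

If the check reports a restriction, the PDF will probably hit the same PDFBox exception until the restriction is removed.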
You can then also set the verbosity for the output higher in GLI's Create panel > Import Options (to the left) > verbosity field. Set the value to 5. Repeat for the Build Options > verbosity field.
There's not enough information provided about how things are going wrong, but there are two things I can think of that you could try (please also answer my questions below).
>>>>>> TRY THIS BIT
- If your Linux system is Ubuntu and you are using Greenstone 2.83 or earlier, then it may be a problem with Perl. In that case, see suggestion 2 below.
1. Otherwise, try the following first. We're going to try building from the command-line, instead of from GLI (the Greenstone Librarian application), just to check whether it has something to do with the environment.
a) Open a Linux terminal (x-term) and go into your Greenstone installation folder:
b) Next, set up the Greenstone environment by typing the following in your x-term:
c) Run the import script -- which is the first step of the build process -- and provide the name of the collection you wish to build as an argument to it:
Are there any errors at this stage (check for errors in the text that moves past in the terminal during the execution of the import.pl command)?
d) Next run the second step of the build process, once again providing the collection name as an argument:
Once again, does the output show any errors?
e) If all went well, rename the folder "building" inside your collection folder to "index":
! If the above worked for some reason, then the environment GLI runs in when it is launched is different from the environment that the command-line scripts manage to work in.
f) If steps d and e above showed up no errors to do with your PDF files, then go back to your Greenstone installation folder and run the Greenstone server from there to visit your collection page:
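For reference, steps a) to e) could be sketched roughly as the shell functions below. This is a sketch under assumptions, not the exact commands from the original message: the function names are invented, the installation folder and collection name are placeholders you pass in, and it assumes a standard Greenstone 2 layout with setup.bash in the installation folder and import.pl and buildcol.pl on the PATH once that file has been sourced. Moving an existing "index" folder aside before the rename is an extra safety step added here.

```shell
# Steps a) to d), run in a subshell so that the cd and the Greenstone
# environment do not leak into the calling terminal. Assumes bash.
# $1 = Greenstone installation folder (placeholder), $2 = collection name
build_collection() (
    cd "$1" &&
    . ./setup.bash &&                            # b) set up the environment
    import.pl -removeold -verbosity 5 "$2" &&    # c) first step: import
    buildcol.pl -removeold -verbosity 5 "$2"     # d) second step: build
)

# Step e): rename "building" to "index" inside the collection folder.
# An existing "index" is moved to "index.old" first (addition of this sketch).
# $1 = collection folder, e.g. <greenstone>/collect/<collection>
promote_building_to_index() {
    [ -d "$1/building" ] || { echo "no building folder in $1" >&2; return 1; }
    if [ -d "$1/index" ]; then
        rm -rf "$1/index.old"
        mv "$1/index" "$1/index.old" || return 1
    fi
    mv "$1/building" "$1/index"
}

# Usage (paths are placeholders):
#   build_collection "$HOME/greenstone" mycollection &&
#       promote_building_to_index "$HOME/greenstone/collect/mycollection"
```

Watch the output of the two build commands for errors before doing the rename, as described in steps c) and d).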
a) Download the Greenstone 2.84 installer for *Linux* by clicking the link at the top of http://www.greenstone.org/download
b) Then run the installer to install Greenstone 2.84. Make sure to install it somewhere other than your previous Greenstone installation.
c) Next, point your browser to http://trac.greenstone.org/browser/gs2-extensions/pdf-box/trunk/pdf-box-java.tar.gz
d) Use a terminal (x-term) to cd into your Greenstone installation folder and then go into its ext folder where you have saved the tar.gz file downloaded above. Then extract this archive file in this location:
e) Copy your collection folder across from the old Greenstone installation into the new one:
(Note that the folder "collect" is the name of the directory containing your collection which is to be copied. The "collect" folder exists in all normal Greenstone installations, so you need to type it as shown. Just replace the strings inside the <> marks.)
f) In a *fresh* terminal (this is important, so make sure to open a brand new x-term), go back to your Greenstone installation folder and run GLI from here:
The reason you need a fresh x-term is that when GLI is run this time, it will know to set up the Greenstone environment all over again. And this time, it will detect the new PDF Box extension that you downloaded and unpacked in steps c and d.
g) Go to File > Open. Click the "Change Dir..." button at the bottom and, in the dialog that appears, make sure it is pointing to the collect folder inside your new Greenstone 2.84 installation. (Else use the Change Dir dialog to go to the Greenstone 2.84 installation's collect directory.) Now open your collection. This should open the collection you copied into your Greenstone 2.84.
h) Go into the Design panel. Make sure that on the left hand side, "Document Plugins" is selected. Then, to the right, double click on the PDFPlugin in the list of Document plugins. In the Plugin Configuration dialog that appears, scroll down to the section titled "Autoload Converters" and tick the checkbox next to "pdfbox_conversion".
i) Now go to GLI's Create Panel and click the Build button. Hopefully there will be no errors this time and your PDFs will get processed. Then click the Preview button to preview your collection.
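The file operations in steps d) and e) could look roughly like the sketch below. The function names are invented for illustration, and the installation folders and collection name are placeholders; the only names taken from the steps above are the "ext" and "collect" folders and the pdf-box-java.tar.gz archive. The copy refuses to overwrite a collection that already exists in the new installation, which is a precaution added here.

```shell
# d) Unpack the PDFBox extension inside the new installation's ext folder.
# $1 = new Greenstone installation folder; the archive must already be in $1/ext
unpack_pdfbox_ext() (
    cd "$1/ext" && tar -xzf pdf-box-java.tar.gz
)

# e) Copy one collection from the old installation into the new one.
# $1 = old installation, $2 = new installation, $3 = collection name
copy_collection() {
    if [ ! -d "$1/collect/$3" ]; then
        echo "collection $3 not found in $1/collect" >&2
        return 1
    fi
    if [ -e "$2/collect/$3" ]; then
        echo "$2/collect/$3 already exists, not overwriting" >&2
        return 1
    fi
    mkdir -p "$2/collect" && cp -r "$1/collect/$3" "$2/collect/"
}

# Usage (paths are placeholders):
#   unpack_pdfbox_ext "$HOME/greenstone2.84"
#   copy_collection "$HOME/greenstone-old" "$HOME/greenstone2.84" mycollection
```

GLI itself would then be started from a fresh terminal as described in step f); in a standard Greenstone 2 layout the launcher is usually gli/gli.sh under the installation folder, but check your own installation.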
Best of luck,
> Sandton Consulting Ltd wrote:
> Hello ,