[greenstone-devel] plugin perl problems

From Stephen.DeGabrielle@nt.gov.au
Date Thu, 13 Jan 2005 08:52:51 +0930
Subject [greenstone-devel] plugin perl problems


Thinking about this last night, I decided that my problem is that I don't understand how
the multi-part/page documents get generated. The same difficulty applies to multi-document plugins like MARCPlug or OAIPlug.

Any help understanding what is going on with these sorts of plugins would be appreciated.




Have a look at the following snippet:
(starts somewhere near line 100 of PDFPlug.pm)
    # following title_sub removes "Page 1" added by pdftohtml, and a leading
    # "1", which is often the page number at the top of the page. Bad Luck
    # if your document title actually starts with "1 " - is there a better way?

    #my $self = new ConvertToPlug ($class, @args, "-title_sub", '^(Page\s+\d+)?(\s*1\s+)?');
    my $self = new ConvertToPlug ($class, @args);
    $self->{'plugin_type'} = "PDFPlug";
    if ($use_sections) {
            $self = new ConvertToPlug ($class, @args, "-title_sub", '^(Page\s+\d+)?(\s*1\s+)?');

It is part of 'sub new' in PDFPlug.pm, and is my attempt to fix PDFPlug so that it doesn't override a
title_sub specified in the arguments to PDFPlug with the '^(Page\s+\d+)?(\s*1\s+)?' pattern
required to remove the "Page 1" added by pdftohtml (as noted in the comments).
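
The intent described above (only fall back to the default pattern when the caller has not supplied a title_sub) could be sketched roughly as follows. This is only an illustration: with_default_title_sub is a hypothetical helper, not part of Greenstone, and the sample arguments and title are made up for the demonstration.

```perl
use strict;
use warnings;

# Default pattern from the PDFPlug.pm comment: strips the "Page 1" added by
# pdftohtml and a leading "1" (often a page number) from the title.
my $DEFAULT_TITLE_SUB = '^(Page\s+\d+)?(\s*1\s+)?';

# Hypothetical helper (an assumption, not a Greenstone function): return the
# argument list unchanged if it already contains -title_sub, otherwise
# append the default pattern.
sub with_default_title_sub {
    my @args = @_;
    return @args if grep { defined $_ && $_ eq '-title_sub' } @args;
    return (@args, '-title_sub', $DEFAULT_TITLE_SUB);
}

# Inside 'sub new' the constructor might then be called once, e.g.
# (assumption, untested against Greenstone):
#   my $self = new ConvertToPlug ($class, with_default_title_sub(@args));

# Illustrative argument lists only.
my @defaulted = with_default_title_sub('-use_sections');
my @kept      = with_default_title_sub('-title_sub', '^Draft\s+');

# Applying the default pattern to a made-up title.
(my $title = 'Page 1 1 Developer Guide') =~ s/$DEFAULT_TITLE_SUB//;
print "$title\n";
```

Building the argument list first means the constructor is only called once, so fields set on $self afterwards (such as 'plugin_type') are not discarded by a second construction.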

I thought I got it right, but attempts to rebuild with a small collection seem to have killed my ability to generate sections at all (my test document included the Greenstone Developer's Guide).

Any help or suggestions appreciated.



Stephen De Gabrielle
8922 0887