Re: [greenstone-devel] plugin perl problems

From Michael Dewsnip
Date Thu, 20 Jan 2005 17:15:20 +1300
Subject Re: [greenstone-devel] plugin perl problems
In-Reply-To (OFBC9E5621-57F6685D-ON69256F87-007F0F41-69256F87-00804D27-nt-gov-au)
Hi Stephen,

Katherine described how the "go to page" functionality works in her
reply to Jenn Cole's posting yesterday (let me know if you missed it and
I'll forward it to you).

In terms of multi-document plugins, you'll notice that these usually
inherit from SplitPlug. SplitPlug deals with creating Greenstone
documents for each of the segments in the file, and each segment is
given the same base OID with a suffix designating the segment number.
Each of these documents also has extra "SourceSegment" metadata
indicating the segment number. If you're interested in the gory details,
have a look at SplitPlug's read function.
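
As a rough illustration of the scheme described above, here is a minimal
standalone sketch. It does not use Greenstone's own classes: the function
name split_into_documents, the hash layout, and the "_N" suffix format are
assumptions made for the example, so check SplitPlug's read function for
the real details.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SplitPlug-style read function: split one
# file's text into segments and build one record per segment.
sub split_into_documents {
    my ($base_oid, $text, $split_exp) = @_;
    my @docs;
    my $segment = 0;
    foreach my $chunk (split /$split_exp/, $text) {
        next unless $chunk =~ /\S/;    # skip segments with no content
        $segment++;
        push @docs, {
            # Same base OID for every segment, plus a segment suffix.
            # The "_N" form is an assumption for this sketch.
            OID      => "${base_oid}_$segment",
            metadata => { SourceSegment => $segment },
            text     => $chunk,
        };
    }
    return @docs;
}

# Example: three records separated by lines containing only "----".
my $text = "rec one\n----\nrec two\n----\nrec three\n";
my @docs = split_into_documents("HASH0123abcd", $text, qr/^----$/m);
printf "%s (SourceSegment=%d)\n", $_->{OID}, $_->{metadata}{SourceSegment}
    for @docs;
```

Every record produced from the one file shares the base identifier, so the
segments can be traced back to their source file, while the suffix and the
SourceSegment value tell them apart.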

All the best,

Michael

Stephen.DeGabrielle@nt.gov.au wrote:

>
> Hi,
>
> Thinking about this last night, I have decided my problem is that I
> don't understand how the multi-part/page documents get generated. The
> same difficulty goes for multi-document plugins like MARCPlug or
> OAIPlug.
>
> Any help understanding what is going on with these sorts of plugins is
> appreciated.
>
> Thanks
>
> s.
>
> --
>
> Have a look at the following snippet:
> (starts somewhere near line 100 of PDFPlug.pm)
> ####
> # following title_sub removes "Page 1" added by pdftohtml, and a leading
> # "1", which is often the page number at the top of the page. Bad Luck
> # if your document title actually starts with "1 " - is there a better way?
>
> #my $self = new ConvertToPlug ($class, @args, "-title_sub", '^(Page\s+\d+)?(\s*1\s+)?');
> my $self = new ConvertToPlug ($class, @args);
> $self->{'plugin_type'} = "PDFPlug";
> if ($use_sections) {
>     $self = new ConvertToPlug ($class, @args, "-title_sub", '^(Page\s+\d+)?(\s*1\s+)?');
>     $self->{'use_sections'}=1;
> }
> ####
>
> It is part of 'sub new' in PDFPlug.pm, and is my attempt to fix
> PDFPlug so it doesn't override a title_sub specified in the
> arguments of PDFPlug.pm with the '^(Page\s+\d+)?(\s*1\s+)?'
> argument required to remove the "Page 1" added by pdftohtml (as
> noted in the comments).
>
> I thought I got it right - but attempts to rebuild with a small
> collection seem to have killed my ability to generate sections at all
> (my test document included the greenstone developers guide).
>
> Any help/suggestions appreciated
>
> regards,
>
> s.
>
> ----------
> Stephen De Gabrielle
> 8922 0887
>
> http://www.birdguides.com/html/vidlib/species/Carduelis_cannabina.htm