[greenstone-users] associate_tail_re

From Katherine Don
Date Mon Apr 12 14:11:55 2010
Subject [greenstone-users] associate_tail_re
In-Reply-To (4BC25ABA-6040603-cs-waikato-ac-nz)
Hi Michael

I have gotten it to work with a bit of fiddling. The trick is you have
to create a process expression that prevents all the secondary files
from being processed.
I had three files:
File.pdf FileTbones.pdf FileTrpts.pdf
And the options I used were:
plugin PDFPlugin -associate_tail_re (Trpts|Tbones).pdf -process_exp ^((?!Trpts|Tbones).)*$

I think if you add in Sax and Rhythm to these, i.e.
plugin PDFPlugin -associate_tail_re (Trpts|Tbones|Sax|Rhythm).pdf -process_exp ^((?!Trpts|Tbones|Sax|Rhythm).)*$
then it should work for you.
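
To see why the negative lookahead in process_exp keeps the part files
from being processed, here is a minimal Python sketch that applies the
two expressions to Michael's file names. It is only an approximation:
Greenstone evaluates these as Perl regular expressions, and the sketch
assumes the tail expression is matched against the end of the file
name. The classify function is a hypothetical helper, not part of
Greenstone.

```python
import re

# Expressions copied verbatim from the plugin options in this message.
ASSOCIATE_TAIL = r"(Trpts|Tbones|Sax|Rhythm).pdf"
PROCESS_EXP = r"^((?!Trpts|Tbones|Sax|Rhythm).)*$"

# Assumption: the tail expression is anchored at the end of the file name.
tail_re = re.compile(ASSOCIATE_TAIL + "$")
process_re = re.compile(PROCESS_EXP)


def classify(filenames):
    """Hypothetical helper: split names into processed and associated files."""
    processed, associated = [], []
    for name in filenames:
        if process_re.match(name):
            # No instrument suffix anywhere in the name: the main file.
            processed.append(name)
        elif tail_re.search(name):
            # Name ends in one of the instrument tails: a secondary file.
            associated.append(name)
    return processed, associated


files = [
    "SongTitle.pdf",
    "SongTitleTrpts.pdf",
    "SongTitleTbones.pdf",
    "SongTitleSax.pdf",
    "SongTitleRhythm.pdf",
]
processed, associated = classify(files)
print("processed: ", processed)
print("associated:", associated)
```

One thing the sketch makes visible: the lookahead rejects any file name
that contains one of the instrument strings anywhere, not only at the
end, so a score whose title happens to contain "Sax" or "Rhythm" would
not be processed either.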

Let me know how you get on.
Regards,
Katherine

Katherine Don wrote:
> Hi
>
> The trac ticket below talked about adding multiple files of the same
> file type, but not the same file type as the main file, e.g. associate 3
> PDFs with a Word doc, but not 3 Word docs with another Word doc.
>
> I don't think it will be too difficult to implement without manually
> adding paths, but it may need a custom plugin rather than using
> associate_tail_re. I'll have a look and see.
> Katherine
>
>
> Michael Silver wrote:
>> Hi Katherine,
>>
>> Thank you for your reply - I really appreciate the effort you put
>> into supporting all of us!
>>
>> I was wondering how it would deal with that. I was going from a
>> couple of posts on the mailing list and in trac (e.g.,
>> http://trac.greenstone.org/ticket/113). Maybe I'm reading it wrong,
>> but I interpreted it as meaning associate_tail_re was supposed to work
>> this way.
>>
>> I guess the other way to do it would be to add a metadata element
>> that works like ex.equivlink but which can be edited. The big problem
>> with that is setting it up to find the associated items. In some
>> ways, that would be a better solution - a subcollection could be
>> created for each instrument type, but pulling the assocfilepath for
>> each file would be a manual entry. At least, that's as far as I've
>> gotten on it.
>>
>> For right now, I'll get it up and running without this feature. This
>> is for a class project, but I'm hoping to continue it as a real
>> project to document performances, original music, photos, etc. for a
>> community jazz band in which I play.
>>
>> Thanks again for your help!
>>
>> Michael
>>
>> On 11/04/2010 4:12 PM, Katherine Don wrote:
>>> Hi Michael
>>>
>>> I am not sure if we have used this to associate files with the same
>>> extension. It was designed for, e.g., processing a doc file and
>>> associating a pdf/image etc.
>>>
>>> I'll have to try it out and see if it's easy to get it to work with
>>> files with the same extension. One tricky point will be telling
>>> Greenstone which file to process. In your case, it will probably
>>> process both doc files as individual files. You will probably need to
>>> use process_exp too.
>>>
>>> One other thing to note, associate_ext was designed to associate
>>> different file types of the same content. The main file gets
>>> processed (text and metadata extracted) while the other files are
>>> not processed in any way other than to link them to the main one.
>>> If you have text in two of the files that you want extracted (and
>>> then indexed) then associate_ext is not the way to go.
>>> I guess with musical scores you are not looking to extract text?
>>>
>>> Regards,
>>> Katherine
>>>
>>> Michael Silver wrote:
>>>> Hello,
>>>>
>>>> I am trying to get associate_tail_re to work with a collection I'm
>>>> building using Greenstone 2.83. I'm hoping to be able to associate
>>>> sets of files into something like the compound item used in
>>>> CONTENTdm. For ease of testing, I'm using four files,
>>>>
>>>> Title.doc
>>>> Title-part2.doc
>>>> Title-part2.txt
>>>>
>>>> If I use associate_ext, I can successfully associate Title.doc with
>>>> Title.pdf and Title-part2.txt. What I want is to associate
>>>> Title.doc with Title-part2.doc.
>>>>
>>>> If I enter -part2.doc into associate_tail_re, no association is
>>>> created. If I enter -part2..* the PDF and TXT files are
>>>> associated. I've tried various combinations of regular expressions
>>>> and filenames, but I have not been able to get the result I'm
>>>> looking for.
>>>>
>>>> According to what I've read in tickets in trac.greenstone.org and
>>>> in the mailing list archives, associate_tail_re should be able to
>>>> do this, but I've been unable to locate any details or examples.
>>>> Can anyone point me in the right direction, or provide some examples?
>>>>
>>>> Any suggestions or pointers are welcome. I've tried RTFM, but I'm
>>>> obviously looking in the wrong manuals! :-) Thank you!
>>>>
>>>> Michael
>>>>
>>>> P.S. If you're interested, the actual case (instead of testing) is
>>>> to combine musical scores with instrumental parts, e.g.,
>>>>
>>>> SongTitle.pdf (score)
>>>> SongTitleTrpts.pdf (trumpet parts)
>>>> SongTitleTbones.pdf (trombones)
>>>> SongTitleSax.pdf (saxes)
>>>> SongTitleRhythm.pdf (rhythm section)
>>>>
>>>> If I could get past the above problem, I should be able to create a
>>>> regex to match (Trpts|Tbones|Sax|Rhythm).pdf. I hope.
>>>>
>>>> _______________________________________________
>>>> greenstone-users mailing list
>>>> greenstone-users@list.scms.waikato.ac.nz
>>>> https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
>>>>
>>
>>
>