Re: [greenstone-devel] Version 2.72 and CDS/ISIS - order of subfields

From Michael Dewsnip
Date Wed, 20 Dec 2006 10:39:10 +1300
Subject Re: [greenstone-devel] Version 2.72 and CDS/ISIS - order of subfields
In-Reply-To (458249E2-8050909-inwind-it)
Hi Ruben,

Don't you get the effect you're after just by setting the
"-subfield_separator" option to " - "? The default for this option is
", ", which is why you get "Allègre, Claude, interviewer".

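To make the effect of the separator concrete, here is a minimal Python
sketch. It is not Greenstone's actual (Perl) plugin code, only an
illustration: the function names are hypothetical, and the "%" occurrence
separator and "^x" subfield prefixes are assumed from the
dc.ISISRawRecord dump quoted below.

```python
def parse_isis_field(raw, occurrence_sep="%"):
    """Split a raw field into occurrences; each occurrence is a list of
    (subfield_code, value) pairs kept in their original order."""
    occurrences = []
    for occurrence in raw.split(occurrence_sep):
        parts = occurrence.split("^")
        subfields = []
        # Text before the first '^' is the unprefixed main part of the field.
        if parts[0]:
            subfields.append(("*", parts[0]))
        for part in parts[1:]:
            if part:
                # First character is the subfield code, the rest is the value.
                subfields.append((part[0], part[1:]))
        occurrences.append(subfields)
    return occurrences


def format_occurrences(raw, subfield_separator=", "):
    """Join the subfield values of each occurrence with the given separator."""
    return [
        subfield_separator.join(value for _, value in occurrence)
        for occurrence in parse_isis_field(raw)
    ]


raw = "^aMorin, Edgar%^aAllègre, Claude^qinterviewer"
print(format_occurrences(raw))         # joined with the default ", "
print(format_occurrences(raw, " - "))  # joined with " - "
```

With " - " as the separator, the second occurrence comes out as
"Allègre, Claude - interviewer", and each ^q value stays attached to the
occurrence it came from.
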
All the best,

Michael

ruben pandolfi wrote:

> Hello John,
>
> Thank you very much for your work and for your message,
>
> I have a little question about CDS/ISIS: do you think it is possible to
> maintain subfield order and relation?
>
> Example:
>
> extract from doc.xml
>
> ......................................................................
> <Description>
> <Metadata
> name="gsdlsourcefilename">import/BABEL3.00000301/00000400.nul</Metadata>
> <Metadata name="gsdldoctype">indexed_doc</Metadata>
> <Metadata name="Plugin">NULPlug</Metadata>
> <Metadata name="Source">00000400.nul</Metadata>
> <Metadata name="FileSize">0</Metadata>
> <Metadata name="null_file">00000400.nul</Metadata>
> <Metadata name="dc.DcContributor^a">Morin, Edgar</Metadata>
> <Metadata name="dc.DcContributor^a">Allègre, Claude</Metadata>
> <Metadata name="dc.DcContributor^*">Morin, Edgar</Metadata>
> <Metadata name="dc.DcContributor^*">Allègre, Claude</Metadata>
> <Metadata name="dc.DcDate">1994</Metadata>
> <Metadata name="dc.DcType">entretien</Metadata>
> <Metadata name="dc.ModsRecordInfoRecordIdentifi">400</Metadata>
> <Metadata name="dc.DcContributor^q">interviewer</Metadata>
> <Metadata name="dc.DcTitle">Edgar Allegr□ment</Metadata>
> <Metadata name="dc.ModsOriginInfoPlace">AR</Metadata>
> <Metadata name="dc.DcContributor">Morin, Edgar</Metadata>
> <Metadata name="dc.DcContributor">Allègre, Claude, interviewer</Metadata>
> <Metadata name="dc.DcLanguage">fre</Metadata>
> <Metadata name="dc.DcIdentifier">babel-id-400</Metadata>
> <Metadata name="dc.ISISRawRecord">
> tag=10 data=Edgar Allegr□ment
> tag=20 data=entretien
> tag=30 data=^aMorin, Edgar%^aAllègre, Claude^qinterviewer
> tag=40 data=fre
> tag=65 data=AR
> tag=85 data=1994
> tag=90 data=babel-id-400
> tag=190 data=400
> </Metadata>
> <Metadata name="Title">00000400</Metadata>
> <Metadata name="Identifier">HASH01f630c90dd3b1b10d811117</Metadata>
> <Metadata name="assocfilepath">HASH01f6.dir</Metadata>
> </Description>
>
> ......................................................................
>
> In this case I would like to display the isis record:
>
> tag=30 data=^aMorin, Edgar%^aAllègre, Claude^qinterviewer
>
>
> as
>
> Morin, Edgar
> Allègre, Claude - interviewer
>
>
> the closest I can get at the moment is with:
> ......................................................................
>
> {If}{[dc.DcContributor], <tr class="metadata"><td
> valign=top><b>_Co_:</b></td><td valign=top>[sibling(all\' <br />
> '):dc.DcContributor]</td></tr>}
> ......................................................................
>
> this displays:
>
> ......................................................................
> Author: Morin, Edgar
> Allègre, Claude, interviewer
> ......................................................................
>
> but I do not have a way to format dc.DcContributor^q
>
> On the other hand, if I take every single subfield I always get
> ......................................................................
> Author: Morin, Edgar, interviewer
> Allègre, Claude
> ......................................................................
>
> because GSDL does not keep the order or association of subfields.
>
> Do you think there is a way to achieve that when importing/exploding
> a CDS/ISIS db?
>
> Thank you
>
> Ruben
>
>
>
> John Rose wrote:
>
>> Dear Greenstone users/developers,
>>
>> I have been working with the Greenstone team to ensure liaison with
>> CDS/ISIS users, and am taking this opportunity to list (in more
>> detail than in the release announcements) the improvements for
>> CDS/ISIS database conversion in Greenstone version 2.72 relative to
>> version 2.70 (most of these functions were available in 2.71 but some
>> had bugs or have been further improved, so CDS/ISIS users wishing to
>> benefit from them are advised to upgrade to 2.72):
>>
>> 1. The ^* metadata element is available to access the first subfield
>> of a field with subfields (even if it is the main field without a
>> delimiting prefix).
>>
>> 2. Backslashes in a CDS/ISIS field (e.g. Windows file paths) will
>> display correctly with Greenstone formatting language.
>>
>> 3. Support for DOS 852 coding (needed for DOS-based CDS/ISIS
>> databases in Eastern European languages).
>>
>> 4. Logically deleted records will not be imported (with prior
>> versions, you had to export to an ISO file and re-import into
>> CDS/ISIS before converting to Greenstone).
>>
>> 5. A "-records_per_folder" option has been added to the explode
>> function. This puts the records from exploding a metadata database
>> into multiple subdirectories, which means that the GLI should use
>> less memory and edit the metadata more quickly. This option has not
>> yet been tested for its usefulness in real conversion situations; it
>> may be tried for large databases in which the time for explosion
>> seems inordinately long. The default value is 100, so you can try a
>> lower value, say 10.
>>
>> 6. A bug under Linux, by which the CDS/ISIS files with filenames in
>> capital letters were not handled correctly, has been fixed
>> (previously the filenames had to be changed manually to small letters
>> before dragging them into GLI).
>>
>> 7. '&' characters and spaces in filenames now work in the
>> "document_field" parameter of the explode function (previously, the
>> corresponding documents were not imported).
>>
>> 8. When the "document_field" CDS/ISIS field is repeatable, each
>> occurrence will yield a separate Greenstone document, each with the
>> same metadata (previously only the first occurrence was imported).
>>
>> 9. Building a CDS/ISIS collection (either "as is", i.e. metadata
>> only, or by exploding) should be significantly faster in Greenstone
>> v2.71, as it no longer tries to determine the encoding of the
>> CDS/ISIS file.
>>
>> All reported problems with the "as is" conversion of large CDS/ISIS
>> databases with GLI seem to have been resolved with v2.72 - one user
>> has successfully converted a database of 38,000 records. On the other
>> hand, GLI may fail at the explode step because it wasn't designed to
>> handle huge amounts of metadata (typically when approaching 15,000
>> CDS/ISIS records, but possibly less or greater depending on the size
>> of the records); in this case, the command line may be used, and I
>> will shortly be posting to the Wiki a summary of this process for
>> basic Greenstone users. Please do report to the discussion lists any
>> problems encountered in CDS/ISIS conversions.
>>
>> With best regards,
>> John Rose
>>
>>
>>
>>
>> John B. Rose
>> Honorary Research Associate, University of Waikato
>> Sèvres, France
>> Email: <johnrose@alumni.caltech.edu>
>>
>> _______________________________________________
>> greenstone-devel mailing list
>> greenstone-devel@list.scms.waikato.ac.nz
>> https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-devel
>>
>
>