Re: [greenstone-devel] Version 2.72 and CDS/ISIS - order of subfields

From ruben pandolfi
Date Fri, 22 Dec 2006 06:24:44 +0100
Subject Re: [greenstone-devel] Version 2.72 and CDS/ISIS - order of subfields
In-Reply-To (45885BFE-6010508-cs-waikato-ac-nz)
Thank you Michael,

Yes, I had not thought of that.

Getting full control of each occurrence would be better, because my final
aim was:

Allègre, Claude, <i>interviewer</i>

but

Allègre, Claude - interviewer

will do for now!
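
For illustration only, here is a minimal Python sketch of the kind of per-occurrence control described above. It is not Greenstone or ISISPlug code. It assumes, going by the raw record quoted further down in this thread, that "%" separates occurrences of a repeatable field and that "^x" introduces subfield x. The function names parse_isis_field and format_contributor are hypothetical.

```python
import html


def parse_isis_field(raw, repeat_delim="%"):
    """Split a raw CDS/ISIS field into occurrences.

    Each occurrence becomes an ordered list of (subfield_code, value)
    pairs, so both the order of the subfields and their association
    with a particular occurrence are preserved.
    """
    occurrences = []
    for occurrence in raw.split(repeat_delim):
        head, *rest = occurrence.split("^")
        subfields = []
        if head:
            # Text before the first '^' has no subfield code.
            subfields.append(("", head))
        for chunk in rest:
            if chunk:
                subfields.append((chunk[0], chunk[1:]))
        occurrences.append(subfields)
    return occurrences


def format_contributor(subfields, separator=" - "):
    """Join the subfields of one occurrence, putting ^q in italics."""
    parts = []
    for code, value in subfields:
        text = html.escape(value)
        if code == "q":
            text = f"<i>{text}</i>"
        parts.append(text)
    return separator.join(parts)


raw = "^aMorin, Edgar%^aAllègre, Claude^qinterviewer"
for occurrence in parse_isis_field(raw):
    print(format_contributor(occurrence, separator=", "))
```

Because each occurrence keeps its own subfield list, the "interviewer" role stays attached to the second contributor rather than drifting to the first, and the separator and the markup for ^q can be chosen independently.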

Happy Xmas from Italy to everybody, and enjoy your sun and sea!

ruben

Michael Dewsnip wrote:
> Hi Ruben,
>
> Don't you get the effect you're after just by setting the
> "-subfield_separator" option to " - "? The default for this option is
> ", ", which is why you get "Allègre, Claude, interviewer".
>
> All the best,
>=20
> Michael
>
>
>
> ruben pandolfi wrote:
>
>> Hello John,
>>
>> Thank you very much for your work and for your message,
>>
>> I have a little question about CDS/ISIS: do you think it is possible to
>> maintain subfield order and relation?
>>
>> Example:
>>
>> extract from doc.xml
>>
>> ......................................................................
>> <Description>
>> <Metadata name="gsdlsourcefilename">import/BABEL3.00000301/00000400.nul</Metadata>
>> <Metadata name="gsdldoctype">indexed_doc</Metadata>
>> <Metadata name="Plugin">NULPlug</Metadata>
>> <Metadata name="Source">00000400.nul</Metadata>
>> <Metadata name="FileSize">0</Metadata>
>> <Metadata name="null_file">00000400.nul</Metadata>
>> <Metadata name="dc.DcContributor^a">Morin, Edgar</Metadata>
>> <Metadata name="dc.DcContributor^a">Allègre, Claude</Metadata>
>> <Metadata name="dc.DcContributor^*">Morin, Edgar</Metadata>
>> <Metadata name="dc.DcContributor^*">Allègre, Claude</Metadata>
>> <Metadata name="dc.DcDate">1994</Metadata>
>> <Metadata name="dc.DcType">entretien</Metadata>
>> <Metadata name="dc.ModsRecordInfoRecordIdentifi">400</Metadata>
>> <Metadata name="dc.DcContributor^q">interviewer</Metadata>
>> <Metadata name="dc.DcTitle">Edgar Allegrèment</Metadata>
>> <Metadata name="dc.ModsOriginInfoPlace">AR</Metadata>
>> <Metadata name="dc.DcContributor">Morin, Edgar</Metadata>
>> <Metadata name="dc.DcContributor">Allègre, Claude, interviewer</Metadata>
>> <Metadata name="dc.DcLanguage">fre</Metadata>
>> <Metadata name="dc.DcIdentifier">babel-id-400</Metadata>
>> <Metadata name="dc.ISISRawRecord">
>> tag=10 data=Edgar Allegrèment
>> tag=20 data=entretien
>> tag=30 data=^aMorin, Edgar%^aAllègre, Claude^qinterviewer
>> tag=40 data=fre
>> tag=65 data=AR
>> tag=85 data=1994
>> tag=90 data=babel-id-400
>> tag=190 data=400
>> </Metadata>
>> <Metadata name="Title">00000400</Metadata>
>> <Metadata name="Identifier">HASH01f630c90dd3b1b10d811117</Metadata>
>> <Metadata name="assocfilepath">HASH01f6.dir</Metadata>
>> </Description>
>>
>> ......................................................................
>>
>> In this case I would like to display the isis record:
>>
>> tag=30 data=^aMorin, Edgar%^aAllègre, Claude^qinterviewer
>>
>>
>> as
>>
>> Morin, Edgar
>> Allègre, Claude - interviewer
>>
>>
>> The closest I can get at the moment is with:
>> ......................................................................
>>
>> {If}{[dc.DcContributor], <tr class="metadata"><td
>> valign=top><b>_Co_:</b></td><td valign=top>[sibling(all\' <br />
>> '):dc.DcContributor]</td></tr>}
>> ......................................................................
>>
>> This displays:
>>
>> ......................................................................
>> Author: Morin, Edgar
>> Allègre, Claude, interviewer
>> ......................................................................
>>
>> but I do not have a way to format dc.DcContributor^q separately.
>>
>> On the other hand, if I take every single subfield on its own, I always get:
>> ......................................................................
>> Author: Morin, Edgar, interviewer
>> Allègre, Claude
>> ......................................................................
>>
>> because GSDL does not keep the order or association of subfields.
>>
>> Do you think there is a way to achieve that when importing/exploding
>> a CDS/ISIS database?
>>
>> Thank you
>>
>> Ruben
>>
>>
>>
>> John Rose wrote:
>>
>>> Dear Greenstone users/developers,
>>>
>>> I have been working with the Greenstone team to ensure liaison with
>>> CDS/ISIS users, and am taking this opportunity to list (in more
>>> detail than in the release announcements) the improvements for
>>> CDS/ISIS database conversion in Greenstone version 2.72 relative to
>>> version 2.70 (most of these functions were available in 2.71 but some
>>> had bugs or have been further improved, so CDS/ISIS users wishing to
>>> benefit from them are advised to upgrade to 2.72):
>>>
>>> 1. The ^* metadata element is available to access the first subfield
>>> of a field with subfields (even if it is the main field without a
>>> delimiting prefix).
>>>
>>> 2. Backslashes in a CDS/ISIS field (e.g. Windows file paths) will
>>> display correctly with Greenstone formatting language.
>>>
>>> 3. Support for DOS 852 coding (needed for DOS-based CDS/ISIS
>>> databases in Eastern European languages).
>>>
>>> 4. Logically deleted records will not be imported (with prior
>>> versions, you had to export to an ISO file and re-import into
>>> CDS/ISIS before converting to Greenstone).
>>>
>>> 5. A "-records_per_folder" option has been added to the explode
>>> function. This puts the records from exploding a metadata database
>>> into multiple subdirectories, which means that the GLI should use
>>> less memory and edit the metadata more quickly. This option has not
>>> yet been tested for its usefulness in real conversion situations; it
>>> may be tried for large databases in which the time for explosion
>>> seems inordinately long. The default value is 100, so you can try a
>>> lower value, say 10.
>>>
>>> 6. A bug under Linux, by which the CDS/ISIS files with filenames in
>>> capital letters were not handled correctly, has been fixed
>>> (previously the filenames had to be changed manually to small letters
>>> before dragging them into GLI).
>>>
>>> 7. '&' characters and spaces in filenames now work in the
>>> "document_field" parameter of the explode function (previously, the
>>> corresponding documents were not imported).
>>>
>>> 8. When the "document_field" CDS/ISIS field is repeatable, each
>>> occurrence will yield a separate Greenstone document, each with the
>>> same metadata (previously only the first occurrence was imported).
>>>
>>> 9. Building a CDS/ISIS collection (either "as is", i.e. metadata
>>> only, or by exploding) should be significantly faster in Greenstone
>>> v2.71, as it no longer tries to determine the encoding of the
>>> CDS/ISIS file.
>>>
>>> All reported problems with the "as is" conversion of large CDS/ISIS
>>> databases with GLI seem to have been resolved with v2.72 - one user
>>> has successfully converted a database of 38,000 records. On the other
>>> hand, GLI may fail at the explode step because it wasn't designed to
>>> handle huge amounts of metadata (typically when approaching 15,000
>>> CDS/ISIS records, but possibly less or greater depending on the size
>>> of the records); in this case, the command line may be used, and I
>>> will shortly be posting to the Wiki a summary of this process for
>>> basic Greenstone users. Please do report to the discussion lists any
>>> problems encountered in CDS/ISIS conversions.
>>>
>>> With best regards,
>>> John Rose
>>>
>>>
>>>
>>>
>>> John B. Rose
>>> Honorary Research Associate, University of Waikato
>>> Sèvres, France
>>> Email: <johnrose@alumni.caltech.edu>
>>>
>>> _______________________________________________
>>> greenstone-devel mailing list
>>> greenstone-devel@list.scms.waikato.ac.nz
>>> https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-devel
>>>
>>
>
>


--

..................
..................

Ruben Pandolfi

-------------------------------------------------------------
"...I Think This is the Beginning of a Beautiful Friendship."
-------------------------------------------------------------