Re: [greenstone-users] Dead link & multiple creators

From Michael Dewsnip
Date Tue, 30 May 2006 15:38:17 +1200
Subject Re: [greenstone-users] Dead link & multiple creators
In-Reply-To (c0f244f0605291956r1433f17ck13b9641e3ee91119-mail-gmail-com)
Hi Leon,

Can you please send me (off list) the metadata.xml file in the
"import\Governance Conference\Governance, Economic Growth, Sustainable
Development" folder.

Cheers,

Michael

Leon White wrote:

> Hi Michael,
>
> yes, it seems to contain the ex.* metadata. More interestingly, it
> seems to specifically omit the dc.Creator metadata, although as you
> can see from the previously included screenshot, that metadata has
> definitely been entered. I tried removing the data in the GLI and
> entering it again, which had no effect. Directly modifying the
> metadata in the XML file and rebuilding didn't work either, because the
> build process overwrites whatever is in the archives folders, I guess.
> Here's my two doc.xml files from the screenshot before:
>
> BAD (dc.Creator usually comes after dc.Language):
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <!DOCTYPE Archive SYSTEM
> "http://greenstone.org/dtd/Archive/1.0/Archive.dtd">
> <Archive>
> <Section>
> <Description>
> <Metadata name="gsdldoctype">indexed_doc</Metadata>
> <Metadata name="Language">en</Metadata>
> <Metadata name="Encoding">utf8</Metadata>
> <Metadata name="dc.Type">Conference Paper</Metadata>
> <Metadata name="dc.Language">English</Metadata>
> <Metadata name="dc.Source">Governance in Pacific States Development
> Research Symposium</Metadata>
> <Metadata name=" dc.Format">PDF</Metadata>
> <Metadata name="dc.Publisher">Pacific Institute of Advanced Studies in
> Development and Governance, University of the South Pacific</Metadata>
> <Metadata name=" dc.Subject">governance</Metadata>
> <Metadata name="dc.Subject">development research</Metadata>
> <Metadata name="dc.Subject">economic growth</Metadata>
> <Metadata name="dc.Subject">sustainable development</Metadata>
> <Metadata name="dc.Date">20030930</Metadata>
> <Metadata name="GENERATOR">pdftohtml 0.36</Metadata>
> <Metadata name="Creator">Pramendra Sharma and Mahendra Reddy</Metadata>
> <Metadata name="Title">Governance-Growth Nexus: An exposition and some
> examples from Fiji's Financial Sector</Metadata>
> <Metadata
> name="URL">http://E:/Greenstone/gsdl/collect/dig-gov/tmp/PramendraSharmaandMahendraReddyGovernanceGrowthNexusAnexpositionandsomeexamplesfromFiji'sFinancialSector.html</Metadata>
> <Metadata name="gsdlsourcefilename">importGovernance
> ConferenceGovernance, Economic Growth, Sustainable
> DevelopmentPramendra Sharma and Mahendra Reddy - Governance-Growth
> Nexus- An exposition and some examples from Fiji's Financial
> Sector.pdf</Metadata>
> <Metadata
> name="gsdlconvertedfilename">tmpPramendraSharmaandMahendraReddyGovernanceGrowthNexusAnexpositionandsomeexamplesfromFiji'sFinancialSector.html</Metadata>
> <Metadata name="Source">Pramendra Sharma and Mahendra Reddy -
> Governance-Growth Nexus- An exposition and some examples from Fiji's
> Financial Sector.pdf</Metadata>
> <Metadata name="Plugin">PDFPlug</Metadata>
> <Metadata name="FileSize">270915</Metadata>
> <Metadata name="FileFormat">PDF</Metadata>
> <Metadata name="srclink">&lt;a
> href=&quot;/gsdl/collect/gsarch/index/assoc/[archivedir]/doc.pdf&quot;&gt;</Metadata>
>
> <Metadata name="srcicon">View the PDF document</Metadata>
> <Metadata name="/srclink">&lt;/a&gt;</Metadata>
> <Metadata name="Date">20060518</Metadata>
> <Metadata name="NumPages">17</Metadata>
> <Metadata name="Identifier">HASH99a7ffe247c508cd966910</Metadata>
> <Metadata name="assocfilepath">HASH99a7.dir </Metadata>
> <Metadata name="gsdlassocfile">doc.pdf:application/pdf:</Metadata>
> </Description>
> <Content>
>
>
> GOOD:
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <!DOCTYPE Archive SYSTEM
> "http://greenstone.org/dtd/Archive/1.0/Archive.dtd">
> <Archive>
> <Section>
> <Description>
> <Metadata name="gsdldoctype">indexed_doc</Metadata>
> <Metadata name="Language">en</Metadata>
> <Metadata name="Encoding">utf8</Metadata>
> <Metadata name="dc.Type">Conference Paper</Metadata>
> <Metadata name="dc.Language">English</Metadata>
> <Metadata name="dc.Creator">Parmod Chand</Metadata>
> <Metadata name="dc.Creator">Michael White</Metadata>
> <Metadata name="dc.Source">Governance in Pacific States Development
> Research Symposium</Metadata>
> <Metadata name=" dc.Format">PDF</Metadata>
> <Metadata name="dc.Coverage">Fiji</Metadata>
> <Metadata name="dc.Publisher">Pacific Institute of Advanced Studies in
> Development and Governance, University of the South Pacific</Metadata>
> <Metadata name="dc.Subject">governance</Metadata>
> <Metadata name="dc.Subject">development research</Metadata>
> <Metadata name="dc.Subject">economic growth</Metadata>
> <Metadata name="dc.Subject">sustainable development</Metadata>
> <Metadata name="dc.Date">20030930</Metadata>
> <Metadata name="GENERATOR">pdftohtml 0.36</Metadata>
> <Metadata name="Creator">Parmod Chand and Michael White</Metadata>
> <Metadata name="Title">Accountability Of The Private Sector To The
> People Of Fiji - Regulation Through Global Pressure And Professional
> Interests</Metadata>
> <Metadata
> name="URL">http://E:/Greenstone/gsdl/collect/dig-gov/tmp/ParmodChandandMichaelWhiteAccountabilityOfThePrivateSectorToThePeopleOfFiji.html</Metadata>
> <Metadata name="gsdlsourcefilename">importGovernance
> ConferenceGovernance, Economic Growth, Sustainable DevelopmentParmod
> Chand and Michael White - Accountability Of The Private Sector To The
> People Of Fiji.pdf</Metadata>
> <Metadata
> name="gsdlconvertedfilename">tmpParmodChandandMichaelWhiteAccountabilityOfThePrivateSectorToThePeopleOfFiji.html</Metadata>
> <Metadata name="Source">Parmod Chand and Michael White -
> Accountability Of The Private Sector To The People Of Fiji.pdf</Metadata>
> <Metadata name="Plugin">PDFPlug</Metadata>
> <Metadata name="FileSize">305672</Metadata>
> <Metadata name="FileFormat">PDF</Metadata>
> <Metadata name="srclink">&lt;a
> href=&quot;/gsdl/collect/gsarch/index/assoc/[archivedir]/doc.pdf&quot;&gt;</Metadata>
> <Metadata name="srcicon">View the PDF document</Metadata>
> <Metadata name="/srclink">&lt;/a&gt;</Metadata>
> <Metadata name="Date">20060519</Metadata>
> <Metadata name="NumPages">20</Metadata>
> <Metadata name="Identifier">HASH8ddfcbd6db3da2c4ce9b89</Metadata>
> <Metadata name="assocfilepath">HASH8ddf.dir</Metadata>
> <Metadata
> name="gsdlassocfile">ParmodChandandMichaelWhiteAccountabilityOfThePrivateSectorToThePeopleOfFiji-1_1.jpg:image/jpeg:</Metadata>
>
> <Metadata
> name="gsdlassocfile">ParmodChandandMichaelWhiteAccountabilityOfThePrivateSectorToThePeopleOfFiji-2_1.jpg:image/jpeg:</Metadata>
> <Metadata
> name="gsdlassocfile">ParmodChandandMichaelWhiteAccountabilityOfThePrivateSectorToThePeopleOfFiji-2_2.jpg:image/jpeg:</Metadata>
>
> <Metadata name="gsdlassocfile">doc.pdf:application/pdf:</Metadata>
> </Description>
> <Content>
>
> Weird?
>
> Cheers
> Leon
>
> On 5/30/06, Michael Dewsnip <mdewsnip@cs.waikato.ac.nz> wrote:
>
> Hi Leon,
>
> Please find the doc.xml file for a problem document in the collection
> "archives" directory. Does it include the extracted metadata
> correctly?
>
> Regards,
>
> Michael
>
>
>
> Leon White wrote:
>
> > Hi Katherine, List,
> >
> > thanks for your elegant solution, and sorry about this long mail, but
> > I am experiencing some very strange behaviour... I added dc.Creator
> > metadata only to those publications with multiple authors, leaving the
> > rest with the automatically extracted ex.Creator metadata. For the most
> > part this seems to be working perfectly, but there are some very
> > strange exceptions (bugs?).
> >
> > After a full build of the collection, returning to the 'Enrich' tab
> > shows that certain documents do not have any ex.Creator metadata,
> > indeed no ex.* metadata at all. This cannot be true, because the
> > titles and other stuff I use from the ex.* set are visible in the
> > collection frontend, yet it is exactly these publications which are
> > reverting to the old "<author1> and <author2>" display. Why do some
> > documents not get any ex metadata? I have triple-checked that the
> > PDFs do indeed have their metadata entered correctly. See the two
> > attached images for my view of the GLI 'Enrich' tab immediately
> > after a build.
> >
> > For examples, search my collection
> > <http://www.rkb.usp.ac.fj/gsdl/cgi-bin/library.exe?site=localhost&a=p&p=about&c=dig-gov&ct=0&l=en&w=utf-8%2520rtekeep=>
> > for 'chand' (an example of things working properly) and 'pramendra'
> > (an example of the wrong behaviour). This can also clearly be seen in
> > the list of authors.
> >
> > Once this problem is fixed, is it possible to adjust the
> > {Or}{[sibling:dc.Creator],[sibling:ex.Creator]} line so that it adds
> > an 'and' or a few nbsp's between the siblings? The separation
> > of metadata should be transparent to the end user.
> >
> > Thank you very much and I'm looking forward to your reply,
> > Leon
> >
> > On 5/29/06, Katherine Don <kjdon@cs.waikato.ac.nz> wrote:
> >
> > Hi Leon
> > Is your problem that the ex.Creator has a value like "smith, jones"
> > and you are getting a bookshelf "smith, jones" but you want two
> > bookshelves, "smith" and "jones"?
> >
> > We don't have any way of splitting metadata, so your best solution
> > would be to assign dc.Creator metadata to those documents whose
> > extracted Creator lists multiple authors, and use
> > AZCompactList -metadata dc.Creator,ex.Creator
> > If you then want to display the Creator for the document node (and not
> > just in the bookshelf), then use
> > {Or}{[sibling:dc.Creator],[sibling:ex.Creator]}
> > You want dc first so that it will use that if it's there; otherwise it
> > will use ex.Creator.
> > sibling will display all values of multivalued metadata.
> >
> > Hope this helps,
> > Katherine
> >
> > > Specifically, my problem is as follows: I am importing metadata such
> > > as ex.Title and ex.Creator from the PDF document properties through
> > > PDFPlug. I want to generate an AZCompactList from the ex.Creator
> > > data. However, some documents have multiple authors entered, which
> > > means these authors appear both on their own and once again together
> > > with their co-authors on the list. Ideally the two authors should
> > > appear individually in the list with the document credited to each of
> > > them exactly once. Can I establish an override along the lines of
> > > {Or}{[ex.Creator],[dc.Creator]} for this, and manually enter author
> > > data for documents with multiple authors?
> > >
> >------------------------------------------------------------------------
> >
> >_______________________________________________
> >greenstone-users mailing list
> >greenstone-users@list.scms.waikato.ac.nz
> > https://list.scms.waikato.ac.nz/mailman/listinfo/greenstone-users
> >
> >
>
>
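
For anyone comparing doc.xml files the way the BAD and GOOD listings above
are compared, the check can be sketched in a few lines of Python. This is
only a rough illustration, not part of Greenstone: the function names
(metadata_names, check_doc) and the result keys ("missing", "padded") are
made up for this example. It reports which required metadata names (here
dc.Creator) are absent, and also flags names with stray surrounding
whitespace such as " dc.Format", which appear in the quoted listings and may
simply be an artifact of email wrapping.

```python
import xml.etree.ElementTree as ET


def metadata_names(doc_xml_text):
    """Return the name attribute of every <Metadata> element, in document order."""
    root = ET.fromstring(doc_xml_text)
    return [m.get("name") for m in root.iter("Metadata")]


def check_doc(doc_xml_text, required=("dc.Creator",)):
    """Report required names that are absent and names with stray whitespace.

    The 'required' default and the result keys are assumptions made for
    this sketch; adjust them to whatever metadata you expect.
    """
    names = metadata_names(doc_xml_text)
    missing = [r for r in required if r not in names]
    padded = [n for n in names if n is not None and n != n.strip()]
    return {"missing": missing, "padded": padded}


# A cut-down stand-in for the BAD listing (not the full file).
sample = (
    "<Archive><Section><Description>"
    '<Metadata name="dc.Language">English</Metadata>'
    '<Metadata name=" dc.Format">PDF</Metadata>'
    '<Metadata name="Creator">Pramendra Sharma and Mahendra Reddy</Metadata>'
    "</Description><Content/></Section></Archive>"
)

print(check_doc(sample))
```

To run it over a real archive file, the root element could instead be
obtained with ET.parse(path).getroot(), where path is the doc.xml in the
collection's "archives" directory.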