[greenstone-users] Greenstone 2.71 released!

From Katherine Don
Date Wed, 11 Oct 2006 15:58:49 +1300
Subject [greenstone-users] Greenstone 2.71 released!
Hi everyone,

We are pleased to announce that Windows, GNU/Linux, Mac OS/X and Source
distributions of Greenstone v2.71 are now available for download from our
sourceforge page: http://sourceforge.net/projects/greenstone, or from
greenstone.org

This is a feature release, and contains HEAPS of new functionality.
Release notes are copied below.

As always, please report any problems or bugs to the mailing list.

On behalf of the Greenstone team.

Greenstone 2.71 release notes:

New Windows, GNU/Linux, Mac OS/X and Source distributions available:
Greenstone v2.71

The Windows, GNU/Linux, Mac OS/X and Source distributions of Greenstone
are now available for download from our sourceforge page:
http://sourceforge.net/projects/greenstone, or via http://www.greenstone.org

This release is a "feature" release which means it contains a lot of new
functionality, some of which may be a bit flaky. Unless you are using
Greenstone in a production environment, we encourage you to upgrade and
report any bugs found so the next release will be a stable one. It is also
a lot easier for us to help you if you are using the latest release.

New features and changes include, in no particular order:

GLI:
- new Format panel, containing all options relating to collection
These options don't require a rebuild to take effect. Preview Collection
button now also on this Format panel.
- default indexer for new collections is MGPP
- searchtypes now a format statement instead of a design option.
- new Macros section on Format panel - can edit collection's extra.dm
- redesign of Search Indexes section of Design panel.
- Search Indexes now displays options for stem, casefolding and accentfolding
indexes (MG and MGPP only). If selected, the appropriate index will be built.
Search preferences for these options will depend on the appropriate index
being built.
- new Search section in Format panel: can edit display text for search form
drop down lists (indexes, subcollections, levels etc)
- metadata set management now on Enrich panel (Manage Metadata Sets)
- New collections default to using the Dublin Core metadata set, and no set
prompt is given. This can be changed from the Enrich panel.
- updated help text
- now restarts itself if a new language is selected.
- plugins.dat and classifiers.dat no longer used. Plugin and Classifier
information is dynamically loaded when needed.

GEMS (Greenstone Editor for Metadata Sets):
- reimplementation to make a simpler interface
- only one set is now open at a time
- A predefined set of attributes for set/element is provided
- Is launched from GLI to create a new metadata set or edit an existing one

Collection Exporting:
- File->Export in GLI. Now supports exporting as Greenstone Archive (GA),
DSpace batch import, Greenstone METS, and MARCXML formats.
- All export types support the use of XSLT to transform the resulting XML.
For example, you could export to GA format, then transform to a custom
format using an XSLT file.

Export to CDROM:
- now has a -noinstall option, so that the resulting CDROM doesn't install
anything onto the host computer

MGPP:
- maxnumeric support added
- query term truncation (e.g. comput*) with casefolding fixed for non-ascii
query terms
- accent folding support added (thanks to Juan Grigera): Generates a
new index which folds accents, in the same way that case folding
works. Accent folding means that é will match e, and vice versa.
This is turned on by the user via the preferences page.
- Mongolian unicode support
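
As a rough illustration of what accent folding means for matching (this is
not how the indexer implements it), the Python sketch below strips combining
marks after Unicode decomposition so that accented and unaccented forms
compare equal. The function names fold_accents and terms_match are made up
for this example.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining accent marks removed (illustrative only)."""
    # NFD decomposition splits an accented letter into its base letter
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def terms_match(query: str, indexed: str) -> bool:
    """Compare two terms with both accent folding and case folding applied."""
    return fold_accents(query).casefold() == fold_accents(indexed).casefold()
```

With both foldings applied, a query for "resume" would match an indexed term
"Résumé", and the reverse also holds.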

MG:
- Mongolian unicode support

Lucene:
- upgraded to version 2.0.0
- now supports return of search term frequencies, and search term
- user will be notified if stop words were included in the query
- search result sorting by metadata: search results can be sorted
by any metadata that an index was built on.
- Improvements thanks to DL Consulting

Plugins:
- ConvertToPlug: -keep_original_filename option. Original file will be
saved using its original name rather than doc.pdf, doc.doc etc.
- ISISPlug: "^*" values are now extracted for each field, which contain the
first subfield value. Logically deleted records are now ignored.
- CSVPlug: processes comma separated value files. The first line is a
list of metadata names and subsequent lines, one per record, contain the
values. A new document will be created for each record.
- MetadataCSVPlug: processes comma separated files as above. Requires a
filename field which contains the file name of the document to which the
record metadata will be assigned.
- HTMLPlug: new -extract_style option. If set, will extract any style
information from the HTML head tag (style, script, link) and store it
as ex.DocumentHeader metadata.
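
To make the CSV layout described for CSVPlug and MetadataCSVPlug concrete,
here is a small Python sketch of how such a file could be read. It is only
an illustration of the file format, not the plugins' own code. The column
name "Filename", the sample metadata names and both function names are
assumptions made for this example.

```python
import csv
import io


def records_from_csv(text):
    """One dict per record; the first line supplies the metadata names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def metadata_by_filename(text, filename_field="Filename"):
    """Map each document's file name to the other metadata in its record.

    The field name "Filename" is an assumption for this sketch.
    """
    assigned = {}
    for record in records_from_csv(text):
        name = record.pop(filename_field)
        assigned[name] = record
    return assigned


# Hypothetical sample file: a header line, then one line per record.
SAMPLE = (
    "Filename,dc.Title,dc.Creator\n"
    "report.pdf,Annual Report,J. Smith\n"
    "notes.doc,Meeting Notes,A. Jones\n"
)
```

records_from_csv(SAMPLE) yields one record per line after the header, which
corresponds to the one-document-per-record behaviour described for CSVPlug,
while metadata_by_filename(SAMPLE) shows the lookup by file name that
MetadataCSVPlug's filename field makes possible.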

Collection Building Options:
- -sortmeta option to import.pl now takes a comma separated list of
- -OIDmetadata option to import.pl: if -OIDtype is 'assigned', then
-OIDmetadata indicates which metadata element holds the identifiers,
rather than always using dc.Identifier

Incremental Building:
- GLI supports a 'minimal rebuild', which is not incremental building.
There are two stages to collection building, importing and building, and
one or both of these may be suppressed.
- If minimal rebuild is selected: importing will only be carried out if
documents have been added/removed, metadata has been edited, or plugins
have changed. Building will only be carried out if design options
have changed. If nothing in the collection has changed (apart from
formatting), then clicking Build Collection will do nothing.
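
The minimal rebuild rules above can be written out as a small decision
function. This Python sketch follows the wording of the notes literally and
is not taken from GLI itself; the Changes class, its field names and
stages_to_run are invented for the example, and whether GLI also runs the
build stage after an import is not stated above.

```python
from dataclasses import dataclass


@dataclass
class Changes:
    """What has changed in a collection since the last build (hypothetical)."""
    documents_added_or_removed: bool = False
    metadata_edited: bool = False
    plugins_changed: bool = False
    design_options_changed: bool = False
    formatting_changed: bool = False


def stages_to_run(changes: Changes) -> list:
    """Return the stages a minimal rebuild would run, per the notes above."""
    stages = []
    # Importing: only if documents, metadata or plugins have changed.
    if (changes.documents_added_or_removed
            or changes.metadata_edited
            or changes.plugins_changed):
        stages.append("import")
    # Building: only if design options have changed.
    if changes.design_options_changed:
        stages.append("build")
    # Formatting changes alone trigger neither stage.
    return stages
```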

- Command line building supports incremental addition of documents.
- -incremental option to import.pl will only import files that have a newer
timestamp than the archives.inf file in the archives directory.
- -incremental option to buildcol.pl is only supported for Lucene.
The archives.inf file will indicate which archives have been previously
built, and only new ones will be indexed. The GDBM database is recreated
without needing to re-read all the old archives. If you no longer have the
initial building directory, then use '-builddir index' to index the
new documents into the current index.
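
The timestamp test behind incremental importing can be sketched in a few
lines of Python. This only illustrates the idea of comparing file
modification times against archives.inf; it is not the import.pl
implementation, and the function name and its parameters are hypothetical.

```python
import os


def files_newer_than_archives_inf(import_dir, archives_inf):
    """List files under import_dir modified after archives_inf was written."""
    if os.path.exists(archives_inf):
        cutoff = os.path.getmtime(archives_inf)
    else:
        # Assumption: with no archives.inf, every file counts as new.
        cutoff = float("-inf")
    newer = []
    for root, _dirs, files in os.walk(import_dir):
        for name in files:
            path = os.path.join(root, name)
            # Strictly newer than archives.inf, as described above.
            if os.path.getmtime(path) > cutoff:
                newer.append(path)
    return sorted(newer)
```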


Depositor:
- New Institutional Repository style web based interface used for adding
documents and metadata into an existing collection.
- Interface customisable using macro files
- Collection design not handled by this interface.
- Enable it by setting 'depositor enabled' in etc/main.cfg
- Will use incremental import, and incremental building for Lucene

Metadata Database Exploding:
- can now obtain multiple documents for a record (and assign the metadata to
each one)
- new -records_per_folder option, which explodes the records into multiple
subfolders. This can speed up metadata editing in the GLI.
- Refer and BibTex files now explode.
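
The effect of -records_per_folder can be pictured as splitting the exploded
records into fixed-size groups. The Python sketch below shows one way to do
that grouping; the folder naming scheme and the function name are invented
for the example and may not match what Greenstone produces.

```python
def assign_folders(records, records_per_folder):
    """Group records into consecutive subfolders of limited size."""
    if records_per_folder < 1:
        raise ValueError("records_per_folder must be at least 1")
    folders = {}
    for index, record in enumerate(records):
        # Hypothetical folder names: 0001, 0002, ...
        folder = f"{index // records_per_folder + 1:04d}"
        folders.setdefault(folder, []).append(record)
    return folders
```

Five records with two records per folder would end up in three subfolders,
the last holding a single record.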

Downloading:
- Downloading files/records via HTTP, FTP, Z39.50, SRW, OAI now supported, via
the modified Download panel in GLI

Searching:
- tidied up parsing of the fielded search form (mgpp and lucene) - parsing
of an individual field in the form is the same as a single line search
- simple and advanced modes consistent between single line and fielded
- pressing enter in a fielded search form now submits the form.
- search preferences reorganised - hopefully easier to understand
- more help text, particularly for advanced and fielded searching
- casefold, stemming and accentfold preferences now dependent on collection

Metadata and Formatting:
- Date metadata and DateList now accept yyyy-mm-dd as well as yyyymmdd
- [Date] and [Language] now display the raw value. [format:Date] and
[format:Language] display the formatted values. For example,
Date=20010204: [Date] => 20010204 [format:Date] => 4 February 2001
Language=fr: [Language] => fr [format:Language] => French
- This formatting uses macros (_iso639:iso639xx_ macros in languages.dm,
in base.dm) so is customisable.
- new Greenstone metadata set, greenstone.mds, namespace 'gs'. Contains
metadata elements that are special to Greenstone, and can be edited.
Currently, there is only one element: gs.DocumentHeader. If set, the
value will be used for the macro
- macro used in HTML head tag in document
Set by the receptionist to the value of gs.DocumentHeader or
ex.DocumentHeader. This allows documents to use individual style
such as CSS stylesheets.
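
The difference between the raw and formatted values can be mimicked in
Python as follows. Greenstone does this formatting with macros, so this is
only a sketch of the behaviour described above; the function names and the
two-entry language table are assumptions made for the example.

```python
import re

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Tiny sample table for the sketch; not the full ISO 639 list.
LANGUAGES = {"fr": "French", "en": "English"}


def format_date(raw):
    """Format yyyymmdd or yyyy-mm-dd as 'd Month yyyy'; else return raw."""
    # The backreference requires both separators to be the same
    # (both hyphens, or both absent).
    match = re.fullmatch(r"(\d{4})(-?)(\d{2})\2(\d{2})", raw)
    if not match:
        return raw
    year, month, day = int(match.group(1)), int(match.group(3)), int(match.group(4))
    if not 1 <= month <= 12:
        return raw
    return f"{day} {MONTHS[month - 1]} {year}"


def format_language(code):
    """Return the language name for a code, or the code itself if unknown."""
    return LANGUAGES.get(code, code)
```

Here format_date("20010204") and format_date("2001-02-04") both give
"4 February 2001", and format_language("fr") gives "French", matching the
[format:Date] and [format:Language] examples above.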

Interface translations:

Arabic: Updated interface, many thanks to Usama Salama.
Updated GLI interface, many thanks to Kamal Salih Mustafa Khalafala.
Armenian: Updated interface, many thanks to Tigran Zargaryan.
Beginnings of auxiliary macrofile translation, many thanks to
Tigran Zargaryan.
Chinese: New GLI interface, many thanks to Li Chu, Jun Zhu, Li Zhang,
Qi Chen, and Linlin Zhong from CIT department, Fudan University,
Shanghai, China.
Gaelic: Updated interface, many thanks to Rita Campbell and Laurinda
German: Updated interface, many thanks to Sheshagin Kulkarni
Hebrew: Some updates, many thanks to Galina Bachmanova
Indonesian: Updated interface, many thanks to Dewa Asmara Widjaksana
Latvian: Updated interface, many thanks to Raitis Brodezhonok.
Marathi: Beginnings of interface, many thanks to Shubhada Nagarkar.
Mongolian: Updated interface, many thanks to Mendbayar Ichinkhorlo.
Urdu: Beginnings of interface, many thanks to Ata ur Rehman and Iqbal Rana.
Slovak: Beginnings of interface, many thanks to Tomas Fiala.
Demo collection strings translated, many thanks to Tomas Fiala.
Spanish: Updated interface, many thanks to Jesus Tramullas.
Updated perl strings, many thanks to Jesus Tramullas.
Thai: Updated interface, many thanks to Yenruedee Chanwirawong.

...and many other minor improvements and bug fixes

We want to ensure that Greenstone works well for you. Please report any
problems to one of the Greenstone mailing lists.