We exported some 20.000 documents (records) from DB/TextWorks (Inmagic,
Inc - USA) in tagged format, i.e. with delimiters to indicate fields,
paragraphs and wraparounds. (This is in ASCII and all in one big file.)
For instance, the Author field has a tag AU followed by a blank space and
then the names of the authors; the Title field has a tag TI, also followed
by a space and then the document's title. Wraparound is indicated by a
space on the next line followed by the rest of the text; a new paragraph
or forced new line is indicated by a ";" (semicolon) followed by a blank
and then the text, etc. Each document is ended by a "$" (dollar) sign and
the next document begins immediately. All 20.000 documents are thus in
one big file, directly following one another. Not all of the fields occur
in all of the documents: some documents may have descriptors, others not,
etc. The same field is not equal in length across documents, and the
different fields are not equal in length either.
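
To make the layout above concrete, here is a minimal sketch of how such a file could be split into separate records outside Greenstone. It is plain Python, not a Greenstone or DB/TextWorks facility. The function name `parse_tagged`, the sample data, and the assumptions that "$" stands alone on its line and that a repeated tag continues the same field are all mine, for illustration only.

```python
def parse_tagged(text):
    """Split a tagged export into a list of {tag: value} dicts (hypothetical helper)."""
    records, current, last_tag = [], {}, None
    for line in text.splitlines():
        if line.strip() == "$":                  # assumed: "$" alone ends a record
            if current:
                records.append(current)
            current, last_tag = {}, None
        elif not line.strip():                   # ignore blank lines
            continue
        elif line.startswith(";") and last_tag:  # new paragraph in the same field
            current[last_tag] += "\n" + line[1:].strip()
        elif line.startswith(" ") and last_tag:  # wraparound of the previous line
            current[last_tag] += " " + line.strip()
        else:                                    # "TAG value" starts a field
            tag, _, value = line.partition(" ")
            last_tag = tag
            if tag in current:                   # assumed: repeated tag extends the field
                current[tag] += "\n" + value.strip()
            else:
                current[tag] = value.strip()
    if current:                                  # last record without a closing "$"
        records.append(current)
    return records


# Invented sample data in the layout described above.
sample = """AU Doe, J.
TI A very long title that
 wraps onto the next line
AB First paragraph.
; Second paragraph.
$
AU Smith, J.
TI Short title
$
"""

records = parse_tagged(sample)
print(len(records), "records")
for number, record in enumerate(records, start=1):
    print(number, sorted(record))
```

Fields that are missing from a record are simply absent from its dict, so records with different sets of fields, and fields of different lengths, need no special handling.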
We wish to import this into Greenstone, with the documents separated.
Greenstone should tell me I have 20.000 docs or records or titles. I
thought of doing this using the Organizer, but was told that it would
not work for this.
(For this exercise, I use Greenstone on my PC, an HP KAYAK XA.)
1. Is this feasible at all?
2. How do I tell Greenstone that each document ends with the "$"
and that the next line in the file is the number (or whatever the case may be) of
the next document?
3. If this is feasible, is there a method to maybe later on export the
new collection in ASCII format (in the same way as with DB/TextWorks)?
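
For question 3, the target layout would be the tagged form described at the top: tag and space, wraparound lines starting with a space, ";" for a new paragraph, and "$" to close the record. As a rough sketch only (plain Python, not a Greenstone export feature; the name `format_record`, the sample record, and the 72-column wrap width are assumptions made for illustration), a record held as a tag-to-value mapping could be written out like this:

```python
import textwrap


def format_record(record, width=72):
    """Render one {tag: value} dict in the tagged layout (hypothetical helper)."""
    lines = []
    for tag, value in record.items():
        for index, paragraph in enumerate(value.split("\n")):
            # The first paragraph carries the tag; later ones start with "; ".
            prefix = f"{tag} " if index == 0 else "; "
            wrapped = textwrap.wrap(paragraph, width=width - len(prefix)) or [""]
            lines.append(prefix + wrapped[0])
            # Wraparound lines begin with a single space.
            lines.extend(" " + rest for rest in wrapped[1:])
    lines.append("$")  # end-of-record marker
    return "\n".join(lines)


# Invented sample record.
example = {
    "AU": "Doe, J.",
    "TI": "A very long title that wraps onto the next line",
    "AB": "First paragraph.\nSecond paragraph.",
}

print(format_record(example, width=30))
```

The wrap width is arbitrary here; the real export may break lines at a different column.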
I know the Collector, but have no experience with the Organizer.
I will appreciate any help or tips. I've read the FAQ, but I don't
find my specific problem in there. I also have the User's and Developer's
Guides, but haven't found anything in there to help me; maybe I just don't

David W J Fourie [BBibl] [DDatamet]
Stelselontwikkeling, Departement Inligtingtegnologie
K2-48 Administrasiegebou (oosvleuel), Hoofkampus
UNIVERSITEIT van PRETORIA, Pretoria, Transvaal, 0002, Suid-Afrika
Tel (work): +27 (0)12 420 4080
Fax: +27 (0)12 420 4658, +27 (0)12 362 5168
www.up.ac.za/services/it/DavidWJFourie
"Malamutes are more decent than most human beings"
Robert Zoller, 1994