|Expanding Access to Science and Technology (UNU, 1994, 462 pages)|
|Session 2b: The technological experience: information resources and networks|
|Databases and data banks|
6.1 Paper Products
Not very long ago, the principal product derived from databases was in printed form. Bibliographic bulletins, under various names, contained the whole or part of the database. Their manufacture and distribution did not require any advanced technology, and the revenue from subscriptions, received in advance, represented a financial guarantee for the producers. These products, still widely distributed, are nevertheless being replaced little by little by other methods of accessing the stored information. Microforms, which for a short time were fashionable enough to suggest that they might replace paper products, have not gone beyond the status of storage media. If they do not disappear altogether, they will become an adjunct to paper products.
6.2 On-line Access
On-line distribution of databases began at the end of the 1960s as a part of timesharing systems. The 1970s saw the beginning of their growth. The first search systems such as STAIRS, RECON, and ELHILL/ORBIT are still recognized names. The National Library of Medicine and NASA were pioneers in the field. The systems were at the time limited by poor telecommunications and by prohibitive storage costs.
Nevertheless, the market began to emerge, as did the idea of hosts, organizations that take responsibility for the distribution of databases produced by others. These hosts provide search systems, storage media, networks, and user training, in general in collaboration with the producers. The hosts can be classified either as supermarkets or as specialists. Some producers are at the same time hosts.
Although relations between database producers and hosts are generally good, a tendency towards competition, and sometimes even confrontation, is beginning to appear (a recent example illustrates this). A recent article by Harry F. Boyle of Chemical Abstracts Service on the relations between hosts and guests, published in the ICSTI Proceedings, 1991, describes the current situation in detail.
Competition between producers is also significant: more than 5,000 databases are available on 800 hosts. Users are becoming more demanding with respect to the services they expect. Some of them are beginning to feel that existing on-line systems, based on the Boolean model, are by nature limited. Several other models have been proposed (vectorial, probabilistic, extended Boolean, fuzzy set) that aim to improve on the performance of the Boolean model, but none has yet developed into a large-scale commercial application.
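The difference between the Boolean model and one of the proposed alternatives, the vectorial model, can be illustrated with a minimal sketch. The toy collection, the function names, and the use of raw term frequencies with cosine similarity are assumptions made for illustration only; they do not describe how any particular host implements its search system.

```python
import math
from collections import Counter

# Hypothetical toy collection; the records are invented for illustration.
DOCS = {
    "d1": "polymer synthesis catalyst",
    "d2": "catalyst poisoning in polymer reactors",
    "d3": "protein synthesis in cells",
}


def tokenize(text):
    """Split a text into lower-case terms on whitespace."""
    return text.lower().split()


def boolean_and(query, docs):
    """Boolean model: return ids of documents containing every query term.

    The result is an unranked set; a document either matches or it does not.
    """
    terms = set(tokenize(query))
    return sorted(d for d, text in docs.items() if terms <= set(tokenize(text)))


def cosine_rank(query, docs):
    """Vectorial model: rank documents by cosine similarity to the query.

    Vectors hold raw term frequencies (an assumption; weighted schemes exist).
    Documents sharing no term with the query are left out.
    """
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scored = []
    for d, text in docs.items():
        vec = Counter(tokenize(text))
        dot = sum(q[t] * vec[t] for t in q)
        if dot > 0:
            d_norm = math.sqrt(sum(v * v for v in vec.values()))
            scored.append((d, dot / (q_norm * d_norm)))
    # Highest score first; ties broken by document id.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


BOOLEAN_HITS = boolean_and("polymer synthesis", DOCS)
RANKED_HITS = cosine_rank("polymer synthesis", DOCS)
```

In this sketch the Boolean query retrieves only the record containing both terms, whereas the vectorial ranking also returns the partially matching records, ordered by similarity. That graded response is the kind of behaviour the alternative models aim for.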
Josephine Maxon-Dadd of Dialog Information Services, in Trends in Database Design and Customer Services, published by NFAIS, has described the ideal database:
An easy link to full text
Controlled vocabulary (hierarchical) maintained and updated over the whole file
Uncontrolled vocabulary too, perhaps for trade names, proper names, or synonyms
Title, a reasonable number of authors, a good abstract
Bibliographic data fully identified and searchable
Complete coverage of every journal title included
No internal duplicates
Subject classification scheme (text and code searchable)
User-friendly scientific notation
I think that this eloquent list should give all database producers pause, above all now that the same databases are increasingly used to produce derivative products or are subjected to increasingly sophisticated processing.
6.3 CD-ROM
The CD-ROM (compact disc read-only memory) is a database distribution medium that was introduced some years ago for the storage of texts and graphics. It offers much the same advantages as the microfiche text stores of earlier decades. The disks are relatively easy to produce and to duplicate; they are also easy to ship from place to place, and can therefore be used for local storage of databases and for local retrieval activities. CD-ROMs also provide high-density storage for both text and graphics. A standard disk will store up to 600 million bytes of information.
When coupled with a personal computer, the potential of the medium is greatly enhanced. However, CD technology is somewhat hampered when the size of the database exceeds the capacity of a single disk. In this case, a complete search of the database currently requires the user to change disks or to use a jukebox.
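The single-disk limit is a matter of simple arithmetic, sketched below. The 600-million-byte capacity is the figure quoted above; the function name and the example database size are hypothetical.

```python
DISK_CAPACITY_BYTES = 600_000_000  # capacity of a standard disk, as quoted in the text


def disks_needed(database_bytes, capacity=DISK_CAPACITY_BYTES):
    """Return the number of disks required to hold a database of the given size."""
    if database_bytes < 0:
        raise ValueError("database size cannot be negative")
    # Integer ceiling division: any remainder requires one further disk.
    return (database_bytes + capacity - 1) // capacity


# Hypothetical example: a 2.5-billion-byte database.
EXAMPLE_DISKS = disks_needed(2_500_000_000)
```

A database that needs more than one disk by this count is one for which a complete search involves changing disks or using a jukebox.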
6.4 Floppy Disks
The floppy disk information delivery medium is emerging as an option for personal computers. Disks are easy to manufacture and can be produced in-house. They can be produced for a variety of operating systems.
6.5 New Methods of Access to Information
Most attempts to improve access to information contained in databases are aimed at moving from documentation to information. In fact, on-line information is little used by companies despite their needs. According to recent figures, databases provide only 7 per cent of the total information processed by companies.
According to Nicolas Grandjean of Synthélabo, what the user really wants is the answer. Real user-friendliness lies in the relevance of the reply to the question asked, not to the information request. Databases give raw information, in which the answer is hidden in primary documents. In addition, the answer is often complex, in that it requires correlating several documents. Grandjean does not think that we can stay with these relatively unsophisticated information systems, above all given the volume of information available today. New techniques now under study address this problem by adding a new dimension to information systems.
The sum of the documents contained in a database possesses properties independent of the documents taken separately. These properties can be exploited, both in themselves and to design tools for aiding indexing and searching.
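One simple instance of such collection-level properties is the frequency and co-occurrence of index terms across the whole file, which no single record exhibits on its own. The following is a minimal sketch; the records, the term names, and the function names are invented for illustration, and counting co-occurrences is only one of many possible ways of exploiting these properties.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each set holds the index terms assigned to one document.
RECORDS = [
    {"aspirin", "analgesic", "inflammation"},
    {"aspirin", "inflammation", "platelet"},
    {"ibuprofen", "analgesic", "inflammation"},
    {"aspirin", "platelet"},
]


def document_frequency(records):
    """Count, for each term, the number of records it is assigned to."""
    df = Counter()
    for terms in records:
        df.update(terms)
    return df


def cooccurrence(records):
    """Count how often each unordered pair of terms appears in the same record."""
    pairs = Counter()
    for terms in records:
        pairs.update(combinations(sorted(terms), 2))
    return pairs


def related_terms(term, records, top=3):
    """Suggest the terms that most often co-occur with the given term."""
    scores = Counter()
    for (a, b), n in cooccurrence(records).items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    # Most frequent partner first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


SUGGESTIONS = related_terms("aspirin", RECORDS)
```

Suggestions of this kind could be offered to an indexer as candidate terms or to a searcher as ways of broadening a query, which is the sense in which the properties of the whole file serve as aids to indexing and searching.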