|Expanding Access to Science and Technology (UNU, 1994, 462 pages)|
|Session 3: New technologies and media for information retrieval and transfer|
|Computerized front-ends in retrieval systems|
Several front-ends and related knowledge bases are briefly described to illustrate the state of the art, with particular attention to support for accessing scientific, technical, and medical information.
4.1 Medicine: Grateful Med and Loansome Doc
To allow physicians and other health care professionals to search a variety of medical databases, such as Medline available on the National Library of Medicine's (NLM) MEDLARS system, staff at the National Library of Medicine have developed Grateful Med for the PC, with Version 6.0 scheduled for release in June 1992. It assists with menu-driven off-line entry of strategies. Once on-line, it automatically reformats the terms entered into MEDLARS commands, executes the search, saves the results, logs off the system, and reformats and displays the citations. The Grateful Med software generates suggested controlled vocabulary terms (Medical Subject Headings) based on retrieved Medline citations. When a strategy results in zero retrieval, a help screen is available that offers suggestions for modifying search strategies. COACH, an expert searcher program to help Grateful Med users improve their retrieval, is currently under development.
Since Grateful Med accesses bibliographic databases, users also need assistance in locating the actual documents. Loansome Doc, introduced in 1991, allows the individual user to place an on-line order for a copy of the full article for any reference retrieved from Medline. If the user's library can fill the document request directly or if it is filled through interlibrary loan, the user receives a photocopy by the preferred delivery method (e.g., mail or fax).
4.2 Medicine: Unified Medical Language System
The goal of the National Library of Medicine's Unified Medical Language System (UMLS) project is to give easy access to machine-readable information from diverse sources, including the scientific literature, patient records, factual data banks, and knowledge-based expert systems. The barriers to integrated access to information in these sources include the variety of ways the same concepts are expressed in the different machine-readable sources (and by users themselves), and the difficulty of identifying which of many existing databases have information relevant to particular questions. The UMLS approach to overcoming these barriers is to develop "knowledge sources" that can be used by a wide variety of application programs to compensate for differences in the way concepts are expressed, to identify the information sources most relevant to a user query, and to carry out the telecommunications and search procedures necessary to retrieve information from these information sources.
The three UMLS knowledge sources are: (1) a metathesaurus of concepts and terms from several biomedical vocabularies and classifications; (2) a semantic network of the relationships among semantic types or categories to which all concepts in the metathesaurus are assigned; and (3) an information sources map that describes the content and access conditions for the available biomedical databases in both human-readable and machine-readable form. Objectives for the next three years of the UMLS project are to develop and implement important applications that rely on the UMLS knowledge sources, to establish production systems for ongoing expansion and maintenance of the knowledge sources, and to expand the content of the knowledge sources to support the applications being developed. The NLM plans to develop these capabilities within its existing user interface, Grateful Med, and in COACH. For example, COACH uses the metathesaurus to augment user search terms and to help find new terms.
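The idea of using a metathesaurus to augment a user's search terms can be illustrated with a minimal sketch. The concept identifiers, the term lists, and the function names below are all invented for illustration; they are not taken from the UMLS knowledge sources or from COACH, whose internals are not described here.

```python
# Toy stand-in for a metathesaurus: each concept groups the terms that
# different vocabularies use for the same idea. All entries are illustrative.
CONCEPTS = {
    "C001": {"heart attack", "myocardial infarction", "mi"},
    "C002": {"high blood pressure", "hypertension"},
}


def build_term_index(concepts):
    """Map every lower-cased term to the concept identifier it belongs to."""
    index = {}
    for concept_id, terms in concepts.items():
        for term in terms:
            index[term.lower()] = concept_id
    return index


def expand_term(term, concepts=CONCEPTS):
    """Return all terms that share a concept with the user's term.

    Unknown terms are returned unchanged so the search can still proceed.
    """
    index = build_term_index(concepts)
    concept_id = index.get(term.strip().lower())
    if concept_id is None:
        return [term]
    return sorted(concepts[concept_id])


print(expand_term("Heart attack"))
# prints ['heart attack', 'mi', 'myocardial infarction']
```

A front-end could OR the expanded terms together before sending the search, so that a query phrased in everyday language also retrieves records indexed under the controlled vocabulary term.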
4.3 Environment: ANSWER
ANSWER is a stand-alone microcomputer-based workstation designed for use by health professionals and related personnel in US state and federal agencies responding to hazardous chemical situations. It was developed by the Toxicology Information Program of the National Library of Medicine for the Agency for Toxic Substances and Disease Registry. ANSWER illustrates the possibilities of using local data access and front-end capabilities in support of problem solving in emergency situations. ANSWER includes: a CD-ROM database with information on the medical and hazard management of exposure to over 1,000 hazardous substances; a database of information on previous chemical emergencies; a gateway to the National Weather Service's on-line information (automatic dial-up, log-on, and data capture for state, regional, and local weather information); an air dispersion modelling package for determining plume path and dispersion; a front-end for access to chemical, toxicological, and hazardous waste files located in various governmental and private sector on-line systems; and a report generation capability for editing, sorting, merging, and transforming retrieved data files.
4.4 Environment: Eco-Link
Eco-Link was developed as an electronic research system to take advantage of electronic sources of information on the environment and to coordinate their acquisition, storage, and presentation. It integrates a wide variety of data from electronic sources relating to the environment. The heart of the system consists of download-filter-manage software routines that automate access to electronic databases and process the acquired information so as to merge data from a broad range of different sources in a common set of locally constructed databases. Eco-Link standardizes output from on-line catalogues reachable through the Internet, bibliographic citations from locally mounted databases for newspapers and journal articles, and information from commercial vendors of full text, directories, news sources, and statistical data.
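A download-filter-manage pipeline of this kind can be sketched as a set of per-source filters that map each record into one common schema before merging. The field names (`TI`, `PY`, `headline`, `date`), the two source types, and the duplicate rule are assumptions made for this example, not details of the Eco-Link software.

```python
def from_catalogue(record):
    """Normalize a hypothetical library-catalogue record."""
    return {
        "title": record["TI"].strip(),
        "year": int(record["PY"]),
        "source": "catalogue",
    }


def from_news(record):
    """Normalize a hypothetical news record."""
    return {
        "title": record["headline"].strip(),
        "year": int(record["date"][:4]),  # assumes ISO dates, e.g. 1992-03-14
        "source": "news",
    }


# One filter per source; adding a source means adding one function here.
FILTERS = {"catalogue": from_catalogue, "news": from_news}


def merge(downloads):
    """Filter each downloaded batch into the common schema and merge them.

    `downloads` is a list of (source_name, records) pairs. Records that share
    a title (ignoring case) and a year are kept only once.
    """
    merged, seen = [], set()
    for source_name, records in downloads:
        normalize = FILTERS[source_name]
        for raw in records:
            item = normalize(raw)
            key = (item["title"].lower(), item["year"])
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged
```

Keeping the source-specific logic in small filter functions means the merged local database only ever sees one record layout, whatever the origin of the data.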
4.5 Chemistry: Graphics Front-ends
Chemical structure searching presents a need for customized front-ends, allowing the scientist to use two-dimensional chemical structure diagrams. Graphics front-ends support the off-line building of chemical structure graphics and subsequent uploading to a host computer, as well as the capture (downloading) of retrieved records. Warr and Wilkins have reviewed the key features of a number of these graphics front-ends, such as STN Express, the front-end software that provides access to STN International databases. STN Express enables one to prepare off-line the strategy formulation (including structural query formulation) and then upload the search strategy line by line after logging on. Off-line chemical structure building is menu-driven. In addition to the ability to create search strategies off-line, the program provides predefined search strategies for general subjects, such as toxicity, that take advantage of individual databases provided by STN. The MOLKICK software package allows the user to enter chemical structures and then translates them into the proper format for searching in three different host systems (STN, Télésystèmes Questel, Dialog).
4.6 Engineering: Ei Reference Desk
Engineering Information Inc. (Ei) has been developing an integrated software package that is designed to bring together both the searching and retrieval of documents [4, 28]. Users will have a choice of browsing through electronic tables of contents for engineering journals, searching COMPENDEX PLUS on CD-ROM, accessing other databases through a telecommunications link, and marking documents for automatic ordering and delivery from Ei's document delivery service. Each function of the Reference Desk has been implemented as a separate application but integrated within the Windows graphical interface. A planned enhancement is an electronic mail function.
4.7 The Livermore Intelligent Gateway
The Livermore Intelligent Gateway creates a framework that links distributed, heterogeneous computer resources and provides a single user interface such that a "virtual information system" can be tailored to any user's needs. In addition to extensive data access capabilities, the Gateway system provides powerful analysis and processing tools to complete the creation of an integrated information environment. Once connected to the selected host, the user may interact in the system's native mode, use a Gateway overlaid common command language, or execute a fully automated search and retrieval procedure for routine tasks. Having simplified access to and retrieval of information, be it bibliographic, numeric, or graphic, the Gateway provides a tool kit to further analyse and repackage the information. Post-processing tools fall into two major categories: analysis of numeric data through statistical, mathematical, and graphics software, and analysis and restructuring of text through translation and analysis routines. In addition to the analytical tool kit, the Gateway provides sophisticated electronic mail capability as well as a wide variety of Unix utilities such as text editors and document preparation subsystems. The menus that a given user or group of users sees on the Gateway can be tailored to create a customized environment.
4.8 TOME SEARCHER and IMIS
TOME SEARCHER is microcomputer software that seeks to provide the inexperienced on-line user with a series of facilities: choice of database(s) in relation to the subject of a search; guidance in formulating the scope of the search; natural language input of the search topic; guidance in clarifying and/or amplifying the topic; automatic conversion of the topic into a Boolean search statement; automatic inclusion of synonyms and spelling variants in the search statement; estimate of likely yield of a search statement; and guidance in narrowing or broadening the statement if the estimated yield does not match the output specified by the user. All this takes place off-line. The system continues by providing automatic dial-up, automatic transmission of search statements to the host using the appropriate command language, display of dialogue with the host, automatic downloading of search output, and the ability to browse through the downloaded records. Each implementation of TOME SEARCHER is customized to a particular subject area, such as electrical and electronics engineering. TOME SEARCHER is one component of the more ambitious IMIS project to develop an intelligent multilingual interface to databases, mounted on an IBM PC and accessing a number of European hosts. IMIS will be designed to support interaction in English, French, German, and Spanish.
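Two of the facilities listed above, conversion of a natural-language topic into a Boolean search statement and inclusion of synonyms and spelling variants, can be sketched as follows. This is a simplified illustration, not TOME SEARCHER's actual method: the stopword list, the variant table, and the function name are invented, and yield estimation and host command translation are left out.

```python
# Words that carry no subject content and are dropped from the statement.
STOPWORDS = {"the", "of", "in", "on", "for", "and", "a", "an", "to", "with"}

# Hypothetical synonym and spelling-variant table for one subject area.
VARIANTS = {
    "fibre": ["fiber"],
    "cable": ["cables", "wire"],
}


def to_boolean(topic, variants=VARIANTS):
    """Convert a natural-language topic into a Boolean search statement.

    Content words are ANDed together; each word is ORed with its variants.
    """
    groups = []
    for word in topic.lower().split():
        word = word.strip(".,;:?!")
        if not word or word in STOPWORDS:
            continue
        alternatives = [word] + variants.get(word, [])
        if len(alternatives) > 1:
            groups.append("(" + " OR ".join(alternatives) + ")")
        else:
            groups.append(word)
    return " AND ".join(groups)


print(to_boolean("Attenuation in fibre cable"))
# prints attenuation AND (fibre OR fiber) AND (cable OR cables OR wire)
```

Because the variant table is a plain dictionary, a subject-specific version of such a tool would differ mainly in the contents of that table rather than in the conversion logic.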
4.9 EasyNet
Perhaps the best-known front-end is EasyNet, which offers access to multiple databases on 13 hosts, including many science and technology databases. It gives searchers the option of selecting a database themselves or allowing EasyNet to do so based on answers to a series of questions related to the subject and type of material required. Searching can be accomplished using menus to assist in constructing a search strategy or with commands based on the Common Command Language. Users are responsible for selecting their search terms and also for selecting Boolean logical operators to relate these terms. EasyNet translates the strategy into the command language of the host selected and logs on. After the search is completed and the data downloaded to EasyNet's computer, the user is logged off from the host. On-line help from professional reference staff is available by typing SOS. A customized version of EasyNet is marketed by BIOSIS as the Life Science Network, providing access to more than 80 databases. Dyckman and O'Connor report the results of a study analysing user problems handled by the SOS help service. Their analysis revealed that users seeking human help found the front-end's assistance inadequate in wording their search statements, using features of a specific database, or deciding which database to use.
4.10 Wide Area Information Servers
The Wide Area Information Server (WAIS) project seeks to determine whether current technologies can be used to create end-user full-text information systems. The WAIS system is composed of three separate parts: clients, servers, and the protocol (Z39.50) that connects them. The client is the user interface, the server does the indexing and retrieval of documents, and the protocol is used to transmit the queries and responses. Questions are formulated as English-language queries, which are then translated into the WAIS protocol and transmitted to a server that translates the encoded query into its own query language and then searches for documents satisfying the query. The list of relevant documents is then encoded in the protocol and transmitted back to the client, where it is decoded and the results displayed. The user may modify the query or mark some of the retrieved documents as being relevant. The system can then attempt to find other documents that are similar to those judged relevant. A single interface provides access to many different information sources. With WAIS, the user may select multiple sources to query for information. The system automatically asks all the servers for the required information with no further interaction required from the user. The documents retrieved are sorted and consolidated in a single place, to be easily manipulated by the user. To support selection of databases, an on-line Directory of Servers is maintained. It can be queried to identify potential sources on a topic.
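The client-side behaviour described above (one query sent to several servers, results consolidated and sorted, and feedback from documents marked relevant) can be sketched with in-memory stand-ins. This sketch does not implement Z39.50 or any network transport, and the word-overlap score is an assumption chosen for brevity, not the ranking method used by WAIS servers. The class and function names are hypothetical.

```python
def score(query, text):
    """Count how many distinct query words occur in the text."""
    words = set(text.lower().split())
    return sum(1 for w in set(query.lower().split()) if w in words)


class Server:
    """In-memory stand-in for a server that indexes and retrieves documents."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # list of document strings

    def search(self, query):
        """Return (score, server name, document) for every matching document."""
        hits = []
        for doc in self.documents:
            s = score(query, doc)
            if s > 0:
                hits.append((s, self.name, doc))
        return hits


def search_all(query, servers):
    """Send one query to every selected server and consolidate the results."""
    results = []
    for server in servers:
        results.extend(server.search(query))
    # Highest score first; ties broken by server name, then document text.
    return sorted(results, key=lambda hit: (-hit[0], hit[1], hit[2]))


def more_like(relevant_docs, servers):
    """Relevance feedback: reuse the text of marked documents as the next query."""
    return search_all(" ".join(relevant_docs), servers)
```

In this sketch `more_like` also returns the marked documents themselves, since they match their own text best; a real client would probably filter those out before display.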