Expanding Access to Science and Technology (UNU, 1994, 462 pages)
Session 3: New technologies and media for information retrieval and transfer
Information retrieval: Theory, experiment, and operational systems
Received wisdom in the 1950s was that IR required some kind of formal indexing or coding scheme. (Exactly what kind of scheme was one of the topics of endless debate.) Thus, items required indexing/coding in terms of the scheme, which probably in turn required a human indexer (though a machine might be taught to do it). A similar process was required at the search stage, in respect of the query or need, though an end-user might possibly learn enough about the formal scheme to conduct a satisfactory search.
The results of the early experiments, together with the developing technology and changing perceptions of how it might be used, caused a backlash against this received wisdom. It became feasible to load larger and larger quantities of text into the computer, and to retrieve on the basis of words in the text rather than assigned keywords or codes. At first sight, the necessity for any kind of indexing scheme, at either end of the process, seemed to disappear: the user could use "natural" language to search a "natural" language database, without any interference from librarians.
We have since come to a much more balanced view of language, though the debate continues to generate new questions as we develop our highly interactive systems. It is clear that natural language searching is a powerful device that can often produce good results economically. However, it places a large burden on the searcher, and besides, certain kinds of queries are not well served. In recognition of these points, many modern databases include both formal indexing and searchable natural language text.
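The contrast between the two kinds of access can be sketched in a few lines of Python. This is a minimal illustration only: the records, the descriptor terms, and the function names `search_text` and `search_descriptors` are all invented for the example and do not describe any particular database.

```python
import re

# Hypothetical records: each carries free text plus descriptors
# assigned by an indexer from a controlled vocabulary (assumed here).
RECORDS = [
    {"id": 1, "text": "Fuel economy of the modern automobile",
     "descriptors": {"motor vehicles", "fuel consumption"}},
    {"id": 2, "text": "Car engines and fuel consumption",
     "descriptors": {"motor vehicles", "engines", "fuel consumption"}},
    {"id": 3, "text": "Rail freight in the 1950s",
     "descriptors": {"railways", "freight transport"}},
]


def words(text):
    """Split text into a set of lower-case word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_text(records, word):
    """Natural language search: match a single word in the record text."""
    return [r["id"] for r in records if word.lower() in words(r["text"])]


def search_descriptors(records, term):
    """Formal search: match an assigned descriptor exactly."""
    return [r["id"] for r in records if term in r["descriptors"]]


print(search_text(RECORDS, "car"))                   # finds record 2 only
print(search_descriptors(RECORDS, "motor vehicles")) # finds records 1 and 2
```

In this toy case the text search misses record 1 unless the searcher also thinks of "automobile", which is the kind of burden referred to above; the descriptor search finds both records, but only because an indexer assigned the term beforehand.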
Formal artificial languages (in which category I include library classification schemes) represent particular views of the structure and organization of knowledge. One idea that emerged from the analysis of such languages, and that is central to modern indexing languages as well as to the practice of searching, is that of the facet. Once it is recognized that topics and problem areas are potentially highly complex, it becomes essential to approach the problem of describing them via different aspects, or facets, and combining the resulting descriptions in a building-block fashion. (The idea of a faceted classification scheme, while originally due to Ranganathan in the 1930s, was put in its most concise form by Vickery: B.C. Vickery, Faceted Classification, London: Aslib, 1960.)
Many modern indexing languages, while not necessarily following the rules of faceted classification, reflect an essentially facet-based approach to the organization of knowledge. But the approach also has value at the searching stage, whether or not the database being searched is indexed by such a language. This theme is taken up again below.
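One common way of applying the building-block idea at the search stage can be sketched as follows. The sketch assumes, as an illustration rather than as a rule taken from faceted classification itself, that each facet of a query is a set of alternative terms, that a document satisfies a facet if it contains any one of them, and that it must satisfy every facet. The documents, terms, and function names are hypothetical.

```python
# Hypothetical documents, each reduced to a set of terms
# (assigned descriptors or words from the text; either would do here).
DOCUMENTS = {
    "d1": {"pesticides", "rice", "india"},
    "d2": {"insecticides", "wheat", "india"},
    "d3": {"pesticides", "rice", "japan"},
    "d4": {"fertilizers", "rice", "india"},
}

# A query built from three facets; each facet lists alternative terms.
QUERY = [
    {"pesticides", "insecticides"},  # facet: agent
    {"rice", "wheat"},               # facet: crop
    {"india"},                       # facet: place
]


def matches(terms, facets):
    """True if the document shares at least one term with every facet."""
    return all(terms & alternatives for alternatives in facets)


def facet_search(documents, facets):
    """Return the ids of matching documents, in sorted order."""
    return sorted(doc_id for doc_id, terms in documents.items()
                  if matches(terms, facets))


print(facet_search(DOCUMENTS, QUERY))      # d1 and d2 satisfy all three facets
print(facet_search(DOCUMENTS, QUERY[:2]))  # dropping the place facet adds d3
```

Treating each facet as a separate block makes it easy to broaden a search by adding alternatives within a facet, or to relax it by dropping a facet altogether, whether or not the database itself is indexed by a faceted language.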