4. Early experiments in IR

The idea of the experimental evaluation of IR systems is central to both theory and operational system development. Perhaps surprisingly, this idea is only about 35 years old. (Admittedly, 35 years is a long time in the history of computer-based IR systems; but some kinds of IR system, such as library classification schemes, predate computers by at least two-and-a-half millennia!)

We are not so much concerned here with whether the system works in a technical or physical sense as with something that might now be described as its cognitive functioning. In other words, the question as to whether the system will succeed in locating items with specific characteristics (words, codes) is not generally at issue. The question is: do those items with the specific characteristics actually serve the information need or resolve the ASK? This will depend, in general, on the ways the system offers for specifying characteristics. In the earliest experiments, the question took the form: Does the system retrieve the "correct" answer in response to a query? Setting aside the problem of ASKs and relevance, the implied model of the IR system was what might be described as an input-output model: feed in the query, get out the answer. It was a model that fit well with the early computer-based systems. (Actually, they would very likely be human-assisted; the searcher would send the query off to a library or information centre, where an expert would formulate it in system terms, run the search, and return the results.) In retrospect, however, it seems like a temporary aberration. Both older systems (card catalogues, printed indexes, etc.) and newer ones (highly interactive on-line systems) exhibit characteristics that do not fit too well with the input-output model, particularly if the searcher is the end-user, that is, the person needing the information.
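As a rough illustration of the input-output model and of the question the earliest experiments asked, the following minimal sketch may help. Everything in it is an assumption made for illustration: the toy collection, the function names search and evaluate, and the set of items judged "correct" are hypothetical and are not taken from any actual experiment. It shows only that a system can succeed in locating items with the specified characteristics while still failing to return the "correct" answer.

```python
# Toy collection: document id -> index terms (hypothetical data for illustration).
COLLECTION = {
    "d1": {"library", "classification", "schemes"},
    "d2": {"computer", "retrieval", "experiments"},
    "d3": {"retrieval", "dogs", "training"},
}


def search(query_terms):
    """Input-output model: feed in a query, get out the matching items.

    An item matches when it carries every term in the query.
    """
    return {doc_id for doc_id, terms in COLLECTION.items()
            if query_terms <= terms}


def evaluate(retrieved, correct):
    """Compare the system's output with the items judged "correct"."""
    return {
        "retrieved_and_correct": retrieved & correct,
        "retrieved_not_correct": retrieved - correct,
        "correct_not_retrieved": correct - retrieved,
    }


query = {"retrieval"}
correct = {"d1", "d2"}  # assumed human judgement of what serves the need
result = evaluate(search(query), correct)
print(result)
```

Here the search works in the technical sense (both items indexed with "retrieval" are found), yet one retrieved item is not judged correct and one correct item is never retrieved. The sketch deliberately leaves out the interactive reformulation of queries, which is exactly what the input-output model ignores.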

The early experiments told us a little about the design of IR systems, but they also focused attention on certain areas, and it may be argued that their lasting influence lies in this focusing process. One area is the one already mentioned, that of relevance: the necessity to devise an operational definition of a "correct" answer was a major stimulus to the reconsideration of the notion of relevance. A second focus was on the particular aspects of IR system design that seemed important. A number of aspects that had been endlessly debated in the 1940s and 1950s now seemed to be of relatively minor importance; by contrast, some aspects that had received little consideration now became central. One of the later outcomes of this process, as we shall see, has been the concern with highly interactive systems.