A combined phrase and thesaurus browser for large document collections
Paynter, G. W. and Witten, I. H. (2001). In P. Constantopoulos and I. T. Solvberg (eds), Proc. Fifth European Conference on Research and Advanced Technology for Digital Libraries (ECDL'01), LNCS 2163, Darmstadt, Germany, pp. 25-36. Springer, Berlin.
A hierarchical browsing interface to a document collection can be constructed by identifying the phrases that recur in the full text of the documents and structuring them into a hierarchy. This provides a good way of allowing readers to browse comfortably through all the phrases in a large document collection. A subject-oriented thesaurus provides a different kind of hierarchical structure, based on deep knowledge of the subject area. If all documents, or parts of documents, are tagged with thesaurus terms, the thesaurus offers a very convenient way of browsing through a collection. Unfortunately, manual classification is expensive and infeasible for many practical document collections.