Effective Document Presentation with a
Locality-Based Similarity Heuristic
Owen de Kretser
Department of Computer Science and Software Engineering,
The University of Melbourne,
Parkville 3052, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Parkville 3052, Australia.
Status
Proc. 22nd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval,
San Francisco, August 1999, 113-120.
Abstract
The heuristics employed in information retrieval systems
have traditionally been document-based, judging similarity
holistically over entire documents.
In this work we present a locality-based paradigm for information
retrieval, in which every word location in each document is scored.
The locality-based similarity heuristic provides retrieval
effectiveness as good as that of the document-based technique, and has the
additional advantage of allowing the matching section or sections of
retrieved documents to be shown to users as they sift the
results of their query.
This is a considerable improvement upon the conventional presentation
mechanism, in which the user must manually search each document for
the passage -- if any such passage exists at all -- that suggested to
the retrieval mechanism that this document is an answer.
We also describe an improved index representation that supports the required
operations.
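
To make the idea of scoring every word location concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each occurrence of a query term adds a triangular contribution to nearby word positions, and that the highest-scoring position marks the passage to show the user. The function names, the `spread` and `window` parameters, and the triangular shape are all illustrative assumptions; the abstract does not specify them.

```python
def locality_scores(words, query_terms, spread=5):
    """Score every word position in a document.

    Assumption for illustration: each query-term occurrence contributes
    `spread - distance` to positions fewer than `spread` words away
    (a triangular shape), and contributions from all occurrences add up.
    """
    query = {term.lower() for term in query_terms}
    scores = [0] * len(words)
    for pos, word in enumerate(words):
        if word.lower() not in query:
            continue
        lo = max(0, pos - spread + 1)
        hi = min(len(words), pos + spread)
        for i in range(lo, hi):
            scores[i] += spread - abs(i - pos)
    return scores


def best_passage(words, query_terms, spread=5, window=2):
    """Return (peak score, text around the first highest-scoring position)."""
    scores = locality_scores(words, query_terms, spread)
    if not scores:
        return 0, ""
    # max() returns the first position when several share the top score.
    peak = max(range(len(scores)), key=scores.__getitem__)
    start = max(0, peak - window)
    end = min(len(words), peak + window + 1)
    return scores[peak], " ".join(words[start:end])


if __name__ == "__main__":
    text = "the cat sat on the mat while the dog chased the red ball across the park"
    score, passage = best_passage(text.split(), ["dog", "ball"])
    print(score, passage)
```

Under these assumptions, documents could be ranked by their peak score, and the returned passage is the section that would be displayed alongside each result.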