Integrated Impacts for Web Retrieval


Vo Ngoc Anh
Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia.

Alistair Moffat
Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia.


Status

Proc. Australian Document Computing Symposium, Canberra, December 15, 2003, pages 25-30.

Abstract

Traditional approaches to information retrieval calculate similarity scores based entirely upon the words and phrases present in the various items of text being manipulated. In the web retrieval domain, additional sources of information are available and can be used to guide answer selection. However, the web domain is also more complex, in that there is a range of tasks that might be performed, and no clear indication as to which task or tasks the user had in mind when they issued their query. In this paper, we continue to explore the use of impact-based retrieval, using a simple heuristic for assessing the importance of each term in a document. We extend our previous results to the web domain, and consider how best to address the named/home page finding task and the topic distillation task within the confines of a single retrieval system.