In Search of Reliable Retrieval Experiments


William Webber
Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia.

Alistair Moffat
Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia.


Status

Proc. 10th Australasian Document Computing Symposium, Sydney, Australia, December 2005, pages 26-33.

Abstract

There are several ways in which an "improved" technique for solving some computational problem can be defended: by mathematical argument; by simulation; and by experimental validation. Each of these has risks. In this paper we describe some of the issues that arose during an experimental validation of architectures for distributed text query evaluation, and the approaches that were taken to resolve them. In particular, collections and clusters must be scaled in a way that maximizes comparability between different data sizes; query sets must be appropriate to the target collection; and hardware issues such as file placement on disk must also be considered. Our intention is to report on our experience in a practical sense, and thereby assist others to avoid the same problems.

Full text