In Search of Reliable Retrieval Experiments
William Webber
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 10th Australasian Document Computing Symposium,
Sydney, Australia, December 2005, pages 26-33.
Abstract
There are several ways in which an "improved" technique for solving
some computational problem can be defended: by mathematical argument,
by simulation, and by experimental validation.
Each of these has risks.
In this paper we describe some of the issues that arose during an
experimental validation of architectures for distributed text query
evaluation, and the approaches that were taken to resolve them.
In particular, collections and clusters must be scaled in a way that
maximizes comparability between different data sizes; query sets must
be appropriate to the target collection; and hardware issues such as
file placement on disk must also be considered.
Our intention is to report on our practical experience,
and thereby help others avoid the same problems.