Principles for Robust Evaluation Infrastructure
Justin Zobel
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
William Webber
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Mark Sanderson
School of Computer Science and Information Technology,
RMIT University,
Victoria 3001, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. DESIRE Workshop on Data Infrastructures
for Supporting Information Retrieval Evaluation,
Glasgow, October 2011, pages 3-6.
Abstract
The standard "Cranfield" approach to the evaluation of information
retrieval systems has been used and refined for nearly fifty years,
and has been a key element in the development of large-scale
retrieval systems.
The resources created by such systematic evaluations have enabled
thorough retrospective investigation of the strengths and limitations
of particular variants of this evaluation approach; over the last few
years, such investigation has, for example, led to the identification
of serious flaws in some experiments.
Knowledge of these flaws can prevent their perpetuation into future
work and inform the design of new experiments and infrastructures.
In this position statement we briefly review some aspects of
evaluation and, based on our research and observations over the last
decade, outline some principles on which we believe new
infrastructure should rest.
Full text
http://doi.acm.org/10.1145/2064227.2064247