EvaluatIR: an online tool for evaluating and comparing IR systems
Timothy G. Armstrong, Alistair Moffat, William Webber, and Justin Zobel
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 32nd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, Boston,
July 2009, page 833.
Demonstration presentation.
Abstract
A fundamental goal of information retrieval research is to develop
new retrieval techniques, and to demonstrate that they attain
improved effectiveness compared to their predecessors.
To quantitatively compare IR techniques, the community has developed
a range of standard corpora of documents, queries, and relevance
judgements.
We have developed a centralized mechanism for certifying the
effectiveness of new retrieval techniques that have been tested on a
standard corpus.
Our website provides an independent, permanent, certified
measure of effectiveness that can be relied on by both authors and
subsequent readers.
Researchers seeking a comparison upload runs via the browser-based
interface, and the website returns a link to a page reporting performance
results and statistical comparisons against baselines, using measures
such as MAP, nDCG, and RBP, and techniques such as longitudinal
standardization.
By comparing against standard baselines and up-to-date runs submitted
by others, researchers can determine whether their methods provide a
true improvement over earlier work, and readers and referees can more
easily assess claimed results.
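The following Python sketch illustrates the kind of computation the abstract refers to: scoring two runs per topic with average precision and rank-biased precision, then comparing them with a paired t-test. It assumes TREC-style run and qrels files and hypothetical file names, and is illustrative only, not the EvaluatIR implementation.

# Illustrative sketch (not the EvaluatIR implementation): score two TREC-style
# runs per topic with AP and RBP, then compare them with a paired t-test.
from collections import defaultdict
from scipy.stats import ttest_rel

def load_qrels(path):
    """Read TREC qrels: 'topic iteration docid judgement' per line."""
    rel = defaultdict(set)
    with open(path) as f:
        for line in f:
            topic, _, docid, judgement = line.split()
            if int(judgement) > 0:
                rel[topic].add(docid)
    return rel

def load_run(path):
    """Read a TREC run: 'topic Q0 docid rank score tag' per line."""
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            topic, _, docid, rank, score, _ = line.split()
            run[topic].append((float(score), docid))
    # Order each topic's documents by decreasing retrieval score.
    return {t: [d for _, d in sorted(docs, reverse=True)]
            for t, docs in run.items()}

def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for i, docid in enumerate(ranking, start=1):
        if docid in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def rbp(ranking, relevant, p=0.95):
    """Rank-biased precision with persistence parameter p."""
    return (1 - p) * sum(p ** i for i, docid in enumerate(ranking)
                         if docid in relevant)

def per_topic_scores(run, qrels, metric):
    return [metric(run.get(t, []), qrels[t]) for t in sorted(qrels)]

qrels = load_qrels("qrels.txt")          # hypothetical file names
baseline = load_run("baseline.run")
new_system = load_run("new_system.run")

for name, metric in [("AP", average_precision), ("RBP(0.95)", rbp)]:
    base_scores = per_topic_scores(baseline, qrels, metric)
    new_scores = per_topic_scores(new_system, qrels, metric)
    t_stat, p_value = ttest_rel(new_scores, base_scores)
    print(f"{name}: baseline {sum(base_scores)/len(base_scores):.4f}, "
          f"new {sum(new_scores)/len(new_scores):.4f}, "
          f"paired t-test p={p_value:.4f}")

Per-topic scoring followed by a paired significance test against a baseline run is the general style of comparison the website reports alongside summary effectiveness figures.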
Software
EvaluatIR web site.
Full text
http://doi.acm.org/10.1145/1571941.1572153