Models and Metrics: IR Evaluation as a User Process
Alistair Moffat
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Falk Scholer
School of Computer Science and Information Technology,
RMIT University,
Victoria 3001, Australia.
Paul Thomas
ICT Centre, CSIRO, Canberra, Australia.
Status
Proc. 17th Australasian Document Computing Symp.,
Dunedin, New Zealand, December 2012, pages 47-54.
Abstract
Retrieval system effectiveness can be measured in two quite different
ways: by monitoring the behavior of users and gathering data about
the ease and accuracy with which they accomplish certain specified
information-seeking tasks; or by using numeric effectiveness metrics
to score system runs in reference to a set of relevance judgments.
The former has the benefit of directly assessing the actual goal of
the system, namely the user's ability to complete a search task;
whereas the latter approach has the benefit of being quantitative and
repeatable.
Each given effectiveness metric is an attempt to bridge the gap
between these two evaluation approaches, since the implicit belief
supporting the use of any particular metric is that user task
performance should be correlated with the numeric score provided by
the metric.
In this work we explore that linkage, considering a range of
effectiveness metrics, and the user search behavior that each of them
implies.
We then examine more complex user models, as a guide to the
development of new effectiveness metrics.
We conclude by summarizing an experiment that we believe will help
establish the strength of the linkage between models and metrics.
Full text
http://doi.acm.org/10.1145/2407085.2407092
Errata
The equation for L_{AP} at the top of the fourth page (page 50 in the
printed proceedings) is incorrect.
The best formulation of AP (using the framework of W, C, and L
described in the paper) is:
C_{AP}(i) = \frac{\sum_{j=i+1}^{\infty} (r_j/j)}{\sum_{j=i}^{\infty} (r_j/j)}
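As an illustration only, the formula above can be evaluated for a
finite ranking. This sketch is not taken from the paper; it assumes
that r_j is the relevance of the document at rank j (ranks start at
1), that r_j = 0 beyond the end of the supplied ranking so that the
infinite sums become finite, and that C_{AP}(i) is taken to be 0 when
the denominator is zero. The function name c_ap and the zero-denominator
convention are hypothetical choices made for this example.

```python
from fractions import Fraction


def c_ap(rels, i):
    """Evaluate C_AP(i) for a finite ranking.

    rels -- relevance values r_1, r_2, ... (rels[0] is rank 1)
    i    -- 1-based rank at which to evaluate the formula

    Assumptions (not from the paper): r_j = 0 past the end of rels,
    and the result is 0 when the denominator is zero.
    """
    if i < 1:
        raise ValueError("ranks are 1-based")

    def tail(start):
        # Sum of r_j / j over all ranks j >= start.
        return sum(
            (Fraction(r) / j for j, r in enumerate(rels, start=1) if j >= start),
            Fraction(0),
        )

    denominator = tail(i)
    if denominator == 0:
        return Fraction(0)  # assumed convention for the undefined case
    return tail(i + 1) / denominator


# Ranking with relevant documents at ranks 1 and 3.
ranking = [1, 0, 1]
values = [c_ap(ranking, i) for i in (1, 2, 3)]
```

For this ranking the numerator at i = 1 is 1/3 and the denominator is
1 + 1/3, so the computed value is 1/4. Exact fractions are used so
that the ratios can be checked by hand.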
We got one step closer in CIKM 2013, but note that we didn't quite
get it right there either.