Retrieval Consistency in the Presence of Query Variations

Peter Bailey
Microsoft Research, Canberra, Australia.

Alistair Moffat
School of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Falk Scholer
School of Computer Science and Information Technology, RMIT University, Victoria 3001, Australia.

Paul Thomas
Microsoft Research, Canberra, Australia.


Proc. 40th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, Tokyo, Japan, August 2017, pages 395-404.


A search engine that can return the ideal results for a person's information need, independent of the specific query that is used to express that need, would be preferable to one that is overly swayed by the individual terms used; search engines should be consistent in the presence of syntactic query variations responding to the same information need. In this paper we examine the retrieval consistency of a set of five systems responding to syntactic query variations over one hundred topics, working with the UQV100 test collection, and using Rank-Biased Overlap (RBO) relative to a centroid ranking over the query variations per topic as a measure of consistency. We also introduce a new data fusion algorithm, Rank-Biased Centroid (RBC), for constructing a centroid ranking over a set of rankings from query variations for a topic. RBC is compared with alternative data fusion algorithms. Our results indicate that consistency is positively correlated to a moderate degree with "deep" relevance measures. However, it is only weakly correlated with "shallow" relevance measures, as well as measures of topic complexity and variety in query expression. These findings support the notion that consistency is an independent property of a search engine's retrieval effectiveness.
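
The consistency measurement described above can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: RBO is computed in its simple truncated form (no extrapolation beyond the evaluated depth), the centroid is built by giving a document at rank r a geometric weight (1 - phi) * phi^(r - 1) summed over the query variations, and consistency for a topic is taken as the mean RBO between each variation's ranking and the centroid. The function names and the default parameter values are hypothetical.

```python
from collections import defaultdict


def rbo_truncated(run_a, run_b, p=0.9):
    """Truncated rank-biased overlap of two rankings (lists of doc ids).

    Assumes no document is repeated within a ranking. Evaluated only to
    the depth of the shorter list, so two identical lists of depth k
    score 1 - p**k rather than 1.
    """
    depth = min(len(run_a), len(run_b))
    total = 0.0
    for d in range(1, depth + 1):
        # Fraction of the top-d documents that the two rankings share.
        agreement = len(set(run_a[:d]) & set(run_b[:d])) / d
        total += p ** (d - 1) * agreement
    return (1 - p) * total


def rank_biased_fusion(runs, phi=0.9):
    """Fuse rankings into one centroid ranking using geometric rank weights.

    This is an illustrative rank-biased fusion, assumed here rather than
    taken from the paper's definition of RBC.
    """
    scores = defaultdict(float)
    for run in runs:
        for rank, doc in enumerate(run, start=1):
            scores[doc] += (1 - phi) * phi ** (rank - 1)
    # Highest accumulated weight first; ties broken by document id.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


def topic_consistency(runs, p=0.9, phi=0.9):
    """Mean truncated RBO between each variation's ranking and the centroid."""
    centroid = rank_biased_fusion(runs, phi)
    return sum(rbo_truncated(run, centroid, p) for run in runs) / len(runs)


if __name__ == "__main__":
    # Three hypothetical query variations for one topic.
    variations = [["d1", "d2", "d3"], ["d1", "d3", "d4"], ["d2", "d1", "d5"]]
    print(rank_biased_fusion(variations))
    print(topic_consistency(variations))
```

Under these assumptions, a system whose rankings barely change across query variations scores close to the maximum attainable at the evaluated depth, while a system that returns disjoint rankings for each variation scores near zero.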
