UQV100: A Test Collection with Query Variability

Peter Bailey
Microsoft, Australia

Alistair Moffat
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Falk Scholer
School of Computer Science and Information Technology, RMIT University, Victoria 3001, Australia.

Paul Thomas
Microsoft, Australia


Proc. 39th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, Pisa, Italy, July 2016, pages 725-728.


We describe a test collection (UQV100) that is designed to incorporate variability from users. One hundred topics (or specific sub-topics) from the 2013 and 2014 TREC Web Tracks were re-purposed via information-need statements (backstories). Crowd workers were then asked to read the backstories and provide the queries they would use, plus corresponding estimates of effort, expressed as the number of useful documents needed to satisfy the information need. A total of 10,835 queries were collected from 263 workers. After normalization and spell-correction, 5,764 unique variations remained; these were then used to construct a document pool via Indri-BM25 over the ClueWeb12 corpus. Relevance judgments were made by qualified crowd workers relative to the backstories, using a relevance scale similar to the original TREC judging approach, first to a pool depth of ten, and then for a further set of targeted documents. The backstories, query variations, spell-corrected queries, effort estimates, run outputs, and relevance judgments are made available collectively as the UQV100 test collection, together with the judging guidelines and the gold hits we used for crowd-worker qualification and anti-spam detection. We believe this test collection will open new opportunities for investigation and analysis of problems such as task-intent retrieval performance and consistency (independent of query variation), query clustering, query difficulty prediction, and relevance feedback.
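
As an illustration of the pooling step mentioned above, the following is a minimal sketch of depth-k pooling over the runs produced by multiple query variations. It is not the authors' code: the function name `build_pool`, the input layout (a list of `(topic_id, ranked_doc_ids)` pairs, one per query variation), and the document identifiers are assumptions made for the example.

```python
from collections import defaultdict


def build_pool(runs, depth=10):
    """Return {topic_id: set of doc ids} formed from the top `depth`
    documents of every run belonging to that topic.

    `runs` is a hypothetical input format: an iterable of
    (topic_id, ranked_doc_ids) pairs, one pair per query variation.
    """
    pool = defaultdict(set)
    for topic_id, ranking in runs:
        # Only the first `depth` documents of each ranking enter the pool;
        # the set union removes documents retrieved by several variations.
        pool[topic_id].update(ranking[:depth])
    return dict(pool)


# Toy example with made-up document ids: two variations for topic "t1".
runs = [
    ("t1", ["d1", "d2", "d3"]),
    ("t1", ["d2", "d4", "d5"]),
    ("t2", ["d9"]),
]
pool = build_pool(runs, depth=2)
```

Because many variations of the same topic tend to retrieve overlapping documents, the pooled set for a topic is usually much smaller than the number of variations multiplied by the depth.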

Full text

http://doi.acm.org/10.1145/2911451.2914671

Data Resource

http://dx.doi.org/10.4225/49/5726E597B8376