How Effective are Proximity Scores in Term Dependency Models?

Xiaolu Lu
School of Computer Science and Information Technology, RMIT University, Victoria 3001, Australia.

Alistair Moffat
Department of Computing and Information Systems, The University of Melbourne, Victoria 3010, Australia.

Shane Culpepper
School of Computer Science and Information Technology, RMIT University, Victoria 3001, Australia.

Status

Proc. 19th Australasian Document Computing Symp., Melbourne, December 2014, pages 89-92.

Abstract

The dominant retrieval models in information retrieval systems today are variants of tf.idf, and typically use bag-of-words processing in order to balance recall and precision. However, as collections continue to grow, the number of results produced by these models exceeds the number of documents that can reasonably be assessed. To address this problem, researchers and commercial providers are now looking at computationally more expensive models to improve the quality of the results returned. One such method is to incorporate term proximity into the ranking model. We explore the effectiveness gains achievable when term proximity is used as a factor in ranking algorithms, and compare the relative effectiveness of several variants of the term dependency model. Our goal is to understand how these proximity-based models improve effectiveness.

Full text

http://dx.doi.org/10.1145/2682862.2682876