On the Cost of Phrase-Based Ranking
Matthias Petri
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 38th Ann. Int. ACM SIGIR Conf. on
Research and Development in Information Retrieval,
Santiago, August 2015, pages 931-934.
Abstract
Effective postings list compression techniques, and the efficiency of
postings list processing schemes such as WAND, have
significantly improved the practical performance of ranked document
retrieval using inverted indexes.
Recently, suffix array-based index structures have been proposed as a
complementary tool, to support phrase searching.
The relative merits of these alternative approaches to ranked
querying using phrase components are, however, unclear.
Here we provide: (1) an overview of existing phrase indexing
techniques; (2) a description of how to incorporate recent advances
in list compression and processing; and (3) an empirical evaluation
of state-of-the-art suffix-array and inverted file-based phrase
retrieval indexes using a standard IR test collection.
Full text
http://dx.doi.org/10.1145/2766462.2767769.
Software
https://github.com/mpetri/pos-cmp.