Does Selective Search Benefit from WAND Optimization?
Yubin Kim
Language Technologies Institute,
Carnegie Mellon University,
Pittsburgh, PA 15213, USA
James P. Callan
Language Technologies Institute,
Carnegie Mellon University,
Pittsburgh, PA 15213, USA
Shane Culpepper
School of Computer Science and Information Technology,
RMIT University,
Victoria 3001, Australia.
Alistair Moffat
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 38th European Conf. on Information Retrieval,
Padova, Italy, April 2016, pages 145-158.
Abstract
Selective search is a distributed retrieval technique that reduces
the computational cost of large-scale information retrieval.
By partitioning the collection into topical shards, and using a
resource selection algorithm to identify a subset of shards to search,
selective search allows retrieval effectiveness to be maintained
while evaluating fewer postings, often reducing querying cost
by more than 90%.
However, there has been only limited attention given to the
interaction between dynamic pruning algorithms and topical
index shards.
We demonstrate that the WAND dynamic pruning algorithm is more
effective on topical index shards than it is on randomly-organized
index shards, and that the savings generated by selective search and
WAND are additive.
We also compare two methods for applying WAND to topical shards:
searching each shard with a separate top-k heap and threshold; and
sequentially passing a shared top-k heap and threshold from one
shard to the next, in the order established by a resource selection
mechanism.
Separate top-k heaps provide low query latency,
whereas a shared top-k heap provides higher throughput.
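The two heap-management strategies compared above can be sketched
roughly as follows. This is an illustrative toy, not the authors'
implementation: the function names, the (doc_id, upper_bound, score)
shard layout, and the example data are all assumptions, and a simple
per-document upper-bound test stands in for WAND's pivot-based skipping.
It only shows how a threshold carried from one shard to the next can
prune documents that a fresh per-shard heap would still have to score.

```python
import heapq

def search_shard(shard, k, heap=None):
    """Scan one shard, skipping documents whose score upper bound cannot
    beat the current top-k threshold (a stand-in for WAND pruning).

    shard: list of (doc_id, upper_bound, score), with score <= upper_bound.
    heap:  min-heap of (score, doc_id); a new one is created if omitted.
    Returns (heap, number_of_documents_fully_scored).
    """
    heap = [] if heap is None else heap
    scored = 0
    for doc_id, upper_bound, score in shard:
        # The threshold is the k-th best score seen so far, if k exist.
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if upper_bound <= threshold:
            continue  # pruned: this document cannot enter the top k
        scored += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > threshold:
            heapq.heapreplace(heap, (score, doc_id))
    return heap, scored

def separate_heaps(shards, k):
    """Each shard gets its own heap and threshold; results are merged."""
    candidates, total_scored = [], 0
    for shard in shards:  # shards are independent of one another
        heap, scored = search_shard(shard, k)
        candidates.extend(heap)
        total_scored += scored
    return heapq.nlargest(k, candidates), total_scored

def shared_heap(shards, k):
    """One heap and threshold are passed along the shards, which are
    assumed to be in the order given by resource selection."""
    heap, total_scored = [], 0
    for shard in shards:  # each shard must wait for the previous one
        heap, scored = search_shard(shard, k, heap)
        total_scored += scored
    return heapq.nlargest(k, heap), total_scored

# Hypothetical data: two shards, best-ranked shard first.
example_shards = [
    [("a1", 9.0, 8.0), ("a2", 7.0, 6.5), ("a3", 5.0, 4.0)],
    [("b1", 6.0, 5.0), ("b2", 3.0, 2.5), ("b3", 8.0, 7.0)],
]
```

On this toy data both functions return the same top-2 documents, but
the shared heap fully scores 3 documents where separate heaps score 5.
The separate-heap loop has no dependency between shards, so it could
run in parallel, whereas the shared heap forces sequential processing;
that is consistent with the latency and throughput trade-off stated
in the abstract.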
Full text
http://dx.doi.org/10.1007/978-3-319-30671-1_11