Inverted Files for Text Search Engines
Justin Zobel
School of Computer Science and Information Technology,
RMIT University,
Victoria 3001, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
ACM Computing Surveys, 38(2):6.1-6.56, 2006.
Abstract
The technology underlying text search engines has advanced
dramatically in the past decade.
The development of a family of new index representations has led to a
wide range of innovations in index storage, index construction, and
query evaluation.
While some of these developments have been consolidated in textbooks,
many specific techniques are not widely known or the textbook
descriptions are out of date.
In this tutorial we introduce the key techniques in the area,
describing both a core implementation and how the core can be
enhanced through a range of extensions.
We conclude with a comprehensive bibliography of text indexing
literature.
Published paper
http://doi.acm.org/10.1145/1132956.1132959
Errata
- On page 21, in Table III, in the b=3 column, the code for 2 should be "0:10" and not "0:01"
  (noted by Yann Barsamian).
- On page 42, second-last line, "Moffa et al." should be
  "Moffat et al." (noted by Simon Gog).