Index Compression using Fixed Binary Codewords
Vo Ngoc Anh
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 15th Australasian Database Conference,
Dunedin, New Zealand, January 2004, pages 61-67.
Abstract
Document retrieval and web search engines index large quantities of
text.
The static costs associated with storing the index can be traded
against dynamic costs associated with using it during query
evaluation.
Typically, index representations that are effective, in that they
obtain good compression, tend not to be efficient, in that they
require more operations during query processing.
In this paper we describe a scheme for compressing lists of
integers as sequences of fixed binary codewords that has the twin
benefits of being both effective and efficient.
Experimental results are given on several large text collections to
validate these claims.
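
As a rough illustration of the general idea of storing a list of integers as fixed-width binary codewords, the sketch below converts a sorted list of document numbers into gaps and packs every gap with a single shared width. This is not the scheme described in the paper, whose details are not given in the abstract; the function names (to_gaps, encode_fixed, decode_fixed) and the choice of one width for the whole list are assumptions made only for this example.

```python
def to_gaps(doc_ids):
    """Turn a strictly increasing list of positive document numbers into gaps."""
    gaps, prev = [], 0
    for d in doc_ids:
        gaps.append(d - prev)
        prev = d
    return gaps


def encode_fixed(gaps):
    """Pack gaps (each >= 1) into one integer, using the same bit width for all.

    Assumption for this sketch: a single width is chosen for the whole list,
    namely the number of bits needed for the largest (gap - 1).
    Returns (width, count, bits).
    """
    width = max(1, (max(gaps) - 1).bit_length()) if gaps else 1
    bits = 0
    for g in gaps:
        bits = (bits << width) | (g - 1)  # store gap - 1, since gaps are >= 1
    return width, len(gaps), bits


def decode_fixed(width, count, bits):
    """Invert encode_fixed and rebuild the original document numbers."""
    mask = (1 << width) - 1
    doc_ids, prev = [], 0
    for i in range(count):
        shift = (count - 1 - i) * width   # first gap sits in the highest bits
        prev += ((bits >> shift) & mask) + 1
        doc_ids.append(prev)
    return doc_ids


doc_ids = [3, 7, 8, 15, 16]
width, count, bits = encode_fixed(to_gaps(doc_ids))
restored = decode_fixed(width, count, bits)
```

Decoding here needs only shifts and masks with a known width, which is the kind of low per-integer cost that makes fixed binary codewords attractive during query processing. A single large gap inflates the width for every codeword in this naive version, so compression suffers on skewed lists.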