Enhanced Word-Based Block-Sorting Text Compression
R. Yugo Kartono Isal
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alwin C. H. Ngai
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 25th Australasian Computer Science Conference,
Melbourne, January 2002, pages 129-138.
Abstract
The block-sorting process of Burrows and Wheeler can be applied to any
sequence in which symbols are (or might be) conditioned upon each
other.
In particular, it is possible to parse text into a stream of words,
and then employ block sorting to identify and so exploit any
conditioning relationships between words.
In this paper we build upon the previous work of two of the authors,
describing several further recency rank transformations and also
considering the role of the entropy coder.
By combining the best of the new recency transformations with an
entropy coder that conditions ranks upon gross characteristics of
previous ones, we are able to obtain improved compression on typical
text files.
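
The pipeline outlined in the abstract (parse into words, block sort, then
apply a recency rank transformation) can be illustrated with a minimal
sketch. Everything in it is an assumption made for illustration rather
than the method of the paper: the function names are hypothetical, the
parser simply alternates runs of word and non-word characters, the block
sort is a naive rotation sort, and the recency transformation shown is
classical move-to-front, not one of the further transformations the paper
describes. The entropy coding stage is omitted.

```python
import re


def parse_words(text):
    # Assumed parsing rule: alternate runs of word and non-word characters,
    # so that concatenating the tokens reproduces the input exactly.
    return re.findall(r"\w+|\W+", text)


def block_sort(symbols):
    # Naive Burrows-Wheeler transform over a list of symbols: sort all
    # rotations, then emit the last symbol of each rotation plus the row
    # at which the unrotated sequence appears. Quadratic or worse; a real
    # implementation would use suffix sorting.
    n = len(symbols)
    order = sorted(range(n), key=lambda i: symbols[i:] + symbols[:i])
    last = [symbols[(i - 1) % n] for i in order]
    return last, order.index(0)


def move_to_front(symbols):
    # Classical recency rank transformation: each symbol is replaced by
    # its current position in a list that is reordered so the most
    # recently seen symbol sits at the front.
    table = sorted(set(symbols))
    ranks = []
    for s in symbols:
        r = table.index(s)
        ranks.append(r)
        table.insert(0, table.pop(r))
    return ranks


if __name__ == "__main__":
    tokens = parse_words("to be or not to be, that is the question")
    permuted, start = block_sort(tokens)
    print(move_to_front(permuted), start)
```

In this sketch the symbols handed to the block sort are whole tokens
rather than characters, so any conditioning between adjacent words shows
up as runs of small ranks in the output, which is what a subsequent
entropy coder would exploit.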