Enhanced Byte Codes with Restricted Prefix Properties
J. Shane Culpepper
NICTA Victoria Laboratory,
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 12th Int. Symp. String Processing and Information Retrieval
Buenos Aires, Argentina, November 2005, pages 1-12.
LNCS volume 3772.
Abstract
Byte codes have a number of properties that make them attractive for
practical compression systems: they are relatively easy to construct;
they decode quickly; and they can be searched using standard
byte-aligned string matching techniques.
In this paper we describe a new type of byte code in which the first
byte of each codeword completely specifies the number of bytes that
comprise the suffix of the codeword.
Our mechanism gives more flexible coding than previous constrained
byte codes, and hence better compression.
The structure of the code also suggests a heuristic approximation
that allows savings to be made in the prelude that describes the
code.
We present experimental results that compare our new method with
previous approaches to byte coding, in terms of both compression
effectiveness and decoding throughput speeds.