Offline Dictionary-Based Compression
N. Jesper Larsson
Department of Computer Science,
Department of Computer Science and Software Engineering,
The University of Melbourne,
Parkville 3052, Australia.
Proc. IEEE, 88(11):1722-1732, November 2000.
Dictionary-based modelling is a mechanism used in many practical
In most implementations of dictionary-based compression the encoder
operates online, incrementally inferring its dictionary of available
phrases from previous parts of the message.
An alternative approach is to use the full message to infer a
complete dictionary in advance, and include an explicit
representation of the dictionary as part of the compressed message.
In this investigation we develop a compression scheme which is a
combination of a simple but powerful phrase derivation method and a
compact dictionary encoding.
The scheme is highly efficient, particularly in decompression, and
has characteristics that make it a favourable choice when compressed
data is to be searched directly.
We describe data structures and algorithms that allow our mechanism
to operate in linear time and space.
Software implementing Re-Pair, Des-Pair, and Re-Store is available