Offline Dictionary-Based Compression
N. Jesper Larsson
Department of Computer Science,
Lund University,
Sweden.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Parkville 3052, Australia.
Status
Proc. IEEE, 88(11):1722-1732, November 2000.
Abstract
Dictionary-based modelling is a mechanism used in many practical
compression schemes.
In most implementations of dictionary-based compression the encoder
operates online, incrementally inferring its dictionary of available
phrases from previous parts of the message.
An alternative approach is to use the full message to infer a
complete dictionary in advance, and include an explicit
representation of the dictionary as part of the compressed message.
In this investigation we develop a compression scheme which is a
combination of a simple but powerful phrase derivation method and a
compact dictionary encoding.
The scheme is highly efficient, particularly in decompression, and
has characteristics that make it a favourable choice when compressed
data is to be searched directly.
We describe data structures and algorithms that allow our mechanism
to operate in linear time and space.
Software
Software implementing Re-Pair, Des-Pair, and Re-Store is available
from
http://www.bic.kyoto-u.ac.jp/pathway/rwan/software/restore.html.