On the Implementation of Minimum-Redundancy Prefix Codes

Alistair Moffat
Department of Computer Science, The University of Melbourne, Parkville 3052, Australia.

Andrew Turpin
Department of Computer Science, The University of Melbourne, Parkville 3052, Australia.

Status

IEEE Trans. Communications, 45(10):1200-1207, October 1997.

Abstract

Minimum-redundancy coding (also known as Huffman coding) is one of the enduring techniques of data compression. Many efforts have been made to improve the efficiency of minimum-redundancy coding, the majority based on the use of improved representations for explicit Huffman trees. In this paper we examine how minimum-redundancy coding can be implemented efficiently by divorcing coding from a code tree, with emphasis on the situation when $n$ is large, perhaps of the order of $10^6$. We review techniques for devising minimum-redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows improved decoding speed, requiring just a few machine operations per output symbol (rather than for each decoded bit), and uses just a few hundred bytes of memory above and beyond the space required to store an enumeration of the source alphabet.

Related paper

In a related paper we examined the issues associated with using a Huffman code to handle non-dense source alphabets, and how to deal with long messages in a pseudo-adaptive manner.