On the Implementation of Minimum-Redundancy Prefix Codes

Alistair Moffat
Department of Computer Science, The University of Melbourne, Parkville 3052, Australia.

Andrew Turpin
Department of Computer Science, The University of Melbourne, Parkville 3052, Australia.

Status

Presented at the 1996 IEEE Data Compression Conference, Snowbird, Utah, April 1996, pages 170-179. Slightly extended version available as Technical Report TR 95/38.

Abstract

Minimum-redundancy coding (also known as Huffman coding) is one of the enduring techniques of data compression. In this paper we examine how best minimum-redundancy coding can be implemented, with particular emphasis on the situation when $n$ is large, perhaps of the order of $10^6$. We review a number of recent and not-so-recent techniques for devising minimum-redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows extremely fast decoding, requiring just a few machine operations per output symbol (rather than for each decoded bit), and uses just a few hundred bytes of memory above and beyond the space required to store an enumeration of the source alphabet.