Arithmetic Coding Revisited

Alistair Moffat
Department of Computer Science, The University of Melbourne, Parkville 3052, Australia.

Radford Neal
Departments of Statistics and Computer Science, University of Toronto, Canada.

Ian H. Witten
Department of Computer Science, University of Waikato, New Zealand.

Status

ACM Trans. Information Systems, 16(3):256-294, July 1998.

Abstract

Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multi-symbol alphabets because of its speed, low storage requirements, and efficiency of compression. This paper describes a new implementation of arithmetic coding which incorporates several improvements over a widely-used earlier version (Comm. ACM, 30(6):520--541, June 1987) that has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low precision arithmetic (which permits implementation with fast shift/add operations). We also describe a modular structure that divorces the coder from the modeling and probability estimation components of a compression system. To motivate the improved coder we consider the needs of a word-based text compression program, and report a range of experimental results. Complete source code is available.

Full text

http://doi.acm.org/10.1145/290159.290162 .

Source Code

Version 1 of the new code (as described in the DCC'95 extended abstract) is now defunct. Version 2 of the coder, dated August 1996 was extended to accompany the full paper described here. The third version of the code is the current one (February 1999). The differences between versions 2 and 3 are explained in a further paper. All three versions of the software now reside at http://people.eng.unimelb.edu.au/ammoffat/arith_coder/.

The CACM code is located at ftp://ftp.cpsc.ucalgary.ca/pub/projects/ar.cod/cacm-87.shar. Another version of the low precision coder that contains incremental calculation of the decoder target (but not most of the other improvements contained in our package) is at ftp://ftp.cs.toronto.edu/pub/radford/lowp_ac.shar.

The code was written primarily by John Carpinelli, Wayne Salamonsen, and Lang Stuiver.