Justin Zobel
Department of Computer Science,
RMIT, GPO Box 2476V,
Melbourne 3001, Australia.
Alistair Moffat
Department of Computer Science,
The University of Melbourne,
Parkville 3052, Australia.
Our purpose with this work is threefold: first, by enumerating the various measures in an orthogonal framework, to make it straightforward for other researchers to describe and discuss similarity measures; second, by experimenting with a wide range of the measures, to observe which features yield good retrieval behaviour in a variety of retrieval environments; and third, by describing our results so far, to gather feedback on the issues we have uncovered. We demonstrate that it is surprisingly difficult to identify which techniques work best, and comment on the experimental methodology required to support any claim that one method is superior to another.