String Search Experimentation Using Massive Data
Alistair Moffat
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Simon Gog
Department of Computing and Information Systems,
The University of Melbourne,
Victoria 3010, Australia.
Status
Philosophical Transactions of the Royal Society A,
372(8), March 2014.
Abstract
Descriptions of new string search or indexing algorithms are often
accompanied by an experimental evaluation.
In this article, we provide guidance as to how such investigations
can be carried out, drawing on our experience of measurement in this
field.
In particular, we describe methodologies for stratifying patterns
according to their length and frequency, so that precise
response-time measurements can be made; and we describe a metric for
categorizing the extent of "repetitiveness" in a text, so that
dataset type can also be factored into evaluations.
We show that separating these concepts allows a greater understanding
of the behaviour of string search algorithms.
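
As an illustration only, stratifying patterns by length and frequency might be sketched as follows. This is not the procedure from the article. The function name `stratify_patterns`, the decade-wide frequency bands, and the uniform sampling of pattern start positions are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import random


def stratify_patterns(text, lengths, samples_per_length, seed=0):
    """Sample substrings of `text` and group them by (length, frequency band).

    Band b holds patterns whose occurrence count c satisfies
    10**b <= c < 10**(b + 1). Decade-wide bands are an assumption
    of this sketch, not a definition taken from the article.
    """
    rng = random.Random(seed)
    strata = defaultdict(set)
    for m in lengths:
        if m > len(text):
            continue  # no pattern of this length exists in the text
        # Count every (possibly overlapping) substring of length m.
        counts = Counter(text[i:i + m] for i in range(len(text) - m + 1))
        for _ in range(samples_per_length):
            start = rng.randrange(len(text) - m + 1)
            pattern = text[start:start + m]
            # Number of decimal digits minus one gives the decade band
            # exactly, without floating-point logarithms.
            band = len(str(counts[pattern])) - 1
            strata[(m, band)].add(pattern)
    return strata


if __name__ == "__main__":
    demo_text = "ab" * 50 + "xyz"
    for key, patterns in sorted(stratify_patterns(demo_text, [2, 3], 100).items()):
        print(key, sorted(patterns))
```

Each stratum can then be timed separately, so that response times are compared only among patterns of similar length and frequency. Counting all substrings in memory is feasible only for small texts; on massive data the occurrence counts would more plausibly come from an index over the text.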
Full text
http://dx.doi.org/10.1098/rsta.2013.0135