Measurement Techniques and Caching Effects
Stefan Pohl
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. 31st European Conference on Information Retrieval,
Toulouse, France, April 2009, pages 691-695, LNCS volume 5478.
Poster paper.
Abstract
Overall query execution time consists of the time spent transferring
data from disk to memory, and the time spent performing actual
computation.
In any measurement of overall time on a given hardware configuration,
the two separate costs are aggregated.
This makes it hard to reproduce results and to infer which of the two
costs is actually affected by modifications proposed by researchers.
In this paper we show that repeated submissions of the same query
provide a means to estimate the computational fraction of overall
query execution time.
The advantage of separate measurements is exemplified for a
particular optimization that, as it turns out, reduces
computational costs only.
Finally, by exchanging repeated query terms for surrogates that
have similar document frequency, we are able to measure the natural
caching effects that arise as a consequence of term repetitions in
query logs.
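
The two measurement ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `estimate_compute_fraction`, `pick_surrogate`, `run_query`, and `doc_freq` are hypothetical. The sketch assumes that the first run of a query starts with cold caches (the caller must arrange this), and that repeated runs find all required data already in memory, so their time approximates the computational cost alone.

```python
import time
from statistics import median


def estimate_compute_fraction(run_query, query, repeats=5,
                              clock=time.perf_counter):
    """Estimate the computational share of one query's execution time.

    Assumption: the first run pays for disk transfer plus computation,
    and later runs of the same query are served from cache.
    """
    start = clock()
    run_query(query)
    first = clock() - start  # disk transfer + computation

    warm = []
    for _ in range(repeats):
        start = clock()
        run_query(query)
        warm.append(clock() - start)  # assumed to be computation only

    compute = median(warm)  # median damps outliers among the warm runs
    fraction = min(compute / first, 1.0) if first > 0 else 1.0
    return first, compute, fraction


def pick_surrogate(term, doc_freq, used):
    """Pick an unused term whose document frequency is closest to `term`'s.

    `doc_freq` maps each term to its document frequency, and `used` holds
    terms that have already appeared in the query stream.
    """
    candidates = [t for t in doc_freq if t != term and t not in used]
    if not candidates:
        return None
    target = doc_freq[term]
    # Ties are broken alphabetically so that the choice is deterministic.
    return min(candidates, key=lambda t: (abs(doc_freq[t] - target), t))
```

Replacing each repeated term in a query log with such a surrogate yields a log with comparable processing cost but no term repetitions, so the difference in measured time between the two logs gives one estimate of the caching effect.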
Full text
http://dx.doi.org/10.1007/978-3-642-00958-7_68