SIGIR'98 papers: Improving Automatic Query Expansion
Improving Automatic Query Expansion
Mandar Mitra
Cornell University,
Ithaca, NY 14853.
Amit Singhal
AT&T Labs--Research,
Florham Park, NJ 07932.
Chris Buckley
Sabir Research Inc.,
Gaithersburg, MD 20878.
Abstract
Most casual users of IR systems type short queries. Recent research
has shown that adding new words to these queries via ad hoc
feedback improves the retrieval effectiveness of such queries. We
investigate ways to improve this query expansion process by refining
the set of documents used in feedback. We start by using manually
formulated Boolean filters along with proximity constraints. Our
approach is similar to the one proposed by Hearst [HEARST96].
Next, we investigate a completely automatic method that makes use of
term co-occurrence information to estimate word
correlation. Experimental results show that refining the set of
documents used in query expansion often prevents the query drift
caused by blind expansion and yields substantial improvements in
retrieval effectiveness, both in terms of average precision and
precision in the top twenty documents. More importantly, the fully
automatic approach developed in this study performs competitively with
the best manual approach and requires little computational overhead.
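
As a rough illustration of the idea summarized above, the following Python sketch contrasts blind expansion with expansion from a refined feedback set. It is not the method of the paper: the function names, the toy documents, and the refinement rule (prefer documents that contain more distinct query terms) are assumptions made for illustration, and they stand in for the Boolean filters and the co-occurrence-based correlation estimates that the abstract mentions.

```python
from collections import Counter

def refine_feedback_set(query_terms, ranked_docs, keep=2):
    """Hypothetical refinement: prefer documents that contain more
    distinct query terms. The sort is stable, so ties keep their
    original retrieval order."""
    qset = set(query_terms)
    reranked = sorted(ranked_docs, key=lambda doc: -len(qset & set(doc)))
    return reranked[:keep]

def expand_query(query_terms, feedback_docs, n_new=2):
    """Add the n_new non-query terms that occur in the most feedback
    documents (ties broken alphabetically for determinism)."""
    qset = set(query_terms)
    counts = Counter()
    for doc in feedback_docs:
        counts.update(set(doc) - qset)  # count each term once per document
    new_terms = sorted(counts, key=lambda t: (-counts[t], t))[:n_new]
    return list(query_terms) + new_terms

# Toy data (assumed): an initial ranking whose top document matches
# only one of the two query terms.
query = ["jaguar", "speed"]
ranked = [
    ["jaguar", "car", "dealer", "price"],
    ["jaguar", "speed", "cat", "prey"],
    ["jaguar", "cat", "speed", "habitat"],
    ["car", "price", "loan"],
]

blind = expand_query(query, ranked[:2])                         # top 2 as retrieved
refined = expand_query(query, refine_feedback_set(query, ranked))
print(blind)
print(refined)
```

In this toy case the blind expansion picks up a term from the off-topic top document, while the refined feedback set yields only terms from documents that match both query words. Real systems would weight terms rather than count them; the sketch only shows where the refinement step sits in the pipeline.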
SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.