NICTA I2D2 Group at GeoCLEF 2006
Yi Li
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Nicola Stokes
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Lawrence Cavedon
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Alistair Moffat
Department of Computer Science and Software Engineering,
The University of Melbourne,
Victoria 3010, Australia.
Status
Proc. GeoCLEF Workshop on Geo-Spatial IR,
Alicante, Spain, September 2006.
Abstract
We report on the experiments undertaken by the NICTA I2D2 Group as
part of GeoCLEF 2006.
We experimented with geographic query expansion, using a gazetteer to
extend geospatial terms to include "nearby" locations and
sublocations.
The processing pipeline of our geographic information retrieval (GIR)
system included: a named entity recognition system for identifying
location names; a toponym resolution component for assigning
likelihoods to geographic candidates obtained from a gazetteer (the
Getty Thesaurus); and a probabilistic approach to geographic
information retrieval.
We experimented with expanding location names in both documents and
queries.
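The expansion step can be illustrated with a minimal sketch. It is not
the system's implementation: the Place record, the TOY_GAZETTEER
entries, and the expand_location function are hypothetical names
introduced here, and the toy data stands in for a real gazetteer
lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    """Hypothetical gazetteer record (not the Getty Thesaurus schema)."""
    name: str
    sublocations: list = field(default_factory=list)
    nearby: list = field(default_factory=list)


# Illustrative toy data only; a real system would query a gazetteer.
TOY_GAZETTEER = {
    "victoria": Place(
        "victoria",
        sublocations=["melbourne", "geelong"],
        nearby=["tasmania"],
    ),
}


def expand_location(name, gazetteer, include_nearby=True):
    """Return the location plus its sublocations and, optionally,
    its nearby places, without duplicates and in a stable order."""
    key = name.lower()
    place = gazetteer.get(key)
    if place is None:
        # Unknown to the gazetteer: keep the original term unexpanded.
        return [key]
    candidates = [place.name] + list(place.sublocations)
    if include_nearby:
        candidates += list(place.nearby)
    seen = set()
    expanded = []
    for term in candidates:
        if term not in seen:
            seen.add(term)
            expanded.append(term)
    return expanded
```

The same function could be applied to location names found in a
document or in a query, which is the sense in which expansion on
"both" sides is meant here.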
We used a normalization process to adjust term weights to ensure that
geographic terms added to a query do not overwhelm the contribution
of non-geographic query terms.
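The abstract does not specify the normalization formula, so the
following is only a sketch of one plausible scheme, assumed for
illustration: the added geographic terms are rescaled so that their
total weight is a fixed fraction of the total weight of the
non-geographic terms. The function name normalize_geo_weights and the
geo_share parameter are hypothetical.

```python
def normalize_geo_weights(text_weights, geo_weights, geo_share=0.5):
    """Combine query term weights, capping the geographic contribution.

    Assumed scheme (not taken from the paper): scale the geographic
    weights so that their sum equals geo_share times the sum of the
    non-geographic weights, however many terms expansion added.
    """
    geo_total = sum(geo_weights.values())
    if geo_total == 0:
        # Nothing to rescale: return the text-only query unchanged.
        return dict(text_weights)
    text_total = sum(text_weights.values())
    scale = geo_share * text_total / geo_total
    combined = dict(text_weights)
    for term, weight in geo_weights.items():
        combined[term] = combined.get(term, 0.0) + weight * scale
    return combined


# Usage: three expanded place names share a capped weight budget.
query = normalize_geo_weights(
    {"wine": 2.0, "regions": 2.0},
    {"victoria": 1.0, "melbourne": 3.0, "geelong": 4.0},
)
```

Under this assumption, adding more expanded locations redistributes a
fixed geographic budget instead of increasing the geographic share of
the query.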
We submitted five runs to the English-only GeoCLEF monolingual task,
ranging from a baseline run of text-only retrieval based on topic
title and description, to queries expanded using gazetteer-based
toponym resolution.
In our submitted runs, the GIR runs showed little improvement over
the baseline run.
A post-submission refinement to the normalization process resulted
in GIR runs showing improvements of 6.57% and 5.84% over the baseline
in overall mean average precision (MAP).