SIGIR'98 papers: Resolving Ambiguity for Cross-language Retrieval

Resolving Ambiguity for Cross-language Retrieval


Lisa Ballesteros
Center for Intelligent Information Retrieval, Computer Science Department, The University of Massachusetts, Amherst, MA 01003-4610 USA. email: balleste@cs.umass.edu

W. Bruce Croft
Center for Intelligent Information Retrieval, Computer Science Department, The University of Massachusetts, Amherst, MA 01003-4610 USA. email: croft@cs.umass.edu


Abstract

One of the main hurdles to improved cross-language information retrieval (CLIR) effectiveness is resolving the ambiguity associated with translation. The availability of resources is also a problem. First, we present a technique based on co-occurrence statistics from unlinked corpora that can be used to reduce the ambiguity associated with phrasal and term translation. We then combine this method with other techniques for reducing ambiguity and achieve more than 90% of monolingual effectiveness. Finally, we compare the co-occurrence method with parallel-corpus and machine-translation techniques and show that good retrieval effectiveness can be achieved without complex resources.


SIGIR'98
24-28 August 1998
Melbourne, Australia.
sigir98@cs.mu.oz.au.